C2 W1 / Regularization / Exercise 2

Hello, please explain to me the following code in the backward_propagation_with_regularization function: dZ = np.multiply(dA, np.int64(A > 0))
Why are we multiplying dA by np.int64(A > 0)?

Everything in this exercise looks different from the fully general L-layer code we built in C1 Week 4, right? They've just hard-coded everything for a specific 3-layer network to keep the code simple: there are no layers of functions like linear_activation_backward calling relu_backward and linear_backward.

The general formula being implemented there is:

dZ^{[l]} = dA^{[l]} * g^{[l]'}(Z^{[l]})

But here it's being applied in the specific case where the activation function is ReLU. Think about it for a sec and the light should go on! :bulb: :nerd_face:

You could legitimately observe that it would be more literally correct if they had written np.int64(Z > 0), but if A = ReLU(Z), then A > 0 iff Z > 0, right? If I were writing the code, I would have written it as np.float64(A > 0), but that's just me. I prefer not to assume Python's type coercion is going to do exactly what I expect.
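
If it helps to see it concretely, here's a minimal sketch with toy arrays (the Z, A, and dA values below are made up for illustration, not the assignment's variables) of what that one line is doing:

```python
import numpy as np

# Toy values, just to illustrate the mask trick.
Z = np.array([[ 1.5, -2.0,  0.3],
              [-0.7,  0.0,  2.1]])
A = np.maximum(0, Z)                         # A = ReLU(Z)
dA = np.array([[ 0.2, -0.5,  1.0],
               [ 0.4,  0.3, -0.9]])          # some upstream gradient

# For ReLU, g'(Z) is 1 where Z > 0 and 0 elsewhere. Since A = ReLU(Z),
# the mask (A > 0) is exactly the same as (Z > 0).
dZ_from_A = np.multiply(dA, np.int64(A > 0))        # the assignment's version
dZ_from_Z = np.multiply(dA, (Z > 0).astype(float))  # a more explicit equivalent

print(np.allclose(dZ_from_A, dZ_from_Z))     # True
print(dZ_from_A)
# Entries of dA survive where the ReLU was active; the rest are zeroed out.
```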

Thank you for your explanation! The derivative of ReLU is 1 or 0, and applying np.int64 or np.float64 to boolean values just converts them to 1 or 0 :+1:
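
For instance (just a quick illustration of that cast, not code from the assignment):

```python
import numpy as np

mask = np.array([True, False, True])
print(mask.astype(np.int64))    # [1 0 1]   -> the 0/1 "derivative" mask
print(mask.astype(np.float64))  # [1. 0. 1.]
```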