Hello, please explain to me the following line in the `backward_propagation_with_regularization` function: `dZ = np.multiply(dA, np.int64(A > 0))`

Why are we multiplying `dA` by `np.int64(A > 0)`?

Everything in this exercise looks different from the fully general L-layer code we built in C1 Week 4, right? They’ve just hard-coded everything to a specific 3-layer network to keep the code simple: no layers of functions like *linear_activation_backward* calling *relu_backward* and *linear_backward*.

The general formula being implemented there is:

dZ^{[l]} = dA^{[l]} * g^{[l]'}(Z^{[l]})

where * is element-wise multiplication (hence `np.multiply`).

But here we’re in the specific case where the activation function is ReLU. Think about it for a sec and the light should go on!

You could legitimately observe that it would be more literally correct if they had written *np.int64(Z > 0)*, but if A = ReLU(Z), then A > 0 iff Z > 0, right? If I were writing the code, I would have written it as *np.float64(A > 0)*, but that’s just me: I prefer not to assume Python’s type coercion is going to do exactly what I expect.
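Here’s a minimal sketch of what I mean (not the assignment’s actual code; the array values are just made up) showing that all three variants produce the same `dZ`:

```python
import numpy as np

# Toy pre-activations and upstream gradient (made-up example values)
Z = np.array([[ 1.5, -0.3],
              [-2.0,  0.7]])
dA = np.array([[ 0.4, -1.2],
               [ 0.9,  0.5]])

A = np.maximum(0, Z)  # ReLU forward pass

# The course's version: mask built from A
dZ_course = np.multiply(dA, np.int64(A > 0))

# "More literally correct" version: mask built from Z
dZ_from_Z = np.multiply(dA, np.int64(Z > 0))

# Float mask, not relying on int-to-float coercion
dZ_float = np.multiply(dA, np.float64(A > 0))

# All three agree, because A > 0 exactly where Z > 0
assert np.array_equal(dZ_course, dZ_from_Z)
assert np.array_equal(dZ_course, dZ_float)
print(dZ_course)
```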

Thank you for your explanation! The derivative of ReLU is 1 or 0, and applying `np.int64` or `np.float64` to the boolean values just converts them to 1 or 0.
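For anyone reading later, a quick sanity check of that conversion (a toy snippet, not part of the assignment):

```python
import numpy as np

mask = np.array([[True, False],
                 [False, True]])

print(np.int64(mask))    # booleans become integer 1s and 0s
print(np.float64(mask))  # booleans become 1.0s and 0.0s
```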