Why do we write:

dZ2 = np.multiply(dA2, np.int64(A2 > 0))

Wasn’t dA2 supposed to be multiplied with the derivative of activation_function(Z2)? What is the significance of np.int64(A2 > 0)?

It is the derivative of the activation function, which is ReLU in this case, right? Think about it for a sec and it should make sense. Sure, they could have written it as *np.int64(Z2 > 0)* and maybe that would have been more obvious, but the result is the same, right?
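A small sketch can make the equivalence concrete. The array values and the placeholder upstream gradient below are made up for illustration; the point is that since *A2 = ReLU(Z2)*, an entry of *A2* is positive exactly when the corresponding entry of *Z2* is, so the two masks match:

```python
import numpy as np

# Hypothetical pre-activation values; A2 = ReLU(Z2).
Z2 = np.array([[-1.5, 0.0, 2.0],
               [ 3.0, -0.5, 0.7]])
A2 = np.maximum(Z2, 0)      # ReLU
dA2 = np.ones_like(Z2)      # placeholder upstream gradient

# ReLU'(z) is 1 where z > 0 and 0 elsewhere. Because ReLU maps
# positive inputs to positive outputs and everything else to 0,
# (A2 > 0) and (Z2 > 0) are the same Boolean mask.
dZ2_from_A = np.multiply(dA2, np.int64(A2 > 0))
dZ2_from_Z = np.multiply(dA2, np.int64(Z2 > 0))
assert np.array_equal(dZ2_from_A, dZ2_from_Z)
```

Note the one edge case, *Z2 == 0*: ReLU outputs 0 there, so both masks agree on 0 as well.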

Thank you, that cleared it up. One more thing: in *np.int64(A2 > 0)*, why do we write *A2 > 0*? Shouldn’t it just be *np.int64(A2)*?

Sorry, that wouldn’t work. What is the derivative of ReLU? It’s 0 for inputs <= 0 and 1 for inputs > 0, right? That’s exactly what the expression *A2 > 0* or *Z2 > 0* gives you, just with a Boolean datatype, which you then convert to a numeric one. Actually, I’d think it would make more sense to convert it to a float rather than an integer, but NumPy’s type coercion rules make the integer mask work just as well.
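To illustrate the Boolean-to-numeric step with a made-up example: the comparison produces a Boolean mask, and converting it to int or float gives the same 0/1 pattern, so multiplying by either zeroes out the gradient wherever ReLU was flat:

```python
import numpy as np

Z2 = np.array([-2.0, 0.0, 3.5])
mask_bool = Z2 > 0                    # Boolean mask: False, False, True
mask_int = np.int64(Z2 > 0)          # same mask as 0/1 integers
mask_float = (Z2 > 0).astype(float)  # same mask as 0.0/1.0 floats

dA2 = np.array([0.1, 0.2, 0.3])      # placeholder upstream gradient
# Either numeric mask kills the gradient where the input was <= 0.
assert np.allclose(dA2 * mask_int, dA2 * mask_float)
```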

*ReLU(4.3) == 4.3*, right? So *np.int64* of that is 4, not 1 — you’d get a truncated copy of the activations, not the derivative mask.
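A quick sketch of the difference, using made-up activation values: *np.int64(A2)* merely truncates each value toward zero, while *np.int64(A2 > 0)* gives the 0/1 derivative mask:

```python
import numpy as np

A2 = np.array([4.3, 0.0, 2.9])   # hypothetical ReLU outputs
truncated = np.int64(A2)         # truncates the values: 4, 0, 2
mask = np.int64(A2 > 0)          # derivative mask: 1, 0, 1
```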