How do you apply the dropout concept across L layers, and what is the role of the dZ1 = np.multiply(dA1, np.int64(A1 > 0)) statement in the Dropout assignment?
As Prof Ng explains in the lectures and as it is also explained in the notebook, dropout is applied on a per-layer basis: it is your design choice at which layers you apply it. In the specific example in the assignment, the network has 3 layers, and they have us apply dropout to the two "hidden" layers of the network, but not to the output layer.
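If it helps to see the idea concretely, here is a minimal sketch of inverted dropout applied to one hidden layer's activations. The names (A1, D1, keep_prob) just mirror the assignment's conventions, and the values are made up for illustration; this is not the graded code itself:

```python
import numpy as np

# Minimal sketch of inverted dropout on one hidden layer's activations.
# A1, D1 and keep_prob mirror the assignment's naming; the data is made up.
np.random.seed(1)
keep_prob = 0.8
A1 = np.maximum(0, np.random.randn(4, 5))    # stand-in hidden layer output (post-ReLU)
D1 = np.random.rand(*A1.shape) < keep_prob   # boolean dropout mask
A1 = A1 * D1                                 # shut off the dropped units
A1 = A1 / keep_prob                          # scale up so the expected value is unchanged
```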
That dZ1 code you show is simply the implementation of this general formula in the case that the activation function at that layer is ReLU:
dZ^{[l]} = dA^{[l]} * g^{[l]'}(Z^{[l]})
Note that you have to be a bit careful with the np.int64(A1 > 0) expression. Z1 is not handily available at that point in the code, so they use A1 instead. If you were using Z1, either Z1 > 0 or Z1 >= 0 would give essentially the same result. But with A1, the >= version gives the wrong answer: A1 is the output of ReLU, so it is always non-negative, which means A1 >= 0 is True everywhere and the mask would pass gradient through units that ReLU had set to zero.
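If you want to convince yourself, here is a tiny numpy check with made-up values, showing that the (A1 > 0) mask matches (Z1 > 0), while (A1 >= 0) is True everywhere:

```python
import numpy as np

# Tiny check (made-up values): with A1 = relu(Z1), the masks (A1 > 0)
# and (Z1 > 0) agree, but (A1 >= 0) is True everywhere, so it would
# pass gradient through units that ReLU has already zeroed out.
Z1 = np.array([[-1.5,  0.0, 2.0],
               [ 3.0, -0.2, 0.7]])
A1 = np.maximum(0, Z1)                    # ReLU forward
dA1 = np.ones_like(A1)                    # stand-in upstream gradient

dZ1 = np.multiply(dA1, np.int64(A1 > 0))  # the line from the assignment
print(np.int64(Z1 > 0))                   # [[0 0 1] [1 0 1]]
print(np.int64(A1 > 0))                   # same mask
print(np.int64(A1 >= 0))                  # [[1 1 1] [1 1 1]]  <- wrong mask
print(dZ1)                                # gradient zeroed where Z1 <= 0
```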
Note that for convenience and simplicity, they have simply hard-coded the layers here, rather than implementing the fully general L-layer code as we built it in C1 W4, with all the layers of subroutines like linear_activation_forward and linear_activation_backward and all that mechanism.
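To answer the "L layers" part of the question directly: one way you could generalize is to loop over the hidden layers and apply the inverted-dropout mask after each hidden layer's activation, leaving the output layer alone. Here is a rough sketch in plain numpy; the function name and the parameters dictionary layout are just assumptions that mirror the course's W1/b1 ... WL/bL convention, not the official solution:

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward_with_dropout(X, parameters, keep_prob=0.8):
    """Sketch of forward propagation with inverted dropout on every hidden layer.

    Assumes parameters holds W1..WL and b1..bL; dropout is applied to
    layers 1..L-1 (ReLU) and skipped at the output layer (sigmoid).
    """
    L = len(parameters) // 2          # number of layers
    A = X
    caches = []
    for l in range(1, L):             # hidden layers: ReLU + dropout
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]
        A = relu(Z)
        D = np.random.rand(*A.shape) < keep_prob
        A = A * D / keep_prob         # inverted dropout
        caches.append((Z, D))
    ZL = parameters["W" + str(L)] @ A + parameters["b" + str(L)]
    AL = sigmoid(ZL)                  # output layer: no dropout
    caches.append((ZL, None))
    return AL, caches
```

During backprop you would then multiply dA at each hidden layer by the same stored mask D and divide by keep_prob, exactly as the assignment does for layers 1 and 2, before applying the ReLU derivative as shown above.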