I don't know the difference between dZL = AL - Y and dZL = dAL .* g'(ZL)

Please teach me!!!
In Week 3 (neural nets with one hidden layer), I learned dZ2 = A2 - Y. This is for the output layer.
But in Week 4, I learned that at the output layer we calculate dZL = dAL .* g'(ZL).
Why has it changed?

The second formula is the general case of that calculation, and it works at any layer. The first version is what you get if you apply the general formula to the specific case of the output layer: the derivative of the cross entropy loss and the derivative of the sigmoid activation simplify when multiplied together. That derivation is shown in this popular thread from Eddy.
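To sketch why the product simplifies (my own summary of the standard sigmoid + cross-entropy derivation, using A for A^{[L]}):

```latex
% Cross-entropy loss: \mathcal{L} = -\big(Y \log A + (1-Y)\log(1-A)\big)
dA = \frac{\partial \mathcal{L}}{\partial A}
   = -\left(\frac{Y}{A} - \frac{1-Y}{1-A}\right)
% Sigmoid derivative: g'(Z) = A(1-A)
dZ = dA \cdot g'(Z)
   = -\left(\frac{Y}{A} - \frac{1-Y}{1-A}\right) A(1-A)
   = -\big(Y(1-A) - (1-Y)A\big)
   = A - Y
```

So the two formulas are the same thing; the A - Y form is just the general formula with the sigmoid/cross-entropy terms already multiplied out.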

The better way to write them would be to make clear that the layer number in the general formula is any layer l, not just L for the last layer:

dZ^{[l]} = dA^{[l]} .* g^{[l]'}(Z^{[l]})
dZ^{[L]} = A^{[L]} - Y
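You can also check the equivalence numerically. Here is a small sketch (my own, not from the course notebooks) comparing the general formula against the output-layer shortcut for sigmoid + cross-entropy:

```python
import numpy as np

np.random.seed(0)
Z = np.random.randn(1, 5)            # pre-activation at the output layer
A = 1 / (1 + np.exp(-Z))             # sigmoid activation
Y = np.array([[0, 1, 1, 0, 1]])      # binary labels

dA = -(Y / A - (1 - Y) / (1 - A))    # derivative of cross-entropy loss w.r.t. A
g_prime = A * (1 - A)                # sigmoid derivative g'(Z)

dZ_general = dA * g_prime            # general formula: dZ = dA .* g'(Z)
dZ_shortcut = A - Y                  # output-layer shortcut

print(np.allclose(dZ_general, dZ_shortcut))  # True
```

Both arrays match to floating-point precision, which is exactly the simplification the first answer describes.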


Okay! I got it!!! Thank you!!!