Please teach me!!!
In Week 3 (neural nets with one hidden layer), I learned dZ2 = A2 - Y. This is the output layer.
But in Week 4, I learned that in the output layer we calculate dZ^{[L]} = dA^{[L]} .* g^{[L]'}(Z^{[L]}).
Why has it changed?
The second formula is the general case: it works at any layer. The first version is what you get when you apply the general formula to the specific case of the output layer and substitute the derivatives of the cross entropy loss function and the sigmoid activation function. That derivation is shown in this popular thread from Eddy.
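A minimal sketch of that derivation, assuming a sigmoid output a = \sigma(z) and the binary cross entropy loss:

$$
\mathcal{L}(a, y) = -\big(y \log a + (1-y)\log(1-a)\big), \qquad a = \sigma(z)
$$
$$
\frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1-y}{1-a}, \qquad \sigma'(z) = a(1-a)
$$
$$
\frac{\partial \mathcal{L}}{\partial z} = \frac{\partial \mathcal{L}}{\partial a}\,\sigma'(z) = -y(1-a) + (1-y)a = a - y
$$

The a(1-a) factor from the sigmoid derivative cancels the denominators from the loss derivative, which is why the output-layer case collapses to the simple A - Y form.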
A better way to write them is to make clear that the first formula holds at any layer l, while the second is specific to the last layer L:
dZ^{[l]} = dA^{[l]} * g^{[l]'}(Z^{[l]})
dZ^{[L]} = A^{[L]} - Y
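As a quick numerical check (a hypothetical toy example, not from the course assignments), you can verify in numpy that the general formula dA .* g'(Z) and the shortcut A - Y give the same result when g is the sigmoid and the loss is binary cross entropy:

```python
import numpy as np

# Toy output layer: 1 unit, 5 examples (values are arbitrary).
np.random.seed(0)
Z = np.random.randn(1, 5)          # pre-activations Z^{[L]}
Y = np.array([[0, 1, 1, 0, 1]])    # binary labels

A = 1 / (1 + np.exp(-Z))           # sigmoid activation A^{[L]}
dA = -(Y / A) + (1 - Y) / (1 - A)  # d(cross entropy)/dA
g_prime = A * (1 - A)              # sigmoid derivative g'(Z)

dZ_general = dA * g_prime          # general formula: dA .* g'(Z)
dZ_shortcut = A - Y                # Week 3 shortcut

print(np.allclose(dZ_general, dZ_shortcut))  # prints True
```

The a(1-a) term in the sigmoid derivative cancels the 1/a and 1/(1-a) factors in dA, which is exactly why the shortcut works.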
Okay! I got it!!! Thank you!!!
Any suggestions on how this can be derived?
That is beyond the scope of these courses, but there are lots of other webpages that cover the calculus behind all this. Here's a thread with pointers to get you started on that. I got that link from one of the topics on the DLS FAQ thread, which is worth a look just on general principles if you haven't already seen it.