For the Course 1 Week 3 assignment:
In Andrew's notes, the fourth equation, the one for dZ1, contains the term 𝑔[1]′(𝑍[1]).
But the code uses A1 instead of Z1.
I am a bit confused. The code I went through is below:
dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))  # Should it be Z1???
Should it be np.power(Z1,2)?
Thanks.
The formulas as written are correct. The point is that the code relies on a property of the derivative of tanh. The first expression is fully general: it uses the derivative of whatever the activation function g is, and the input to that derivative is Z1, right? But when the activation is tanh, we have this relationship:
A1 = tanh(Z1)
tanh'(Z1) = 1 - tanh^2(Z1) = 1 - A1^2
That last expression, 1 - A1^2, is not the same thing as 1 - Z1^2, right?
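If it helps to see this numerically, here is a minimal sketch that compares both candidate expressions against a finite-difference estimate of tanh'(Z1). The array shape, the random seed, and the names `numeric`, `from_A1`, and `from_Z1` are made up for illustration and are not part of the assignment code.

```python
import numpy as np

# Hypothetical pre-activations; the shape and seed are arbitrary.
rng = np.random.default_rng(0)
Z1 = rng.standard_normal((4, 3))
A1 = np.tanh(Z1)

# Central finite-difference estimate of tanh'(Z1), element by element.
eps = 1e-6
numeric = (np.tanh(Z1 + eps) - np.tanh(Z1 - eps)) / (2 * eps)

# The expression used in the assignment code, written in terms of A1.
from_A1 = 1 - np.power(A1, 2)

# The expression proposed in the question, written in terms of Z1.
from_Z1 = 1 - np.power(Z1, 2)

print("1 - A1^2 matches tanh'(Z1):", np.allclose(numeric, from_A1, atol=1e-8))
print("1 - Z1^2 matches tanh'(Z1):", np.allclose(numeric, from_Z1, atol=1e-8))
```

The first check should agree with the numerical derivative and the second should not, which is why the code can use A1 (already computed in forward propagation) instead of recomputing tanh on Z1.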