Sorry, I don’t get it.
"grads" is type <class 'dict'>, and grads["dW1"], grads["db1"], grads["dW2"], grads["db2"] are all <class 'numpy.ndarray'> in my output too. I think that's OK.
Even the shape I think is OK: dW1 (4, 2), db1 (4, 1), dW2 (1, 4), db2 (1, 1). The same as the expected output.
The only differences I can see between my output and the expected output are in the values of dW1[1][0], dW1[1][1] and db1[1], and each of those differences is smaller than 0.00000001 (1e-8).
I can’t see what’s wrong?
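For anyone wanting to repeat that check in code, here is a minimal sketch that compares type, shape, and values for each gradient. The helper name compare_grads and the arrays below are made up for illustration; they are not the course's values or its grader, and the real test may apply a different (possibly stricter) tolerance.

```python
import numpy as np

def compare_grads(grads, expected, atol=1e-8):
    """Hypothetical helper: per key, report type, shape and value agreement."""
    report = {}
    for key, exp in expected.items():
        got = grads[key]
        same_shape = np.shape(got) == exp.shape
        report[key] = {
            "type_ok": isinstance(got, np.ndarray),
            "shape_ok": same_shape,
            # Largest element-wise difference, only meaningful if shapes match.
            "max_abs_diff": float(np.max(np.abs(got - exp))) if same_shape else None,
            # Absolute-tolerance comparison only (rtol=0), so atol is the whole budget.
            "close": bool(same_shape and np.allclose(got, exp, rtol=0, atol=atol)),
        }
    return report

# Made-up data: an "expected" dW1 of the shape quoted above, and a copy
# that differs by 5e-9 in one entry.
expected = {"dW1": np.zeros((4, 2))}
grads = {"dW1": expected["dW1"].copy()}
grads["dW1"][1, 0] += 5e-9

report = compare_grads(grads, expected)
print(report["dW1"])
```

If "close" comes back True at a tolerance of 1e-8 but the exercise still fails, the difference may matter at the precision the test uses, which would point to a small formula error rather than a type or shape problem.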
To compute dZ1 you'll need to compute $g^{[1]\prime}(Z^{[1]})$. Since $g^{[1]}(\cdot)$ is the tanh activation function, if $a = g^{[1]}(z)$ then $g^{[1]\prime}(z) = 1 - a^2$. So you can compute $g^{[1]\prime}(Z^{[1]})$ using (1 - np.power(A1, 2))
I also got that kind of error and solved it by reading this more carefully, but I don’t actually understand it. Why is it np.power(A1, 2) and not np.power(Z1, 2)?
Thanks in advance
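On the A1 versus Z1 question: the derivative of tanh is 1 - tanh(z)^2, and tanh(Z1) is exactly A1, so the square has to be taken of the activation, not of the pre-activation. A quick numerical check can make this concrete. This is only a sketch: the Z1 values are random and made up, and the names eps, analytic, numeric and wrong are mine, not the notebook's.

```python
import numpy as np

rng = np.random.default_rng(0)
Z1 = rng.standard_normal((4, 3))   # made-up pre-activations, not course data
A1 = np.tanh(Z1)                   # forward pass: A1 = g(Z1) with g = tanh

# Formula from the hint: g'(Z1) = 1 - A1^2
analytic = 1 - np.power(A1, 2)

# Central finite difference as an independent estimate of d/dz tanh(z)
eps = 1e-6
numeric = (np.tanh(Z1 + eps) - np.tanh(Z1 - eps)) / (2 * eps)

# The tempting alternative: squaring Z1 instead of A1
wrong = 1 - np.power(Z1, 2)

matches_hint = bool(np.allclose(analytic, numeric, atol=1e-6))
matches_wrong = bool(np.allclose(wrong, numeric, atol=1e-6))
print("1 - A1^2 matches numerical derivative:", matches_hint)
print("1 - Z1^2 matches numerical derivative:", matches_wrong)
```

The version with Z1 only looks plausible near z = 0, where tanh(z) is approximately z; away from zero the two expressions diverge, and 1 - Z1^2 can even go negative while the true derivative of tanh always stays between 0 and 1.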