Typo in back prop formula (week 3 and week 4)

OK, I must be making some kind of blunder here.

What I proposed is to define all the partial derivatives as derivatives of J, rather than of what you called L, whenever we differentiate with respect to the Z's. L is an m-dimensional vector, so \frac{\partial L}{\partial Z} has dimension m*\dim Z, which messes things up when we use it to compute the dW's and db's. Using J would also make things a lot more consistent.

In this case,

\frac{\partial J}{\partial Z^{[L]}}= \frac{1}{m}(A^{[L]}-Y),
\vdots
\frac{\partial J}{\partial Z^{[l]}} = W^{[l+1]^T}\frac{\partial J}{\partial Z^{[l+1]}}*g'^{[l]}(Z^{[l]})
\vdots
\frac{\partial J}{\partial W^{[l]}}= \frac{\partial J}{\partial Z^{[l]}} A^{[l-1]^T}.

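(There is no 1/m in the last line, because the 1/m is already included in \frac{\partial J}{\partial Z^{[L]}} and is carried backward through every \frac{\partial J}{\partial Z^{[l]}}.)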

In fact, I used this and passed assignment 3, where we are not graded on the values of the dZ's. What I wrote still gives us the correct values of the dW's and db's.
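To make that last claim concrete, here is a minimal NumPy sketch comparing the two conventions. It is not the assignment code. It assumes a two-layer network with a tanh hidden layer, a sigmoid output and the cross-entropy cost, and the function names (`forward`, `grads_course`, `grads_wrt_J`) are made up for illustration.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    # Hypothetical two-layer network: tanh hidden layer, sigmoid output.
    A1 = np.tanh(W1 @ X + b1)
    A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1 + b2)))
    return A1, A2

def grads_course(X, Y, W2, A1, A2):
    # Course-style convention: dZ carries no 1/m, so dW and db divide by m.
    m = X.shape[1]
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)  # tanh'(Z1) = 1 - A1^2
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

def grads_wrt_J(X, Y, W2, A1, A2):
    # Proposed convention: dZ = dJ/dZ, so the 1/m appears once, at the output.
    m = X.shape[1]
    dZ2 = (A2 - Y) / m
    dW2 = dZ2 @ A1.T
    db2 = dZ2.sum(axis=1, keepdims=True)
    dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)
    dW1 = dZ1 @ X.T
    db1 = dZ1.sum(axis=1, keepdims=True)
    return dW1, db1, dW2, db2

# Small random problem; the sizes are arbitrary.
rng = np.random.default_rng(0)
n_x, n_h, m = 3, 4, 5
X = rng.standard_normal((n_x, m))
Y = (rng.random((1, m)) > 0.5).astype(float)
W1 = 0.5 * rng.standard_normal((n_h, n_x))
b1 = np.zeros((n_h, 1))
W2 = 0.5 * rng.standard_normal((1, n_h))
b2 = np.zeros((1, 1))

A1, A2 = forward(X, W1, b1, W2, b2)
same = all(
    np.allclose(a, b)
    for a, b in zip(grads_course(X, Y, W2, A1, A2), grads_wrt_J(X, Y, W2, A1, A2))
)
print(same)
```

The dZ's differ by a factor of m between the two functions, but the dW's and db's agree, which is why a grader that only checks dW and db would accept either convention.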
