How did we arrive at the conclusion that dw1 = x1 * dz, dw2 = x2 * dz and db = dz? This is stated in the Week 2 content on gradient descent.
Hi @rohan.j,
The derivation is left as an exercise for the mathematically inclined student. You can try it yourself by doing backprop on the following network:
@jonaslalin, thank you for your reply. I had overlooked the derivative of z with respect to w1 in the backpropagation step: it equals x1, which gives dw1 = x1 * dz. The same reasoning applies to w2 and b.
Yes. Using the chain rule:
dL/dw1 = dL/dz * dz/dw1 = dL/dz * x1
Somehow the expression dw1 = x1 * dz looks incorrect to me.
Since z = w1*x1 + w2*x2 + b,
then dz/dw1 = x1
=> dz = x1 * dw1, and not the other way around.
Please help me correct my understanding.
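In this course, dw1 is shorthand for dL/dw1, the derivative of the loss L with respect to w1. It is not the differential of w1. Likewise, dz stands for dL/dz. With that notation, the chain rule gives dw1 = dL/dw1 = dL/dz * dz/dw1 = dz * x1, and since dz/db = 1, db = dz.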
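If it helps, here is a small numerical check of those three formulas. It is only a sketch under some assumptions that are not stated in this thread: a single training example, a sigmoid activation a = sigmoid(z), and the cross-entropy loss, for which dL/dz works out to a - y. The function names and the sample values are made up for illustration.

```python
import math


def loss(w1, w2, b, x1, x2, y):
    """Cross-entropy loss for one example (assumed setup: sigmoid output)."""
    z = w1 * x1 + w2 * x2 + b
    a = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))


def analytic_grads(w1, w2, b, x1, x2, y):
    """Gradients in course notation: dw1 = x1*dz, dw2 = x2*dz, db = dz."""
    z = w1 * x1 + w2 * x2 + b
    a = 1.0 / (1.0 + math.exp(-z))
    dz = a - y  # dL/dz for sigmoid + cross-entropy (assumption, see above)
    return x1 * dz, x2 * dz, dz


def numeric_grads(w1, w2, b, x1, x2, y, eps=1e-6):
    """Central finite differences of the loss with respect to w1, w2, b."""
    dw1 = (loss(w1 + eps, w2, b, x1, x2, y) - loss(w1 - eps, w2, b, x1, x2, y)) / (2 * eps)
    dw2 = (loss(w1, w2 + eps, b, x1, x2, y) - loss(w1, w2 - eps, b, x1, x2, y)) / (2 * eps)
    db = (loss(w1, w2, b + eps, x1, x2, y) - loss(w1, w2, b - eps, x1, x2, y)) / (2 * eps)
    return dw1, dw2, db


if __name__ == "__main__":
    # Arbitrary illustrative values
    params = dict(w1=0.5, w2=-0.3, b=0.1, x1=2.0, x2=-1.0, y=1.0)
    print("analytic:", analytic_grads(**params))
    print("numeric: ", numeric_grads(**params))
```

The two printed tuples should agree to several decimal places, which is a quick way to convince yourself that dw1 really means dL/dw1 rather than a small change in w1.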