I'm having trouble with the programming assignment "Planar_data_classification_with_one_hidden_layer", Exercise 7 - update_parameters. I need to implement the function for the equation:
\theta = \theta - \alpha \displaystyle \frac{\partial J}{\partial \theta}
but I can't figure out how to compute the values for \partial J and \partial \theta. I tried scanning back over the lecture notes and playing the videos, but I can't locate where this equation was explained. It would be nice if the instructions in the assignment included some kind of reminder.
What is the equation for these, or in which lecture was this covered? Thanks.
It was all covered in this lecture and this lecture. The formulas are also given in the notebook. You compute the gradients in the backward_propagation routine and then they are just passed to update_parameters in the grads dictionary. Please see the section titled "Exercise 6 - backward_propagation" in the Week 3 assignment.
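For reference, the hand-off between the two functions looks roughly like this. This is only a minimal sketch: the key names and shapes are what I recall from the two-layer notebook (n_x = 2, n_h = 4, n_y = 1), so treat them as assumptions and check against your own notebook.

```python
import numpy as np

# Minimal sketch of the hand-off (key names and shapes assumed from the
# two-layer notebook; the zero arrays are just placeholders).
grads = {"dW1": np.zeros((4, 2)),   # dJ/dW1, same shape as W1
         "db1": np.zeros((4, 1)),   # dJ/db1, same shape as b1
         "dW2": np.zeros((1, 4)),   # dJ/dW2, same shape as W2
         "db2": np.zeros((1, 1))}   # dJ/db2, same shape as b2

# update_parameters never recomputes anything: it just looks the gradients up.
dW1 = grads["dW1"]
db1 = grads["db1"]
```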
Note that Prof Ng does not use the notation \theta for the parameters in this course, and he also tries to avoid the standard mathematical notation for partial derivatives. Instead, he uses his own shorthand for the gradient values:
dW^{[l]} = \displaystyle \frac {\partial J}{\partial W^{[l]}}
db^{[l]} = \displaystyle \frac {\partial J}{\partial b^{[l]}}
So using his notation, the formula for updating, say, W^{[l]} is:
W^{[l]} = W^{[l]} - \alpha * dW^{[l]}
With that in mind, please read the back propagation section of the notebook again and it should all make sense.
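If it still isn't clicking, here is a minimal sketch of what Exercise 7 is asking for, assuming the parameters/grads key names from the two-layer notebook (W1, b1, W2, b2 and dW1, db1, dW2, db2). It is not the official solution, just the formula above applied to each parameter, with learning_rate playing the role of \alpha (the default value here is illustrative):

```python
import copy

def update_parameters(parameters, grads, learning_rate=1.2):
    """One step of gradient descent: W := W - alpha * dW, b := b - alpha * db."""
    # Work on a copy so the caller's dictionary is not modified in place.
    parameters = copy.deepcopy(parameters)

    parameters["W1"] = parameters["W1"] - learning_rate * grads["dW1"]
    parameters["b1"] = parameters["b1"] - learning_rate * grads["db1"]
    parameters["W2"] = parameters["W2"] - learning_rate * grads["dW2"]
    parameters["b2"] = parameters["b2"] - learning_rate * grads["db2"]

    return parameters
```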
Note that he does not derive these formulas, though: he simply presents them. These courses are specifically designed not to require knowledge of even univariate calculus, let alone matrix calculus. If you have the math background, here's a thread with pointers to derivations available on the web.