It was all covered in this lecture and this lecture. The formulas are also given in the notebook. You compute the gradients in the `backward_propagation` routine and then they are just passed to `update_parameters` in the `grads` dictionary. Please see the section titled:

### Exercise 6 - backward_propagation

in the Week 3 assignment. Note that Prof Ng does not use the notation \theta for the parameters in this course. He also tries to avoid the standard mathematical notation for partial derivatives. Instead, he has invented his own notation, in which he uses these shorthands for the gradient values:

dW^{[l]} = \displaystyle \frac {\partial J}{\partial W^{[l]}}

db^{[l]} = \displaystyle \frac {\partial J}{\partial b^{[l]}}

So using his notation, the formula for updating, say, W^{[l]} is:

W^{[l]} = W^{[l]} - \alpha * dW^{[l]}
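
To make the connection between the `grads` dictionary and that update formula concrete, here is a minimal sketch. It is not the notebook's solution: the function name `update_parameters_sketch`, the key naming convention (`"W1"` paired with `"dW1"`, and so on) and the toy values are assumptions made purely for illustration.

```python
import numpy as np

def update_parameters_sketch(parameters, grads, learning_rate=0.1):
    """One gradient descent step: P = P - alpha * dP for every parameter P.

    Hypothetical helper for illustration only. It assumes each key in
    `parameters` (e.g. "W1") has a matching key in `grads` with a "d"
    prefix (e.g. "dW1") and an array of the same shape.
    """
    updated = {}
    for name, value in parameters.items():
        # "W1" pairs with "dW1", "b1" pairs with "db1", and so on.
        updated[name] = value - learning_rate * grads["d" + name]
    return updated

# Toy values, not taken from the assignment.
parameters = {"W1": np.array([[1.0, 2.0]]), "b1": np.array([[0.5]])}
grads = {"dW1": np.array([[0.1, -0.2]]), "db1": np.array([[1.0]])}

new_params = update_parameters_sketch(parameters, grads, learning_rate=0.5)
print(new_params["W1"])
print(new_params["b1"])
```

The point is only that the update step does no calculus of its own: it just consumes whatever gradient values back propagation placed in the dictionary.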

With that in mind, please read the back propagation section of the notebook again and it should all make sense.

Note that he does not derive these formulas, though: he simply presents them. These courses are specifically designed not to require knowledge of even univariate calculus, let alone matrix calculus. If you have the math background, here's a thread with pointers to derivations available on the web.