Week 2: why does dw1 = x1 dz?

In the video: Logistic Regression Gradient Descent.
[Image: screenshot 2022-03-06 190103]
I want to know why the derivative of L with respect to w1 equals x1 dz.
I also want to know whether the dz in this equation refers to the Python variable dz or to the derivative symbol.

Thanks in advance.


Yes, in Prof Ng’s notation the dz means a partial derivative of L w.r.t. z, so this is just the Chain Rule in action.

Remember that the formulas are:

L(y, a) = -y log(a) - (1 - y) log(1 - a)
a = \sigma(z)
z = \displaystyle \sum_{i = 1}^{n_x} w_i x_i + b

And then we have the full Chain Rule expression:

\displaystyle \frac {\partial L}{\partial w_1} = \frac {\partial L}{\partial a}\frac {\partial a}{\partial z}\frac {\partial z}{\partial w_1}

Of course you can see from the above equations that:

\displaystyle \frac {\partial z}{\partial w_1} = x_1

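And we can simplify the full Chain Rule expression by grouping its first two factors into dz:

dz = \displaystyle \frac {\partial L}{\partial z} = \displaystyle \frac {\partial L}{\partial a} \frac {\partial a}{\partial z}

Substituting both results into the Chain Rule expression gives the formula from the lecture:

dw_1 = \displaystyle \frac {\partial L}{\partial w_1} = x_1 \, dz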
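If it helps to see the Chain Rule with numbers, here is a minimal sketch that builds dz from the two factors above, multiplies by x1, and compares the result with a finite-difference estimate of the derivative of L with respect to w1. The input values and the helper names `sigmoid` and `loss` are made up for illustration and are not taken from the course notebooks.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    # L(y, a) = -y log(a) - (1 - y) log(1 - a), with a = sigmoid(z)
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    a = sigmoid(z)
    return -y * math.log(a) - (1 - y) * math.log(1 - a)

# Made-up example values (assumptions, not course data)
w = [0.5, -0.3]
x = [2.0, 1.5]
b = 0.1
y = 1.0

# Forward pass
z = sum(wi * xi for wi, xi in zip(w, x)) + b
a = sigmoid(z)

# Chain Rule factors
dL_da = -y / a + (1 - y) / (1 - a)   # dL/da
da_dz = a * (1 - a)                  # da/dz for the sigmoid
dz = dL_da * da_dz                   # "dz" in Prof Ng's notation, i.e. dL/dz
dw1 = x[0] * dz                      # dL/dw1 = x1 * dz

# Central finite-difference estimate of dL/dw1
eps = 1e-6
w_plus = [w[0] + eps, w[1]]
w_minus = [w[0] - eps, w[1]]
dw1_numeric = (loss(w_plus, b, x, y) - loss(w_minus, b, x, y)) / (2 * eps)

print(dw1, dw1_numeric)
```

The two printed numbers should agree to several decimal places. With these formulas dz also simplifies to a - y, so in code `dz` is both a variable name and the value of that partial derivative.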


@paulinpaloalto I think I’m missing something in your explanation.
How does dz/dw1 equal x1?

Hi, Precious1.

It’s just the mathematical expression of what Paul sir mentioned in his explanation above.

If you multiply dz by x1, then you get dw1. Right? :)
It’s from the given formula:
[Image: Capture1]

It’s a partial derivative.

\begin{aligned} z &= \sum_{i=1}^{n_x}w_i x_i + b \\ &= w_1x_1 + w_2x_2 + w_3x_3 + \dots + b \end{aligned}

Then, what we want is the partial derivative with respect to w_1. Every term that does not contain w_1 is treated as a constant, so its derivative is 0:

\frac{\partial z}{\partial w_1} = x_1 + 0 + 0 + ... + 0 = x_1
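The same result can be checked numerically. This is only a sketch with made-up values, and `z_of` is a hypothetical helper name: nudge w_1 by a small amount, keep everything else fixed, and see how much z changes.

```python
def z_of(w, x, b):
    # z = w_1 x_1 + w_2 x_2 + ... + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Made-up example values (assumptions for illustration)
w = [0.5, -0.3, 0.8]
x = [2.0, 1.5, -1.0]
b = 0.1
eps = 1e-6

# Change only w_1; w_2, w_3 and b stay fixed, so their terms cancel out
w_nudged = [w[0] + eps] + w[1:]
dz_dw1 = (z_of(w_nudged, x, b) - z_of(w, x, b)) / eps

print(dz_dw1, x[0])
```

Both printed values should be approximately 2.0, which is x_1 in this example.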