In the video: Logistic Regression Gradient Descent.
I want to know why the derivative of L with respect to w_1 equals x_1 dz.
I also want to know whether the dz in this equation refers to the Python variable dz in the code or to the mathematical derivative notation.
Thanks in advance.
Yes, in Prof Ng’s notation dz means the partial derivative of L with respect to z, so this is just the Chain Rule in action.
Remember that the formulas are:
L(y, a) = -y log(a) - (1 - y) log(1 - a)
a = \sigma(z)
z = \displaystyle \sum_{i = 1}^{n_x} w_i x_i + b
And then we have the full Chain Rule expression:
\displaystyle \frac {\partial L}{\partial w_1} = \frac {\partial L}{\partial a}\frac {\partial a}{\partial z}\frac {\partial z}{\partial w_1}
Of course you can see from the above equations that:
\displaystyle \frac {\partial z}{\partial w_1} = x_1
If we define dz as the first two factors of that product:
dz = \displaystyle \frac {\partial L}{\partial z} = \frac {\partial L}{\partial a} \frac {\partial a}{\partial z}
then the full Chain Rule expression simplifies to:
\displaystyle \frac {\partial L}{\partial w_1} = \frac {\partial L}{\partial z} \frac {\partial z}{\partial w_1} = x_1 \, dz
which is exactly the x_1 dz asked about in the question.
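To tie this back to the original question: the dz in the formula is the same quantity the Python variable dz holds in the lecture code, namely \frac{\partial L}{\partial z} (which works out to a - y for this loss). Here is a minimal NumPy sketch for a single training example, with made-up feature and weight values, showing how x_1 dz becomes dw1 in code. This is just an illustrative sketch, not the course's exact implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Single training example with two features (made-up values, for illustration only)
x1, x2 = 2.0, -1.0
w1, w2, b = 0.5, -0.3, 0.1
y = 1.0

# Forward pass: z -> a -> L
z = w1 * x1 + w2 * x2 + b
a = sigmoid(z)
L = -(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

# Backward pass via the Chain Rule
dz = a - y        # dz = (dL/da) * (da/dz), which simplifies to a - y for this loss
dw1 = x1 * dz     # dL/dw1 = dz * dz/dw1 = dz * x1
dw2 = x2 * dz     # dL/dw2 = dz * dz/dw2 = dz * x2
db = dz           # dL/db  = dz * dz/db  = dz * 1

print(dz, dw1, dw2, db)
```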
@paulinpaloalto I think I’m missing something in your explanation.
How does \frac{\partial z}{\partial w_1} equal x_1?
Hi, Precious1.
It’s just the mathematical expression of what Paul mentioned in his explanation above: since \frac{\partial z}{\partial w_1} = x_1, multiplying dz by x_1 gives dw_1.
It comes from the given formula for z, by taking the partial derivative:
\begin{aligned}
z &= \sum_{i=1}^{n_x}w_i x_i + b \\
&= w_1x_1 + w_2x_2 + w_3x_3 + ... + b\\
\end{aligned}
Then, what we want is the partial derivative with respect to w_1, so:
\frac{\partial z}{\partial w_1} = x_1 + 0 + 0 + ... + 0 = x_1
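As a quick sanity check on \frac{\partial z}{\partial w_1} = x_1, here is a small numerical sketch (again with made-up values) that perturbs w_1 and confirms z changes at a rate of exactly x_1, since every other term in the sum is constant with respect to w_1:

```python
# Finite-difference check that dz/dw1 = x1 (made-up values for illustration)
x1, x2 = 2.0, -1.0
w2, b = -0.3, 0.1

def z(w1):
    return w1 * x1 + w2 * x2 + b

w1, eps = 0.5, 1e-6
numerical_derivative = (z(w1 + eps) - z(w1 - eps)) / (2 * eps)
print(numerical_derivative)  # approximately 2.0, i.e. x1
```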