Hi everyone,
If the sigmoid function is 1/(1+exp(-z)), and its derivative is sigmoid(z)(1 - sigmoid(z)),
then why, for ŷ = σ(w1x1 + w2x2 + b), does ∂ŷ/∂w1 become
ŷ(1 - ŷ)x1
on page 397 of the third week's slides (Classification with Perceptron - Calculating the derivatives video)?
Hi @Admiral1994,
To make this easier to see, let’s set z = w_1x_1 + w_2x_2 + b, so that ŷ = σ(z).
To calculate ∂ŷ/∂w1, we take the partial derivative of ŷ with respect to w1, meaning every other variable (w2, b, x1, x2) is treated as a constant.
Using the chain rule, we have
\frac{\partial \hat{y}}{\partial w_1} = \frac{\partial \hat{y}}{\partial z} * \frac{\partial z}{\partial w_1}
Calculating each factor on the right-hand side, we get
\frac{\partial \hat{y}}{\partial z} = \sigma(z) * (1 - \sigma(z))
\frac{\partial z}{\partial w_1} = \frac{\partial}{\partial w_1}(w_1x_1 + w_2x_2 + b) = x_1
So we have:
\frac{\partial \hat{y}}{\partial w_1} = \sigma(z) * (1 - \sigma(z)) * x_1
Since we set ŷ = σ(z), we can substitute it back in, giving us:
\frac{\partial \hat{y}}{\partial w_1} = \hat{y} * (1 - \hat{y}) * x_1
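If you'd like to convince yourself numerically, here is a minimal Python sketch (not from the course materials) that compares the formula above against a central finite-difference estimate. The function names and the input values are arbitrary choices made up for illustration.

```python
import math

def sigmoid(z):
    # Standard logistic function: 1 / (1 + exp(-z))
    return 1.0 / (1.0 + math.exp(-z))

def y_hat(w1, w2, b, x1, x2):
    # Perceptron output: sigmoid applied to z = w1*x1 + w2*x2 + b
    return sigmoid(w1 * x1 + w2 * x2 + b)

def dyhat_dw1(w1, w2, b, x1, x2):
    # Analytic result from the chain rule: y_hat * (1 - y_hat) * x1
    y = y_hat(w1, w2, b, x1, x2)
    return y * (1.0 - y) * x1

# Arbitrary example values (assumptions, chosen only for the check)
w1, w2, b, x1, x2 = 0.5, -0.3, 0.1, 2.0, 1.5

# Central finite difference in w1, holding everything else constant
eps = 1e-6
numeric = (y_hat(w1 + eps, w2, b, x1, x2) - y_hat(w1 - eps, w2, b, x1, x2)) / (2 * eps)
analytic = dyhat_dw1(w1, w2, b, x1, x2)

print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed values should agree to many decimal places. The extra factor of x1 appears because we differentiate with respect to w1 rather than z.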
I hope this helps!
Thanks a lot, that was really helpful. I appreciate it.
Glad I could help!