Hi everyone,

If the sigmoid function is 1/(1+exp(-z)), and its derivative is sigmoid(1-sigmoid),

then why, for ŷ = σ(w1x1 + w2x2 + b), does ∂ŷ/∂w1 become:

ŷ ( 1 - ŷ) x1

on page 397 of the third week's slides, in the Classification with Perceptron - Calculating the derivatives video?

Hi @Admiral1994,

To make this easier to see, let’s set z = w_1x_1 + w_2x_2 + b, so that ŷ = σ(z).

To calculate ∂ŷ / ∂w1, we take the partial derivative of ŷ with respect to w1, meaning every variable other than w1 (here w2, b, x1, and x2) is treated as a constant.

Using the chain rule, we have

\frac{\partial \hat{y}}{\partial w_1} = \frac{\partial \hat{y}}{\partial z} * \frac{\partial z}{\partial w_1}

Calculating each factor on the right-hand side, we get

\frac{\partial \hat{y}}{\partial z} = \sigma(z) * (1 - \sigma(z))

\frac{\partial z}{\partial w_1} = \frac{\partial}{\partial w_1}(w_1x_1 + w_2x_2 + b) = x_1

So we have:

\frac{\partial \hat{y}}{\partial w_1} = \sigma(z) * (1 - \sigma(z)) * x_1

Since we set ŷ = σ(z), we can substitute it back in, giving us:

\frac{\partial \hat{y}}{\partial w_1} = \hat{y} * (1 - \hat{y}) * x_1
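If you want to convince yourself numerically, here is a minimal sketch that compares the formula above against a central finite-difference estimate. The function names (`sigmoid`, `y_hat`, `dyhat_dw1`) and the input values are my own choices for illustration, not anything taken from the course materials.

```python
import math

def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + math.exp(-z))

def y_hat(w1, w2, b, x1, x2):
    # y_hat = sigma(z) with z = w1*x1 + w2*x2 + b
    return sigmoid(w1 * x1 + w2 * x2 + b)

def dyhat_dw1(w1, w2, b, x1, x2):
    # Analytic result from the chain rule: y_hat * (1 - y_hat) * x1
    y = y_hat(w1, w2, b, x1, x2)
    return y * (1.0 - y) * x1

# Arbitrary values chosen only for illustration (an assumption, not course data).
w1, w2, b, x1, x2 = 0.5, -0.3, 0.1, 2.0, 1.5

# Central finite difference: nudge only w1, keep everything else fixed.
eps = 1e-6
numeric = (y_hat(w1 + eps, w2, b, x1, x2) - y_hat(w1 - eps, w2, b, x1, x2)) / (2 * eps)
analytic = dyhat_dw1(w1, w2, b, x1, x2)

print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed values should agree to many decimal places. The extra factor of x1 shows up because nudging w1 changes z by x1 times the nudge, which is exactly the ∂z/∂w1 term in the chain rule.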

I hope this helps!

Thanks a lot, that was really helpful. I appreciate it.

Glad I could help!