Hi everyone! I just watched the lectures on derivatives. The calculus part is quite understandable to me, but what I don’t understand is why da/dz = a(1-a). Can someone explain that?
Hello Ashot Melkonyan,
Welcome to the community and thank you for your question.
The expression da/dz = a(1-a) is the derivative of the activation a = sigmoid(z) with respect to the weighted input z. It is one of the core steps of backpropagation, where it appears as a factor in the chain rule used to compute the gradients that update the weights during training.
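Starting from a = sigmoid(z) = 1 / (1 + exp(-z)), differentiating with respect to z gives:
- da/dz = exp(-z) / (1 + exp(-z))^2
- da/dz = [1 / (1 + exp(-z))] * [exp(-z) / (1 + exp(-z))]
The first factor is a, and the second is 1 - a, because 1 - 1 / (1 + exp(-z)) = exp(-z) / (1 + exp(-z)). With this simplification, the final value is:
da/dz = a(1-a)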
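If it helps to see this numerically, here is a minimal sketch that compares a(1-a) against a central finite-difference estimate of the slope of sigmoid. The function names (`sigmoid`, `sigmoid_grad`, `numeric_grad`) and the step size `h` are my own choices for illustration, not something from the course code.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: a = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Analytic derivative da/dz = a(1-a)."""
    a = sigmoid(z)
    return a * (1.0 - a)


def numeric_grad(f, z, h=1e-5):
    """Central finite-difference approximation of df/dz (h is an assumed step size)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


# Compare the two estimates at a few sample points.
for z in (-2.0, 0.0, 0.5, 3.0):
    analytic = sigmoid_grad(z)
    numeric = numeric_grad(sigmoid, z)
    print(f"z={z:+.1f}  a(1-a)={analytic:.6f}  finite diff={numeric:.6f}")
```

The two columns should agree to several decimal places, which is a quick sanity check that the closed form matches the actual slope of the curve.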
Right! That is just the derivative of sigmoid. Here’s a StackExchange thread that also gives the derivation. You can find others by googling “derivative of sigmoid”.