W2_A1_Video Lecture on Derivatives

Hi everyone! I just watched the lectures on derivatives. The calculus part is quite understandable to me, but the part that I don’t understand is why da/dz = a(1-a). Can someone explain that?

The expression da/dz = a(1-a) is the derivative of the sigmoid activation ‘a’ w.r.t. the weighted input ‘z’. It shows up as one factor in the chain rule during backpropagation, which is one of the core steps in computing the gradients used to update the weights while training neural network models.

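The complete derivation using the chain rule, with a = sigmoid(z) = 1 / (1 + exp(-z)) and u = 1 + exp(-z) so that a = 1/u, is:

  • da/dz = da/du * du/dz

  • da/dz = (-1/u^2) * (-exp(-z)) = exp(-z) / (1 + exp(-z))^2

Since a = 1 / (1 + exp(-z)) and 1 - a = exp(-z) / (1 + exp(-z)), the right-hand side is exactly a * (1 - a), so after simplification the final value is:

  • da/dz = a(1-a)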
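
If it helps to see this numerically, here is a minimal sketch that compares a(1-a) against a finite-difference estimate of the sigmoid's slope. The function names (sigmoid, sigmoid_grad, numeric_grad), the step size h and the sample z values are my own choices for illustration, not something from the lecture.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: a = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Analytic derivative da/dz = a * (1 - a)."""
    a = sigmoid(z)
    return a * (1.0 - a)

def numeric_grad(f, z, h=1e-5):
    """Central-difference estimate of df/dz (h is an arbitrary small step)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

# Compare the two at a few hypothetical sample points.
for z in (-2.0, 0.0, 0.5, 3.0):
    analytic = sigmoid_grad(z)
    numeric = numeric_grad(sigmoid, z)
    print(f"z={z:+.1f}  a(1-a)={analytic:.6f}  finite diff={numeric:.6f}")
```

The two columns should agree to several decimal places. At z = 0 the sigmoid gives a = 0.5, so da/dz = 0.25, which is the largest value the derivative takes.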

Right! That is just the derivative of sigmoid. Here’s a StackExchange thread that also gives the derivation. You can find others by googling “derivative of sigmoid”.

