Can someone please explain how the derivative of a with respect to z works out to a(1-a) in the video Logistic Regression Gradient Descent (Week 2)?

Welcome to the community!

The derivative of the sigmoid function (the last layer in this example) is a(1-a). Here is why:

$$
\begin{aligned}
a = \sigma(z) &= \frac{1}{1+e^{-z}} \\
\frac{da}{dz} &= \frac{d}{dz}\left(1+e^{-z}\right)^{-1} \\
&= -1 \cdot \left(1+e^{-z}\right)^{-2} \cdot \left(-e^{-z}\right) \\
&= \frac{e^{-z}}{\left(1+e^{-z}\right)^{2}} \\
&= \frac{1}{1+e^{-z}} \cdot \frac{e^{-z}}{1+e^{-z}} \\
&= \frac{1}{1+e^{-z}} \cdot \left(\frac{1+e^{-z}}{1+e^{-z}} - \frac{1}{1+e^{-z}}\right) \\
&= \frac{1}{1+e^{-z}} \cdot \left(1 - \frac{1}{1+e^{-z}}\right) \\
&= a(1-a)
\end{aligned}
$$
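If you want to convince yourself numerically, here is a minimal sketch that compares a(1-a) against a finite-difference estimate of the slope of the sigmoid. The function names (`sigmoid`, `sigmoid_derivative`, `numerical_derivative`) and the step size `h` are my own choices for illustration, not something from the course notebooks.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: a = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    """Analytic derivative from the derivation: da/dz = a * (1 - a)."""
    a = sigmoid(z)
    return a * (1.0 - a)

def numerical_derivative(f, z, h=1e-6):
    """Central finite-difference approximation of df/dz (h is an assumed step size)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

# Compare the two at a few sample points; the columns should agree closely.
for z in (-2.0, 0.0, 0.5, 3.0):
    analytic = sigmoid_derivative(z)
    numeric = numerical_derivative(sigmoid, z)
    print(f"z={z:+.1f}  a(1-a)={analytic:.8f}  finite diff={numeric:.8f}")
```

At z = 0 the sigmoid gives a = 0.5, so a(1-a) = 0.25, which is the steepest point of the curve.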

**But** if the last layer **isn't sigmoid**, this equation has to change: each activation function has its own derivative, so da/dz must be recomputed for whichever activation the last layer uses.
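As a rough illustration of how da/dz differs between activations, here is a small sketch for tanh and ReLU. These two are my own picks as examples and the function names are hypothetical; the ReLU value at exactly z = 0 is a convention I am assuming, since ReLU is not differentiable there.

```python
import math

def tanh_derivative(z):
    """For a = tanh(z), the derivative is da/dz = 1 - a^2."""
    a = math.tanh(z)
    return 1.0 - a * a

def relu_derivative(z):
    """For a = max(0, z), the derivative is 1 when z > 0 and 0 when z < 0.
    At z == 0 it is undefined; returning 0 is an assumed convention."""
    return 1.0 if z > 0 else 0.0

# Print both derivatives at a few sample points.
for z in (-1.0, 0.5, 2.0):
    print(f"z={z:+.1f}  tanh'={tanh_derivative(z):.6f}  relu'={relu_derivative(z):.1f}")
```

Neither of these reduces to a(1-a), which is why the gradient formula depends on the activation.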

Cheers,

Abdelrahman


Thank you for the explanation.