Can someone please explain how the derivative of a with respect to z works out to a(1-a) in the video Logistic Regression Gradient Descent (Week 2)?
Welcome to the community!
The derivative of the sigmoid function (the activation of the last layer in this example) is a(1-a). Here is why. The sigmoid function is

$$
a = \sigma(z) = \frac{1}{1+e^{-z}}
$$

so that, by the chain rule,

$$
\begin{aligned}
\frac{da}{dz} &= \frac{d}{dz}\,(1+e^{-z})^{-1} \\
&= -1 \cdot (1+e^{-z})^{-2} \cdot (-e^{-z}) \\
&= \frac{e^{-z}}{(1+e^{-z})^{2}} \\
&= \frac{1}{1+e^{-z}} \cdot \frac{e^{-z}}{1+e^{-z}} \\
&= \left(\frac{1}{1+e^{-z}}\right) \cdot \left(\frac{1+e^{-z}}{1+e^{-z}} - \frac{1}{1+e^{-z}}\right) \\
&= \left(\frac{1}{1+e^{-z}}\right) \cdot \left(1 - \frac{1}{1+e^{-z}}\right) \\
&= a(1-a)
\end{aligned}
$$
But if the last layer's activation isn't sigmoid, this result no longer applies: each activation function has its own derivative, so the equations must be changed to match the activation function used in the last layer.
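If you want a quick sanity check of the sigmoid result, here is a minimal sketch that compares a(1-a) with a central finite-difference estimate of da/dz. The function names and the step size `h` are my own choices for illustration, not something from the course code.

```python
import math


def sigmoid(z):
    """Sigmoid activation: a = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad_analytic(z):
    """Derivative from the derivation above: da/dz = a * (1 - a)."""
    a = sigmoid(z)
    return a * (1.0 - a)


def sigmoid_grad_numeric(z, h=1e-6):
    """Central finite-difference estimate of da/dz (h is an assumed step size)."""
    return (sigmoid(z + h) - sigmoid(z - h)) / (2.0 * h)


# Compare the two at a few sample points; the columns should agree closely.
for z in (-3.0, -0.5, 0.0, 0.5, 3.0):
    analytic = sigmoid_grad_analytic(z)
    numeric = sigmoid_grad_numeric(z)
    print(f"z={z:+.1f}  a(1-a)={analytic:.6f}  finite diff={numeric:.6f}")
```

The same kind of check should work for any other activation: swap in that function and its own derivative formula, and the two columns should still match.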
Cheers,
Abdelrahman
Thank you for the explanation.