# Activation function Lecture Video 4:13

Dear Mentor,

We cannot understand the intuition behind why gradient descent slows down when the slope is close to zero. Usually, a slope of zero means gradient descent has converged to the global minimum, right? So how does gradient descent slow down when the slope of the function is close to zero? Can you please help us understand this?

Now, one of the downsides of both the sigmoid function and the tanh function is that if z is either very large or very small, then the gradient or the derivative or the slope of this function becomes very small. So if z is very large or z is very small, the slope of the function ends up being close to 0. And so this can slow down gradient descent.
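To see this numerically, here is a small sketch (not from the course) that evaluates the sigmoid's derivative, sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)), at a few values of z. The derivative is largest at z = 0 and collapses toward zero as |z| grows:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(z)
    return s * (1.0 - s)

# The slope shrinks rapidly as z moves away from 0.
for z in [0.0, 2.0, 5.0, 10.0]:
    print(f"z = {z:5.1f}   sigmoid'(z) = {sigmoid_prime(z):.6f}")
```

At z = 0 the slope is 0.25, but at z = 10 it is already below 0.0001, so any gradient flowing through that unit is scaled almost to zero.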

W and b are updated using the derivatives dW and db. If you look at the curve given in the course, you see that it is almost horizontal for very large or very small values of z. This means that the derivatives at these values are almost zero. If you update W and b with such tiny values of dW and db, the new values will be almost identical to the old W and b. Your algorithm will therefore only move very slowly.
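As a minimal illustration (the learning rate and gradient values below are made up, not from the course), here is the standard update rule w := w - alpha * dw applied with a "healthy" gradient and with a near-zero gradient from a flat region of the activation:

```python
alpha = 0.1          # learning rate (illustrative value)
w = 3.0              # current parameter value

dw_steep = 0.5       # gradient from a steep region of the curve
dw_flat = 1e-4       # gradient from a nearly horizontal region

w_after_steep = w - alpha * dw_steep   # steps by 0.05
w_after_flat = w - alpha * dw_flat     # steps by only 0.00001

print(w_after_steep)   # noticeably different from w
print(w_after_flat)    # almost unchanged from w
```

With the flat-region gradient, each iteration barely changes w, so many more iterations are needed to make the same progress. This is the sense in which a near-zero slope "slows down" gradient descent.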

If you look at the curve, you will also see that it is not convex. Here I think you have to distinguish between two different curves discussed in the course: the activation function (sigmoid or tanh), whose near-zero slope at large |z| shrinks the gradients, and the cost function, whose minimum gradient descent is actually trying to reach. A slope of zero at the minimum of the cost function means convergence; a near-zero slope of the activation function just means tiny gradients and slow progress.