Vanishing gradients can occur when the derivatives of the activations are small. For functions like sigmoid and tanh, the derivatives are small for large values of |z|. But what weights / activation functions will cause exploding gradients? The derivatives of common activation functions like sigmoid, tanh, and ReLU never exceed 1, so I don't see how exploding gradients are possible regardless of the values of |z|.
Hi, @Max_Rivera:
When computing the gradients, the derivatives of the activation functions are not the only terms that come up: the chain rule also multiplies in the weight matrix of each layer, so repeated products of large weights can make the gradients blow up even when the activation derivatives never exceed 1. Let me know if this explanation helps (check the links too).
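Here is a minimal NumPy sketch of that idea (not from the course materials or the linked posts; the network shape, the `weight_scale` parameter, and the function name are just illustrative). It backpropagates through a deep ReLU stack, where the activation derivative is never more than 1, and shows that the gradient norm still explodes or vanishes depending on how large the weights are:

```python
import numpy as np

def gradient_norm_wrt_input(weight_scale, num_layers=50, width=64, seed=0):
    """Backpropagate through a deep ReLU stack and return the norm of the
    gradient of (sum of outputs) with respect to the input."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)

    # Forward pass: cache the pre-activations z[l] for the ReLU derivatives.
    weights, zs, a = [], [], x
    for _ in range(num_layers):
        W = weight_scale * rng.standard_normal((width, width)) / np.sqrt(width)
        z = W @ a
        a = np.maximum(z, 0.0)          # ReLU
        weights.append(W)
        zs.append(z)

    # Backward pass: each layer multiplies the gradient by ReLU'(z) (0 or 1)
    # and then by W^T -- the weights enter the product at every layer.
    grad = np.ones(width)
    for W, z in zip(reversed(weights), reversed(zs)):
        grad = W.T @ (grad * (z > 0))

    return np.linalg.norm(grad)

for scale in (0.5, 1.0, 2.0):
    print(f"weight scale {scale}: gradient norm ~ {gradient_norm_wrt_input(scale):.3e}")
```

With small weights the gradient norm shrinks toward zero, and with large weights it grows by orders of magnitude, even though the ReLU derivative contributes at most a factor of 1 per layer. That is why the scale of the weights, not just the choice of activation, determines whether gradients vanish or explode.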
Good luck with the specialization