Vanishing/Exploding gradients C2W1

Christian_Simonis · January 2, 2023, 6:35am

Hi there

Vanishing gradients occur when the gradients with respect to the parameters of a deep neural network (DNN) become so small that the model learns only very slowly and it seems "nothing" is happening.

Exploding gradients describe the opposite situation: the gradients become very large, causing e.g. numerical issues.
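To make both cases concrete, here is a minimal sketch in plain Python. It assumes a toy network that is just a chain of scalar linear layers sharing one weight; the function name `input_gradient` and the chosen weights and depth are purely illustrative. By the chain rule the gradient is a product of one factor per layer, so it shrinks or grows exponentially with depth.

```python
def input_gradient(weight, depth):
    """Gradient of the output w.r.t. the input for a chain of
    `depth` scalar linear layers a = weight * a_prev."""
    grad = 1.0
    for _ in range(depth):
        # Chain rule: each layer contributes one factor `weight`.
        grad *= weight
    return grad

depth = 50
vanishing = input_gradient(0.5, depth)  # factor below 1 at every layer
exploding = input_gradient(1.5, depth)  # factor above 1 at every layer

print(f"weight 0.5, {depth} layers: {vanishing:.3e}")
print(f"weight 1.5, {depth} layers: {exploding:.3e}")
```

In a real network the per-layer factors are weight matrices and activation derivatives rather than one scalar, but the same multiplicative effect applies.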

You can mitigate this, e.g. by using activation functions like ReLU; see also this thread:

Activation functions - #2 by Christian_Simonis
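As a rough illustration of why the activation function matters, the sketch below compares the product of activation derivatives across many layers for sigmoid and ReLU. The depth and the pre-activation value `z` are hypothetical, and weights are ignored on purpose to isolate the activation's contribution.

```python
import math

def sigmoid_grad(z):
    """Derivative of the sigmoid; it never exceeds 0.25."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

depth = 30
z = 0.5  # hypothetical pre-activation, assumed identical in every layer

# Backpropagation multiplies one activation derivative per layer.
sigmoid_product = sigmoid_grad(z) ** depth
relu_product = relu_grad(z) ** depth

print(f"sigmoid: {sigmoid_product:.3e}")
print(f"ReLU:    {relu_product:.3e}")
```

Note that ReLU passes the gradient through unchanged only for positive pre-activations; for negative ones its derivative is zero, so it reduces the vanishing-gradient problem rather than removing every failure mode.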

Further best practices for mitigation include careful weight initialisation, weight decay and batch normalization to stabilise the activations. It's also possible to clip the weights with bounded optimization or to reduce the learning rate if you see gradients exploding. It also makes sense to monitor your gradient flow, see also this thread!
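Monitoring gradient flow can be as simple as logging a norm per layer at each training step. The sketch below does that in plain Python and also rescales the gradients when their global norm exceeds a threshold. Please note the assumptions: clipping the gradients by norm is a related, commonly used technique and not the weight clipping mentioned above, and the function names, the `grads` dictionary and the threshold are made up for illustration.

```python
import math

def layer_norms(grads):
    """L2 norm of the gradient of each layer, for monitoring."""
    return {name: math.sqrt(sum(g * g for g in values))
            for name, values in grads.items()}

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients so that their global L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for values in grads.values() for g in values))
    scale = min(1.0, max_norm / total) if total > 0 else 1.0
    clipped = {name: [g * scale for g in values]
               for name, values in grads.items()}
    return clipped, total

# Hypothetical gradients for two layers (illustrative values only).
grads = {"layer1": [3.0, 4.0], "layer2": [0.0, 12.0]}

norms = layer_norms(grads)
clipped, total = clip_by_global_norm(grads, max_norm=1.0)

print("per-layer norms:", norms)
print("global norm before clipping:", total)
print("per-layer norms after clipping:", layer_norms(clipped))
```

Per-layer norms that shrink towards zero in the early layers hint at vanishing gradients, while norms that keep growing from step to step hint at exploding ones.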

If you want to read more, including about additional mitigation techniques, feel free to take a look at this source.

Best
Christian
