Vanishing/Exploding Gradients

Excuse me! According to this function, `w[l] = np.random.randn(shape) * np.sqrt(2/n[l-1])`, how does this help with vanishing or exploding gradients? It wasn't clear to me. :man_facepalming: Also, does changing the square-root formula in accordance with the activation function affect the final output that much?
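To make my question concrete, here's how I understand that initialization is applied (a minimal sketch; the layer sizes and the ReLU/tanh pairing are my own example assumptions, not taken from the lecture code):

```python
import numpy as np

# Hypothetical layer sizes: n[l-1] = 4 inputs into a layer of n[l] = 3 units
n_prev, n_curr = 4, 3

# He initialization: scaling by sqrt(2 / n[l-1]) keeps the variance of
# z = W a roughly constant from layer to layer when the activation is ReLU,
# so activations/gradients neither shrink toward 0 nor blow up as depth grows
W_relu = np.random.randn(n_curr, n_prev) * np.sqrt(2 / n_prev)

# For tanh, Xavier initialization uses sqrt(1 / n[l-1]) instead
W_tanh = np.random.randn(n_curr, n_prev) * np.sqrt(1 / n_prev)
```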

Another point: what is epsilon in the numerical approximation of gradients? How does this help with the issue of vanishing/exploding gradients, and with verifying a correct implementation of backpropagation?
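Here is where I think epsilon appears, based on the two-sided (centered) difference formula $\frac{dJ}{d\theta} \approx \frac{J(\theta + \varepsilon) - J(\theta - \varepsilon)}{2\varepsilon}$ with $\varepsilon \approx 10^{-7}$ (a toy sketch with a made-up cost function, not the course code):

```python
import numpy as np

def J(theta):
    # Toy stand-in for the network's cost function (my own example)
    return theta ** 3

theta = 2.0
eps = 1e-7  # the epsilon in question: a tiny perturbation of the parameter

# Centered-difference approximation of dJ/dtheta
grad_approx = (J(theta + eps) - J(theta - eps)) / (2 * eps)

# What a correct backprop should produce for this J
grad_backprop = 3 * theta ** 2

# Relative difference: small (e.g. < 1e-7) suggests backprop is implemented correctly
rel_diff = abs(grad_approx - grad_backprop) / (abs(grad_approx) + abs(grad_backprop))
print(grad_approx, grad_backprop, rel_diff)
```

As I understand it, this check only validates that backprop computes the right derivatives; it doesn't by itself prevent vanishing or exploding gradients. Is that right?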

Thanks in advance

For gradient checking, I found this really helpful.