Hey there!
In the lecture "Why ResNets Work?", Professor Ng mentioned the vanishing and exploding gradient problems and explained how residual connections help alleviate both of them.
I understand how adding the earlier activation (a[l]) into the later layer, so that a[l+2] = g(z[l+2] + a[l]), helps with vanishing gradients: the gradient can flow back through the skip connection, so the earlier layer (a[l]) still receives a non-zero gradient and the model keeps learning.
What I am having a hard time understanding is how residual connections help with exploding gradients. If the activations are already very large, adding those two matrices will simply produce an output matrix with even larger values, won't it?
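To make my question concrete, here is a toy NumPy sketch of how I picture a residual block's forward pass. It is just my own illustration of the a[l+2] = g(z[l+2] + a[l]) formula from the lecture; the layer sizes, near-zero weights, and ReLU activation are assumptions for the example, not anything from the course code:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def residual_block(a_l, W1, b1, W2, b2):
    # Main path: two linear layers with ReLU, as in the plain block from the lecture.
    z1 = W1 @ a_l + b1      # z[l+1]
    a1 = relu(z1)           # a[l+1]
    z2 = W2 @ a1 + b2       # z[l+2]
    # Skip connection: a[l] is added to z[l+2] before the activation,
    # i.e. a[l+2] = g(z[l+2] + a[l]).
    return relu(z2 + a_l)

# Toy check: with near-zero weights the main path contributes almost nothing,
# yet a[l] still passes through the block thanks to the skip connection.
rng = np.random.default_rng(0)
n = 4
a_l = relu(rng.standard_normal((n, 1)))            # some non-negative activation
W1 = 1e-3 * rng.standard_normal((n, n)); b1 = np.zeros((n, 1))
W2 = 1e-3 * rng.standard_normal((n, n)); b2 = np.zeros((n, 1))
print(residual_block(a_l, W1, b1, W2, b2))         # roughly equal to a_l
```

This shows the vanishing-gradient side I described above, but it also shows where I get stuck: if a[l] and z[l+2] were both huge instead of small, the sum would be huge as well.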
I would appreciate any help.
Cheers!