The first paragraphs of the assignment contradict the ResNet paper:
“We argue that this optimization difficulty is unlikely to be caused by vanishing gradients. These plain networks are trained with BN (Batch Normalization), which ensures forward propagated signals to have non-zero variances. We also verify that the backward propagated gradients exhibit healthy norms with BN. So neither forward nor backward signals vanish.”
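The quoted claim about forward signals can be sketched numerically. The toy NumPy simulation below (my own illustration, not an experiment from the paper) pushes a batch through a deep stack of linear + ReLU layers with a deliberately poor weight scale: without normalization the activation variance collapses toward zero, while re-standardizing each layer (the core of what BN does at training time) keeps it on the order of 1.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 64))  # a batch of 1000 examples, 64 features
depth = 30

def forward(x, use_bn):
    h = x
    for _ in range(depth):
        # deliberately small weight scale to provoke vanishing activations
        W = rng.normal(scale=0.05, size=(64, 64))
        h = h @ W
        if use_bn:
            # batch-norm-style re-standardization of each unit over the batch
            h = (h - h.mean(axis=0)) / (h.std(axis=0) + 1e-5)
        h = np.maximum(h, 0.0)  # ReLU
    return h

print(forward(x, use_bn=False).std())  # collapses toward zero
print(forward(x, use_bn=True).std())   # stays on the order of 1
```

This is exactly why the paper can rule out vanishing forward signals for its BN-trained plain networks: the normalization re-inflates the variance at every layer regardless of the weight scale.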
So if the problem were vanishing/exploding gradients, it would be more logical to use better initialization methods (as Prof. Andrew mentions) together with Batch Normalization.
But the problem ResNet solves is something else: the degradation problem, where deeper plain networks get higher training error than shallower ones even though gradients are healthy.
So, in this view, the first paragraphs of the assignment somewhat contradict the source paper.
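The paper's own explanation for why shortcuts help with that other problem is that they make the identity mapping trivial to represent: the block computes y = x + F(x), so the solver only has to push the residual F toward zero instead of learning an identity through a stack of nonlinear layers. A minimal NumPy sketch of that idea (hypothetical weight shapes, not the paper's architecture):

```python
import numpy as np

def residual_block(x, W1, W2):
    # y = x + F(x): the shortcut carries the input around the weighted layers
    h = np.maximum(x @ W1, 0.0)  # inner ReLU layer of the residual branch F
    return x + h @ W2

x = np.random.default_rng(1).normal(size=(4, 8))
W1 = np.zeros((8, 8))
W2 = np.zeros((8, 8))
# with the residual weights at zero, the block is exactly the identity,
# something a plain (non-residual) stack would have to learn explicitly
print(np.allclose(residual_block(x, W1, W2), x))  # True
```

This is the sense in which ResNet targets an optimization difficulty distinct from vanishing gradients.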