I am a bit confused about what's going on with ResNets. We have the main path and the shortcut, as seen on the slide. Is the general flow of the neural network the main path, with the difference being that we copy a[l] and inject it right before the next ReLU?
So we calculate z[l+2], and when we apply the ReLU we have g(z[l+2] + a[l])? What about the calculations in the middle? Do we just copy a[l] while the flow of the NN continues as normal?
I'm not sure it's correct to think of the "general flow" as being the main path. The point of this architecture is that there are two parallel paths, both of which contribute (perhaps in different ways) to the result. The key is that the input to the last ReLU you show is the sum of the outputs of the two paths. And note that backward propagation happens through both paths as well: as always, it's the mirror image of forward propagation. So the gradients at the point where the shortcut branches off will be the sum of the gradients from the two separate paths.
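If it helps to see it as code, here is a minimal NumPy sketch of the forward pass through one residual block (the function and variable names are my own, and I'm assuming an identity shortcut, i.e. a[l] already has the same shape as z[l+2]):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def residual_block_forward(a_prev, W1, b1, W2, b2):
    """Forward pass through a simple two-layer residual block (sketch).

    Main path: two linear layers, ReLU after the first.
    Shortcut: a_prev is carried over unchanged and added to z2
    just before the final ReLU.
    """
    # Main path, first layer: z[l+1] and a[l+1]
    z1 = W1 @ a_prev + b1
    a1 = relu(z1)

    # Main path, second layer: z[l+2] (activation not applied yet)
    z2 = W2 @ a1 + b2

    # Shortcut joins here: a[l+2] = g(z[l+2] + a[l])
    a_out = relu(z2 + a_prev)
    return a_out

# Toy usage with square weight matrices so shapes match
n = 4
a_prev = np.random.randn(n, 1)
W1, b1 = np.random.randn(n, n), np.zeros((n, 1))
W2, b2 = np.random.randn(n, n), np.zeros((n, 1))
out = residual_block_forward(a_prev, W1, b1, W2, b2)
```

Notice that during backprop the gradient reaching a_prev comes from both terms of the sum z2 + a_prev, which is exactly the "two parallel paths" idea above.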
With the above thoughts in mind, you should go back and listen again to what Prof Ng says about why this is a helpful and interesting approach.
That was really helpful; the word "parallel" made it click, and I think I understand it now. Thank you!
Great! But I really recommend that you listen again to what Prof Ng actually says with that extra idea in mind. He is a really excellent teacher and I’m sure he explained this in the lectures way better than I just did.
Oh for sure, I always rewatch the lectures 2 or even 3 times haha. He really is an excellent teacher. Once I finished the ML course I came straight to this one, and I couldn't be happier!