ResNet is definitely an interesting concept, and I was reading up more about it when I came across this post. That information loss grows towards the deeper layers is fairly intuitive to understand. So the deeper layers are contributing a “residual” (delta) value on top of the more “robust” values from the earlier layers.
But then the question is: does it even make sense to build very deep networks when the deeper layers contribute only residual amounts? In general, are there ways to figure out the optimal depth? I am on the Week 2 ResNet videos; perhaps this is covered in one of the later videos?
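To check my own understanding, here is a toy sketch (entirely my own, not code from the course) of what I think a residual block computes. The shapes, weights, and the `residual_block` helper are made up for illustration; the point is just that the block adds a learned delta back onto the skip-connected input, so if the delta is near zero the block behaves like the identity.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def residual_block(a_l, W1, b1, W2, b2):
    """Toy two-layer residual block (my own sketch, not the course's code).
    The block learns a delta F(a_l) that is added onto the skip input a_l."""
    a1 = relu(W1 @ a_l + b1)
    z2 = W2 @ a1 + b2
    # Skip connection: output = g(F(a_l) + a_l).
    # If W2 and b2 are near zero, F(a_l) ~ 0 and the block is ~identity,
    # i.e. the deeper layers only add a small "residual" on top of the
    # earlier activations.
    return relu(z2 + a_l)

# Purely illustrative usage with made-up shapes
rng = np.random.default_rng(0)
a_l = rng.standard_normal(4)
W1, b1 = rng.standard_normal((4, 4)) * 0.01, np.zeros(4)
W2, b2 = rng.standard_normal((4, 4)) * 0.01, np.zeros(4)
print(residual_block(a_l, W1, b1, W2, b2))
```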
These are not easy questions with crisp answers, which is probably why no one answered back when you originally asked them.
I don’t have definitive answers, but here are some thoughts:
I am not an expert in the field; all I know is what I have heard Prof Ng say in the lectures across the various courses here. He does not give a general method for determining the number of layers, and if there were such a method he would probably have mentioned it. Generally speaking, he says to start with an architecture that worked well on a problem as similar as you can find to the one you are trying to solve. Then you use the evaluation methods he describes in Course 2 and Course 3 to decide how to improve performance, which might mean changing the number, sizes, and types of the various layers.
But those residual amounts are not zero, right? The proof is in the pudding: do the deeper networks work better or don’t they? If they didn’t work better in at least some cases, people wouldn’t use them. The point of Residual Nets is that the skip connections have a moderating effect on training and allow you to successfully train deeper networks than you otherwise could. That’s what Prof Ng says in the lectures, as I recall.
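If it helps to see that idea in code, here is a rough Keras-style sketch of an identity block (my own simplification, not the exact code from the Week 2 assignment). The filter count and input shape are arbitrary; the key line is the `Add` that carries the shortcut around the conv layers, so each block can fall back towards the identity if its extra layers don’t help, which is what makes very deep stacks trainable.

```python
import tensorflow as tf
from tensorflow.keras import layers

def identity_block(x, filters):
    """Sketch of a ResNet-style identity block: two conv layers plus a
    shortcut that adds the block's input back onto its output."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([y, shortcut])  # the skip connection
    return layers.Activation("relu")(y)

# Minimal usage: stack a few blocks on a toy input (shapes are made up)
inputs = tf.keras.Input(shape=(32, 32, 16))
x = inputs
for _ in range(3):
    x = identity_block(x, 16)
model = tf.keras.Model(inputs, x)
model.summary()
```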