When should we resort to deep ResNets?

When dealing with image classification problems, when should we resort to deep residual networks rather than, say, a plain sequential network with 10-20 layers? Does it depend on the number of categories (more categories, deeper nets)? On the difficulty of the problem (categories that are similar and therefore hard to distinguish)? Examples would be appreciated…

Hi EduardoChicago,

It’s a performance issue: beyond a certain depth, plain stacked networks become harder to optimize and accuracy degrades, while residual shortcuts keep very deep networks trainable. See, e.g., Understand Deep Residual Networks — a simple, modular learning framework that has redefined state-of-the-art | by Michael Dietz | Medium
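
Roughly, each residual block learns a correction F(x) on top of an identity shortcut, so an extra block can always fall back to doing nothing. A minimal sketch of such an identity block in Keras (illustrative code, not the Assignment’s exact implementation):

```python
from tensorflow.keras import layers

def identity_block(x, filters):
    """Sketch of a ResNet identity block: out = ReLU(F(x) + x).
    Assumes x already has `filters` channels so the addition shapes match."""
    shortcut = x                                    # skip connection, carries x unchanged
    x = layers.Conv2D(filters, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(filters, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, shortcut])                 # residual addition F(x) + x
    return layers.ReLU()(x)
```

If the extra block isn’t needed, it only has to drive F toward zero to approximate the identity, which is much easier to learn than an identity through a stack of nonlinear layers.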


Thanks, Reinoud. Here is a more precise formulation, and an example, of basically the same question. I just implemented the ResNet-50 model in the Assignment, and it looks good. I want to train it on a “local” set of images, for a different classification task (the details don’t matter at this point). The total training set I have consists of about 18 million pixels, grayscale, byte resolution (0-255). Given that ResNet-50 has 23,546,886 trainable parameters, a bit more than the total number of pixels in the training set, doesn’t that already point to overfitting? (Overparametrization, as we would say in modeling.)
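For what it’s worth, here is roughly how I compared the two numbers, with the stock Keras ResNet50 application standing in for the Assignment’s model and illustrative dataset sizes standing in for my real ones:

```python
import numpy as np
from tensorflow.keras.applications import ResNet50

# Stock ResNet50, randomly initialized, with a head for a hypothetical
# 6-class problem; grayscale images would be stacked to 3 channels to
# fit this input shape. All sizes here are illustrative.
model = ResNet50(weights=None, input_shape=(64, 64, 3), classes=6)
trainable = int(sum(np.prod(w.shape) for w in model.trainable_weights))

# e.g. 4500 grayscale images at 64x64 is about 18.4 million pixels
n_images, h, w = 4500, 64, 64
print(f"pixels: {n_images * h * w:,}  trainable params: {trainable:,}")
```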
Thanks for your help

Hi EduardoChicago,

Here is a post with some ideas about why deep nets do not overfit in such cases:

As you will see, the dust has not fully settled on this issue.


I read the article, superficially, and learnt a lot (if I accept all that is written). The “intrinsic dimension” of a model, and the fact that it is so much smaller than the number of parameters, is one of the most intriguing ideas. It is counterintuitive, but I did not spend enough time with the demonstration to be fully convinced.
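Out of curiosity, here is a toy sketch of the subspace-training trick behind those intrinsic-dimension measurements, as I understand it: freeze a random initialization theta0 and a random projection P, and train only a small d-dimensional vector z, so the full weights are theta = theta0 + P z. The model, dimensions, and data below are all made up for illustration:

```python
import tensorflow as tf

D, d = 7850, 100                       # full vs "intrinsic" parameter count
theta0 = tf.random.normal([D]) * 0.01  # frozen random initialization
P = tf.random.normal([D, d]) / d**0.5  # frozen random projection
z = tf.Variable(tf.zeros([d]))         # the only trainable parameters

def logits(x):
    theta = theta0 + tf.linalg.matvec(P, z)  # reconstruct the full weights
    W = tf.reshape(theta[:7840], [784, 10])  # one linear layer, 784 -> 10
    b = theta[7840:]
    return tf.matmul(x, W) + b

# Fake 10-class batch, just to show the mechanics
x = tf.random.normal([32, 784])
y = tf.random.uniform([32], maxval=10, dtype=tf.int32)

opt = tf.keras.optimizers.Adam(1e-2)
for _ in range(200):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=y, logits=logits(x)))
    opt.apply_gradients([(tape.gradient(loss, z), z)])
print(f"toy-batch loss after training {d} of {D} dims: {loss.numpy():.3f}")
```

If a loss comparable to full training is reachable with d much smaller than D, the task evidently needs far fewer degrees of freedom than the raw parameter count suggests, which, as I read it, is the article’s point.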
Thanks a lot for the tip.