To my understanding, the deeper a neural network is, the more complex the patterns it can find in data. If this presupposition is true, then I assume that if we have high variance in the results, meaning we have overfitted the model to the data, reducing the depth of the NN can make the model less complex, more general, and consequently solve the problem of high variance. Is this conclusion true, and if not, why?

Hi, @M_Mahdi_Gheysari!

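Yes, you are right (in general). Nonetheless, making the network shallower also makes it simpler, so it will not necessarily generalize better: if you remove too many layers, it may no longer be complex enough to fit the data, and you trade high variance for high bias (underfitting).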
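
A practical way to see which side of this trade-off you are on is to compare the training and validation error for each depth you try. Below is a minimal sketch in Python of that check. The function names `diagnose` and `pick_depth`, the thresholds, and the error values in the usage lines are all hypothetical and only for illustration; suitable thresholds depend on your problem.

```python
def diagnose(train_error, val_error, acceptable_error, gap_tolerance):
    """Classify a fit from its training and validation errors.

    `acceptable_error` and `gap_tolerance` are hypothetical,
    problem-specific thresholds that you choose yourself.
    """
    if train_error > acceptable_error:
        # The network cannot even fit the training set: it is too simple.
        return "high bias (underfitting)"
    if val_error - train_error > gap_tolerance:
        # Good on training data, poor on unseen data: it is too complex.
        return "high variance (overfitting)"
    return "reasonable fit"


def pick_depth(results):
    """Return the depth with the lowest validation error.

    `results` maps depth -> (train_error, val_error).
    """
    return min(results, key=lambda depth: results[depth][1])


# Made-up numbers for illustration only, not real measurements.
results = {1: (0.15, 0.17), 3: (0.04, 0.06), 8: (0.02, 0.20)}
for depth, (train_error, val_error) in results.items():
    print(depth, diagnose(train_error, val_error, 0.05, 0.05))
print("chosen depth:", pick_depth(results))
```

If the deep network shows a large gap between training and validation error, reducing depth is worth trying. If the shallower network's training error then rises above what you consider acceptable, you have gone too far and are underfitting.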