High Bias, Variance


I have a question: does a deeper network help with reducing variance? I built a shallow network (4 layers) and compared it against transfer learning with MobileNetV2 on the same dataset. The MobileNetV2 model showed lower variance. Can somebody explain this?


A deeper network can lead to higher variance. However, this can be mitigated if you employ an appropriate regularization technique, such as L2 or dropout.
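To make the two techniques concrete, here is a minimal NumPy sketch of inverted dropout and an L2 cost penalty applied to toy data. The shapes, `keep_prob`, and `lam` values are hypothetical, chosen only for illustration; in practice you would use your framework's built-in layers and regularizers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations from one hidden layer: shape (batch, units)
a = rng.standard_normal((4, 8))

# --- Inverted dropout ---
keep_prob = 0.8  # hypothetical keep probability
mask = (rng.random(a.shape) < keep_prob).astype(a.dtype)
# Zero out dropped units, then scale up so the expected
# activation magnitude is unchanged at test time.
a_dropped = a * mask / keep_prob

# --- L2 penalty added to the cost ---
lam = 0.01                       # hypothetical regularization strength
m = a.shape[0]                   # batch size
W = rng.standard_normal((8, 3))  # toy weight matrix
l2_penalty = (lam / (2 * m)) * np.sum(W ** 2)

print(a_dropped.shape, float(l2_penalty) >= 0.0)
```

Both methods discourage the network from relying too heavily on any single weight or unit, which is why they help a deep network keep its variance in check.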

MobileNetV2 also has better network settings than a shallow, hand-built network.


What do you mean by "better network settings"?
Also, does this mean that a deeper network generalizes more poorly, and that to solve a high-variance problem I should use a shallower network?

By "better network settings," I mean an optimal choice of hyperparameters for the network (number of layers, kernel size, learning rate, regularization, and so on) that helps it generalize well.

I suggest you rewatch the Week 1 lecture videos of DLS Course 2, which cover bias, variance, and regularization.