How to Diagnose Overfitting in Complex Models Like U-Net?

I understand how to detect overfitting in simple models such as linear regression — for example, by training models with different polynomial degrees and observing the bias–variance tradeoff.

However, I’m confused about how to apply similar ideas to large and complex models, especially U-Net, and I would like to ask:

  1. Since the architecture of U-Net is already fixed, how can I systematically evaluate bias/variance or diagnose overfitting?
    In linear models, I can easily control model complexity by changing polynomial degrees, but U-Net does not seem to have such straightforward “complexity knobs.”

  2. Training a U-Net takes several days.
    If I must rely on training many different model variants to detect overfitting, it could take months or even years.
    Is there a more practical way to monitor or estimate overfitting during training without repeatedly training many full models?

Any advice on how to properly evaluate or detect overfitting in deep architectures like U-Net would be greatly appreciated.

hi @Che-Wei-Hsiang

More importantly, one needs to understand the training and testing data they are working with in order to check for overfitting and underfitting of any AI model, be it a small model or a larger one.

With a smaller model it can be easier to detect the issue, since you can review the data with a linear approach. With larger datasets like the one you mentioned for training a U-Net, you need to check that the data you are working with is not highly imbalanced: in a classification model, if one class appears much more often than another, the model tends to adapt to the over-represented class.

This can also happen if the model was trained on highly imbalanced data: if the validation data contains more of one class than the other, but that class was under-represented in the training data, the model will show poor accuracy on it.
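A quick sanity check along these lines is to compare the class distribution of the training and validation splits before training. This is a minimal sketch (the function name `class_fractions` is mine, not from any library):

```python
import numpy as np

def class_fractions(labels):
    """Fraction of examples (or pixels, for segmentation) per class.
    Run this on both the training and the validation split and compare."""
    values, counts = np.unique(labels, return_counts=True)
    return dict(zip(values.tolist(), (counts / counts.sum()).tolist()))

# Toy labels: 90% class 0, 10% class 1 - a clear imbalance.
train_labels = np.array([0] * 90 + [1] * 10)
print(class_fractions(train_labels))  # → {0: 0.9, 1: 0.1}
```

If the two splits disagree badly, the validation cost reflects the mismatch rather than genuine overfitting.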

Here is a link

Feel free to ask if you have any questions.

Hello, Tommy @Che-Wei-Hsiang,

As this lecture suggests, we can diagnose bias/variance with training cost and CV cost. This diagnosis is independent of model size.
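The rule of thumb from the lecture can be written down directly: compare the training cost against a baseline (e.g. human-level error) for bias, and the CV cost against the training cost for variance. This is a toy sketch of that decision rule (the function and thresholds are my own illustration, not from the course):

```python
def diagnose(train_cost, cv_cost, baseline=0.0, gap_tol=0.1):
    """Rough bias/variance diagnosis from training and CV costs.
    High bias: training cost far above the baseline.
    High variance (overfitting): CV cost far above training cost."""
    high_bias = (train_cost - baseline) > gap_tol
    high_variance = (cv_cost - train_cost) > gap_tol
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "looks fine"

# Low training cost but a large train/CV gap - classic overfitting:
print(diagnose(train_cost=0.05, cv_cost=0.40))  # → high variance (overfitting)
```

Note that this needs only one trained model's two cost curves, which is exactly why the diagnosis is independent of model size.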

“Changing polynomial degree” is more about how to deal with overfitting than about diagnosing it. For addressing overfitting, the slide below (from this lecture) lists two more approaches that we can apply without changing the model’s trainable parameters. “More training examples”, for example, requires understanding of your data, as Deepti emphasized, which is, apart from the MLS, extensively discussed in the Deep Learning Specialization (DLS) Course 3. Data augmentation, a strategy for getting more examples, is introduced in the DLS Course 4 Week 2 lecture “Data augmentation”.
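For a segmentation model like U-Net, the one extra care in data augmentation is that the mask must receive exactly the same spatial transform as the image. A minimal NumPy sketch (random flips only; the helper name is mine):

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Randomly flip an image and its segmentation mask together.
    For U-Net style tasks the mask must get the same transform as the image."""
    if rng.random() < 0.5:  # horizontal flip
        image, mask = np.flip(image, axis=1), np.flip(mask, axis=1)
    if rng.random() < 0.5:  # vertical flip
        image, mask = np.flip(image, axis=0), np.flip(mask, axis=0)
    return image, mask

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)
msk = (img > 7).astype(int)           # toy mask derived from the image
aug_img, aug_msk = augment_pair(img, msk, rng)
```

In practice you would use a library transform pipeline (e.g. rotations, elastic deformations) rather than hand-rolled flips, but the image/mask pairing constraint is the same.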

Other applicable regularization approaches include Dropout and Batch Normalization, but they are not covered in the MLS. If you are interested, search for them online or check out the videos in the first half of Week 1 of DLS Course 2. Batch Normalization is introduced in Course 2 Week 3, but I forget whether it was discussed in the context of regularization.
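To make the Dropout idea concrete, here is a toy NumPy implementation of "inverted dropout" - the variant taught in DLS Course 2 - where activations are randomly zeroed during training and rescaled so the expected activation is unchanged (so nothing needs to change at test time). This is an illustration, not production code:

```python
import numpy as np

def inverted_dropout(activations, keep_prob, rng):
    """Zero each unit with probability (1 - keep_prob), then divide by
    keep_prob so the expected value of the activations is unchanged."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(42)
a = np.ones(1000)
dropped = inverted_dropout(a, keep_prob=0.8, rng=rng)
# Roughly 20% of units are zeroed; the mean stays near 1.0 in expectation.
```

In a real U-Net you would simply insert the framework's dropout layer (e.g. between convolution blocks) rather than writing this by hand.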

None of the DLS material mentioned here requires you to first finish earlier DLS material. You can start right away if you want, and the DLAI Learning Platform should offer access to all those DLS lecture videos for everyone, even without a paid subscription.

If you search, you will find more regularization methods. The methods mentioned here won’t require you to change the part of the architecture that holds trainable parameters. L2 regularization, for example, essentially only adds some terms to the cost function; in practice you might need to attach it to the trainable layers, but the architecture itself remains unchanged.
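The "only adds some terms to the cost function" point can be shown in a few lines. With regularization strength $\lambda$ and $m$ training examples, the L2 penalty is $\frac{\lambda}{2m}\sum_l \|W^{[l]}\|^2$ added to the unregularized cost (a sketch; the function name is mine):

```python
import numpy as np

def l2_regularized_cost(base_cost, weight_matrices, lam, m):
    """Add an L2 penalty to an existing cost. The architecture is untouched;
    only the cost function (and hence the gradients) change."""
    penalty = sum(np.sum(W ** 2) for W in weight_matrices)
    return base_cost + (lam / (2 * m)) * penalty

# One 2x2 weight matrix of ones: penalty = 4, scaled by 0.1 / (2*10) = 0.005.
cost = l2_regularized_cost(1.0, [np.ones((2, 2))], lam=0.1, m=10)  # → 1.02
```

Deep learning frameworks expose the same idea as "weight decay" or a per-layer kernel regularizer, which is what I meant by attaching it to the trainable layers.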

As the MLS lecture explained, we compare a model’s training and CV costs and can tell whether it is overfitting. In other words, we don’t need many models to “detect” overfitting.

However, after you try to address overfitting, you will inevitably have to train again, which, I believe, was the heart of your question - not detection, but the resources that go into the trial-and-error process. For this, you might consider starting from a well-trained U-Net model and freezing all but the final layers - this means far fewer trainable parameters and much less training time. It also helps fight overfitting, precisely because you reduce the number of trainable parameters.

Besides, you might save model checkpoints (say, every epoch, or with early stopping) and restart from a good checkpoint instead of from scratch. Also, a mismatch between the training and CV data distributions can cause apparent overfitting, and you can probe for such a mismatch by training a model on a random subset of the training set - a smaller training set means less training time. I hope these give you at least some ideas.
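The checkpoint-plus-early-stopping idea can be sketched as a small decision rule over the per-epoch CV costs: keep the checkpoint with the best CV cost, and stop once it hasn't improved for a few epochs. A toy sketch (function name and `patience` default are mine; frameworks provide this as callbacks such as early stopping and model checkpointing):

```python
def best_checkpoint(cv_costs, patience=3):
    """Early stopping on CV cost. Returns (best_epoch, stop_epoch):
    best_epoch is the checkpoint you would restore; training halts at
    stop_epoch after `patience` epochs without improvement."""
    best_epoch, best_cost = 0, float("inf")
    for epoch, cost in enumerate(cv_costs):
        if cost < best_cost:
            best_epoch, best_cost = epoch, cost
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(cv_costs) - 1

# CV cost falls, then rises as the model starts to overfit:
print(best_checkpoint([0.9, 0.6, 0.5, 0.55, 0.6, 0.7]))  # → (2, 5)
```

For a multi-day U-Net run, restoring from `best_epoch` instead of retraining from scratch is exactly the time saving described above.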

If you have time, spending it on addressing overfitting gives you precious model-training experience, so I see it as a gain. In some sense, I think model training is partially an art.

Cheers,
Raymond