It can happen, but it probably means the validation data does not represent the dataset well enough. If you use cross-validation with folds, it might give you a better overview.
Folds means: divide the dataset into, say, 5 sets (folds), and each time use one fold for validation and the other 4 for training. We run 5 rounds of training and validation, so the model ends up being validated on the entire dataset and we get a better overview of its performance. Check Cross-validation
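The 5-fold procedure described above can be sketched with scikit-learn (the toy dataset and logistic regression model here are just illustrative placeholders):

```python
# Minimal sketch of 5-fold cross-validation with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Toy dataset standing in for your real data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Each of the 5 rounds trains on 4 folds and validates on the held-out fold,
# so every sample is used for validation exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(scores)          # one validation score per fold
print(scores.mean())   # averaged performance estimate
```

Looking at the spread of the per-fold scores (not just the mean) is also a quick way to see whether one particular split happens to be "easier" than the others.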
This is a way to do the split between training and cv data that ensures they are from the same statistical distribution. In the basic case you posit here, the most likely explanation is that the cv data is somehow “easier” than the training data. There are lots of complex scenarios where this can happen. Prof Ng spends quite a bit of time on this sort of issue in DLS Course 3. If you haven’t been through that yet, it is really worth your time. Or if you haven’t looked at it in a while, scan through the titles of the lectures and you’ll see some that should sound like they address this type of issue.
Note that the specific technique of folds that Gent refers to is not directly covered anywhere in DLS, at least not under that name. I first heard the term just in the last week on a discussion thread here, which is worth a look: K-fold cross validation - #2 by paulinpaloalto
Hi Paulin, your advice is always superb, no doubt. It's been a while since I took DLS and I will check it as well. There should definitely be better techniques than cross-validation with folds, since it is computationally expensive and suitable only for small datasets; it just came to mind because I have been reading about it recently. I was going through MLOps Course 2 some time ago, and they use TensorFlow Data Validation to build a schema from the training data, which is then compared to the validation data schema. If the schemas are not similar, then work is done on the features (all sorts), or maybe another shuffled partition is taken, checked, and worked on again, so that ultimately the schemas are similar before progressing further with training.
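The core idea behind that schema comparison (checking that the training and validation splits look statistically similar before training) can be illustrated with a hand-rolled check. Note this is only a rough stand-in for what TFDV does: TFDV builds a full schema and anomaly report, whereas here we just compare per-feature means; the tolerance, data, and `drifted_features` helper are all made up for the example:

```python
# Rough sketch of distribution checking between train/validation splits:
# flag features whose means differ by more than a tolerance.
# (Illustrative only -- TFDV's real schema comparison is far richer.)
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))
valid = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
valid[:, 2] += 2.0  # deliberately shift one feature to simulate skew

def drifted_features(train, valid, tol=0.5):
    """Return indices of features whose means differ by more than tol."""
    diff = np.abs(train.mean(axis=0) - valid.mean(axis=0))
    return [i for i, d in enumerate(diff) if d > tol]

print(drifted_features(train, valid))  # the shifted feature gets flagged
```

If a check like this fires, the remedy is exactly what you describe: fix the features, or reshuffle and re-partition until the splits agree.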