I have a question regarding the Week 1 - Regularisation - Early Stopping video, at around 4:30.
There is a chart showing that the training error decreases as we run additional iterations. I don't understand why the dev set error starts to increase. Actually, I may be misunderstanding the whole concept. I thought that:
1. I use the training set to train my network - in each iteration I update the W and b values.
2. After I train my model I have fixed W and b, and then I tweak hyperparameters. I'm not running multiple iterations with different W and b; I'm checking results for different hyperparameters while keeping W and b constant.
Think about what happens if the hyperparameter you are tweaking is the number of layers or the number of neurons: then you necessarily need to train again, since you are creating connections that weren't there before, or removing some of them.
If you change, for example, the learning rate, then you also need to train again, because this hyperparameter is tied to the training process - otherwise, what would be the point of changing the learning rate if you are not going to train with it?
And so on. The point I'm trying to make is that the hyperparameters change your model, so you need to train again to check whether it is improving or not.
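A minimal sketch of that idea (a made-up numpy toy model, not the course code): for every candidate value of a hyperparameter - here the learning rate - we re-train from scratch, and only then score the freshly trained W and b on the dev set.

```python
import numpy as np

def train_logreg(X, y, lr, n_iters=200, seed=0):
    """Re-train a tiny logistic-regression 'network' from scratch.

    Each call re-initialises w and b, so trying a new hyperparameter
    value (here the learning rate) means a full re-training run.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    for _ in range(n_iters):
        y_hat = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # forward prop
        grad = (y_hat - y) / len(y)                 # backward prop
        w -= lr * (X.T @ grad)
        b -= lr * grad.sum()
    return w, b

def error_rate(w, b, X, y):
    """Forward propagation only - no parameter updates happen here."""
    y_hat = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return np.mean((y_hat > 0.5) != y)

# Made-up toy data: the label is just the sign of the feature sum.
rng = np.random.default_rng(42)
X_train = rng.normal(size=(200, 2))
y_train = (X_train.sum(axis=1) > 0).astype(float)
X_dev = rng.normal(size=(100, 2))
y_dev = (X_dev.sum(axis=1) > 0).astype(float)

# For every candidate learning rate we train AGAIN from scratch,
# then compare the resulting models on the dev set.
dev_errors = {}
for lr in (0.01, 0.1, 1.0):
    w, b = train_logreg(X_train, y_train, lr)
    dev_errors[lr] = error_rate(w, b, X_dev, y_dev)
```

The dev set never changes the parameters; it is only used to compare the models that each re-training run produced.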
Regarding the dev error increasing: that is a sign that the model is overfitting. Your model is adapting too much to your training set, which leads to poor generalization, hence the bad behavior on the dev set.
Thanks for your answer. I still don't exactly understand why I should use a training set if I'm training networks from scratch for the dev set anyway. What data (parameters, settings, setups?) am I transferring from the training set to the dev set?
So I train on the train set, check on the dev set, change something, and then train on the train set again?
I still don't understand the chart with iterations. If I'm only evaluating my model on the dev set, I would feed my NN with inputs, run forward propagation, and check the error (Y hat vs. true Y). There would be no multiple iterations - just a single one.
My understanding is that a single iteration is forward propagation + back propagation. Multiple iterations are run in the training phase (each iteration updates my W and b).
So I don't understand iterations on the dev set, since we are not training on the dev set.
Yes - you can evaluate the dev set after each training iteration on the train set, to see how things are going. That way you have two measurements to compare, one for the train set and one for the dev set.
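That loop can be sketched like this (a made-up numpy toy model, not the course code): each training iteration does forward + backward prop on the train set, and then the dev set is run forward only, just to record its error.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xent(y, p):
    """Cross-entropy loss (probabilities clipped to avoid log(0))."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Made-up toy split: few examples, many noisy features, so the
# model has room to overfit the training set.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 10))
y_train = (X_train[:, 0] > 0).astype(float)  # only feature 0 matters
X_dev = rng.normal(size=(200, 10))
y_dev = (X_dev[:, 0] > 0).astype(float)

w = rng.normal(scale=0.01, size=10)
b = 0.0
lr = 0.5

train_curve, dev_curve = [], []
for it in range(500):
    # one iteration = forward prop + backward prop ON THE TRAIN SET
    y_hat = sigmoid(X_train @ w + b)
    grad = (y_hat - y_train) / len(y_train)
    w -= lr * (X_train.T @ grad)
    b -= lr * grad.sum()

    # the dev set is only ever run FORWARD - no parameter update here
    train_curve.append(xent(y_train, sigmoid(X_train @ w + b)))
    dev_curve.append(xent(y_dev, sigmoid(X_dev @ w + b)))
```

Plotting `train_curve` and `dev_curve` against the iteration index gives the kind of chart shown in the video: if the dev curve turns upward while the train curve keeps falling, that is the overfitting point where early stopping would halt training.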
Thanks @nqhoang, and how is this PSG thing going?