On which set should Dropout be implemented?

Hansel_Gavin_Dias · August 12, 2022, 7:36am

On which set should Dropout be implemented: the training set, the dev set, or the test set? And why on that particular set?

Elemento · August 12, 2022, 1:08pm

Hey @Hansel_Gavin_Dias,
Welcome to the community. In the video titled “Dropout Regularization”, Prof. Andrew makes a few indirect references indicating that dropout is applied on the training set. For instance,

“And so for each training example, you would train it using one of these neural based networks.”
“So, what you do is you use the d vector, and you’ll notice that for different training examples, you zero out different hidden units.”
“And in fact, if you make multiple passes through the same training set, then on different passes through the training set, you should randomly zero out different hidden units.”

However, he doesn’t state it explicitly, which is perhaps what led you to ask this question. He also makes a reference to dropout and the test set:

“So what we’re going to do is not to use dropout at test time in particular which is …”

Now, let me add my two cents as well:

We apply dropout on the training set while the model is training, and we don’t use dropout on the cross-validation (dev) and test sets, where the model is only performing inference.
The reason for this lies in the purpose of dropout. As stated in the video, we use dropout to reduce over-fitting.
Now, over-fitting happens when the model is training, not when it is performing inference. Note that we only find out that the model is over-fitting after running inference on held-out data, but since the model is trained on the training set and not on the dev/test sets, it is the training set that the model over-fits.
Hence, it makes sense to use dropout on the training set, since that is the set on which over-fitting takes place, and not on the dev and/or test sets.
Additionally, when we make predictions on any set, be it the dev set or the test set, we want the predictions to be stable, i.e., we want the model to produce the same output for the same input. With dropout active, the randomly chosen units that get zeroed out would make the output vary from run to run.
During training, on the other hand, we are trying to reduce the dependence of neurons on one another, which is exactly what randomly dropping units does. That is another reason for using dropout on the training set and not on the dev/test sets.
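To make the training/inference distinction concrete, here is a minimal sketch of inverted dropout in plain Python. The helper name `dropout_forward` and its `training` flag are my own hypothetical choices for illustration, not code from the course assignment, and it works on a simple list of activations rather than a full network layer.

```python
import random


def dropout_forward(activations, keep_prob, training, rng=None):
    """Inverted dropout on a list of activations (hypothetical helper)."""
    if not training:
        # Inference on dev/test data: no units are dropped, no rescaling.
        return list(activations)
    rng = rng or random.Random()
    output = []
    for a in activations:
        if rng.random() < keep_prob:
            # Kept unit: scale up so the expected activation is unchanged.
            output.append(a / keep_prob)
        else:
            # Dropped unit.
            output.append(0.0)
    return output


hidden = [0.5, 1.0, 1.5, 2.0]

# Two training passes over the same example generally get different masks.
print(dropout_forward(hidden, keep_prob=0.5, training=True))
print(dropout_forward(hidden, keep_prob=0.5, training=True))

# Inference is deterministic: the same input always gives the same output.
print(dropout_forward(hidden, keep_prob=0.5, training=False))
```

The two training calls will usually print different lists because a fresh random mask is drawn each time, while the inference call always returns the activations unchanged.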

I hope this helps.

Cheers,
Elemento

paulinpaloalto · August 12, 2022, 2:53pm

Elemento did a great job of covering how this applies specifically to Dropout, but you can also make a more general statement than that: regularization by definition is only applied during training. That applies to all forms of regularization: L2, Lasso, Dropout and others. The point is that regularization modifies the training to produce a different model, which we hope, of course, is a more accurate model with less overfitting. Then you apply the resulting trained model to any other datasets you have (cross validation, test or “real” input data) to make predictions.
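The same idea can be sketched for L2 regularization. In the rough example below, the function names, the mean-squared-error loss, and the `lambd / (2 * m)` scaling of the penalty are assumptions chosen for illustration; the only point is that the penalty term shows up in the training cost and nowhere in the evaluation cost.

```python
def mean_squared_error(predictions, targets):
    """Plain data loss, with no regularization term."""
    m = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / m


def training_cost(predictions, targets, weights, lambd):
    """Cost minimized during training: data loss plus an L2 penalty."""
    m = len(targets)
    l2_penalty = (lambd / (2 * m)) * sum(w * w for w in weights)
    return mean_squared_error(predictions, targets) + l2_penalty


def evaluation_cost(predictions, targets):
    """Cost reported on dev/test data: the data loss only."""
    return mean_squared_error(predictions, targets)


predictions = [1.0, 2.0]
targets = [1.0, 4.0]
weights = [1.0, 2.0]

print(training_cost(predictions, targets, weights, lambd=0.4))
print(evaluation_cost(predictions, targets))
```

The penalty changes which weights the optimizer ends up with, but once training is done the model is evaluated on the other datasets without it.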

Elemento · August 13, 2022, 5:42am

I completely missed the general viewpoint in my answer and focused only on dropout. Thanks a lot for completing my answer, @paulinpaloalto Sir.

Cheers,
Elemento
