Why don't we include a regularization term in the training dataset?

Hi there,

in addition to the excellent replies of my fellow mentors:

The training data set is just pure data. When training the model (=fitting parameters), we optimize:

  • the fit of the model to the training data, so that it performs well on that data
  • and the complexity of the model, which we steer with regularization
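
To make the two bullets above concrete, here is a minimal sketch, assuming a linear model with squared error and an L2 penalty (my choice for illustration, other models and penalties work the same way). The function names and the synthetic data are made up. The penalty only appears in the objective that is minimized during training; when we measure performance we use the plain error.

```python
import numpy as np

def mse(X, y, w):
    """Plain mean squared error: measures the fit only, no penalty."""
    residual = X @ w - y
    return float(np.mean(residual ** 2))

def training_objective(X, y, w, lam):
    """What is minimized during training: fit term plus L2 penalty."""
    return mse(X, y, w) + lam * float(np.sum(w ** 2))

def fit_ridge(X, y, lam):
    """Closed-form minimizer of training_objective (no bias term, for brevity)."""
    m, n = X.shape
    return np.linalg.solve(X.T @ X + lam * m * np.eye(n), X.T @ y)

# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_train = rng.normal(size=(50, 3))
y_train = X_train @ w_true + 0.1 * rng.normal(size=50)
X_test = rng.normal(size=(50, 3))
y_test = X_test @ w_true + 0.1 * rng.normal(size=50)

lam = 0.1
w = fit_ridge(X_train, y_train, lam)   # regularization is used here ...

train_error = mse(X_train, y_train, w)  # ... but not here
test_error = mse(X_test, y_test, w)     # ... and not here
print(f"train MSE: {train_error:.4f}, test MSE: {test_error:.4f}")
```

Note that the data sets themselves never contain a regularization term: `lam` belongs to the training procedure, not to `X_train` or `X_test`.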

After that, training is done: the parameters are fixed, so there is nothing left to regularise at this point.

Then we just test how well the training worked with respect to reality, i.e. on new data which the model did not see before. To do so, we provide the model with an unseen test set and evaluate how well it performs on it.

So, simplified, we can say:

  • if the model performs clearly worse on the test set than on the training data, this indicates overfitting: the model complexity (which we were steering with regularization during training) was potentially too high given the available data

  • if the performance on the new test set is comparable to the performance on the training data, and this meets your business requirements, it is a good sign that regularization was effective: you prevented overfitting by keeping the model complexity at a level where the model can generalise well (and does not fit noise too much)
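
The two cases above can be tried out with a small experiment. This is only a sketch under my own assumptions: a degree-9 polynomial fitted to 15 noisy points of a sine curve, ridge (L2) regularization, and arbitrarily chosen values for the strength `lam`. All names are hypothetical, and the exact numbers depend on the random seed.

```python
import numpy as np

def poly_features(x, degree):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit_ridge(X, y, lam):
    """Minimize ||Xw - y||^2 + lam * ||w||^2 via an augmented least-squares problem.

    For simplicity the constant term is penalized as well.
    """
    n = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(n)])
    y_aug = np.concatenate([y, np.zeros(n)])
    w, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return w

def mse(X, y, w):
    """Plain mean squared error, used for evaluation only."""
    return float(np.mean((X @ w - y) ** 2))

# Small noisy training set, larger unseen test set (synthetic, illustrative).
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, 15)
x_test = rng.uniform(-1, 1, 200)
y_train = np.sin(3 * x_train) + 0.2 * rng.normal(size=x_train.size)
y_test = np.sin(3 * x_test) + 0.2 * rng.normal(size=x_test.size)

degree = 9
X_train = poly_features(x_train, degree)
X_test = poly_features(x_test, degree)

results = {}
for lam in (0.0, 1e-3, 1e-1):
    w = fit_ridge(X_train, y_train, lam)
    results[lam] = (mse(X_train, y_train, w), mse(X_test, y_test, w))
    print(f"lam={lam:g}: train MSE={results[lam][0]:.4f}, test MSE={results[lam][1]:.4f}")
```

A test error far above the training error corresponds to the first case, comparable errors to the second. Strictly speaking, `lam` should be chosen on a separate validation set rather than on the test set.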

Here is more info, which also touches upon the validation data set: https://community.deeplearning.ai/t/regular-math-s-vs-ml/250791/2

Hope that helps!

Best regards
Christian
