Hi there,
in addition to the excellent replies of my fellow mentors:
The training data set is just pure data. When training the model (= fitting its parameters), we optimize two things:
- the fit of the model to the training data, so that it reaches good performance on the training data
- and the complexity of the model, which we steer with regularization
Afterwards we are done with training, and there is nothing left to regularize at this point.
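As a minimal sketch of this training step, assuming a scikit-learn style workflow (the synthetic data and the regularization strength `alpha` are made-up illustrations, not values from this thread):

```python
# Minimal sketch: fitting a regularized linear model on training data.
# The data is synthetic and alpha=1.0 is an arbitrary example value.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))  # 100 samples, 5 features
y_train = X_train @ rng.normal(size=5) + rng.normal(scale=0.1, size=100)

# alpha steers model complexity: larger alpha = stronger regularization
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)  # parameters are fitted on training data only
```

Once `fit` has run, the regularization has already had its effect on the learned parameters; it plays no further role at evaluation time.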
Then we test how good the training was with respect to reality, i.e. new data the model has not seen before. To do this, we provide the model with an unseen test set and evaluate how well it performs on it.
Simplified, we can say:
- if the model performs clearly worse on the test set than on the training data, this indicates overfitting: the model complexity (which we steered during training with regularization) was potentially too high given the available data
- if the performance on the new test set is comparable to the performance on the training data, and it suits your business requirements, this is a good sign that regularization was effective: you prevented overfitting by keeping the model complexity at a level where the model can generalize well (and does not fit the noise too much)
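Continuing the sketch above, here is one way this train-vs-test comparison could look in code (the 20% test split and the score gap threshold are illustrative choices, not fixed rules):

```python
# Minimal sketch: hold back an unseen test set, then compare performance.
# Synthetic data; the threshold for "clearly worse" is just for illustration.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

# split off the test set before training, so it stays truly unseen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = Ridge(alpha=1.0).fit(X_train, y_train)

train_score = model.score(X_train, y_train)  # R^2 on seen data
test_score = model.score(X_test, y_test)     # R^2 on unseen data
print(f"train R^2: {train_score:.3f}, test R^2: {test_score:.3f}")

if train_score - test_score > 0.1:  # illustrative gap, not a universal rule
    print("clearly worse on test data -> possible overfitting")
else:
    print("comparable performance -> regularization likely effective")
```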
Here is more info, which also touches on the validation data set: https://community.deeplearning.ai/t/regular-math-s-vs-ml/250791/2
Hope that helps!
Best regards
Christian