Regularization of output layer in neural network


In the Week 3 assignment of C3, in the complex neural network model with regularization, the kernel regularizer is applied only to the first two layers, not the output layer. But the output layer also has w and b parameters, and its activation is linear. Is there a reason for this? Do we need to skip regularization for the output layer?

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model_r = Sequential([
    Dense(120, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.1), name="L1"),
    Dense(40, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.1), name="L2"),
    Dense(classes, activation='linear', name="L3")
], name="ComplexRegularized")

Hi @bhavanamalla great question!

Regularization is typically applied to the hidden layers, and we usually skip it for the output layer. However, since machine learning is an iterative process, you could experiment with adding regularization to the output layer and see whether that works better for your dataset.
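If you want to try that experiment, a minimal sketch might look like the following. (The `classes` value and input dimension here are hypothetical placeholders, not taken from the assignment.)

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

classes = 6  # hypothetical number of output classes

# Same architecture as before, but with an L2 penalty on the output layer too
model_out_reg = Sequential([
    Dense(120, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.1), name="L1"),
    Dense(40, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.1), name="L2"),
    Dense(classes, activation='linear', kernel_regularizer=tf.keras.regularizers.l2(0.1), name="L3"),
], name="ComplexOutputRegularized")

model_out_reg.build(input_shape=(None, 2))  # hypothetical input dimension

# Keras adds one regularization loss term per regularized layer,
# so model.losses now contains three entries instead of two.
print(len(model_out_reg.losses))
```

You can then train this variant alongside the original and compare validation performance.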

I hope this helps!

Hi @pastorsoto ,

Thanks for your reply.

When I add regularization to the output layer for this particular assignment (the complex model with regularization), performance degrades significantly compared to the version without regularization on the last layer. I am wondering why that is the case.

Regularization is a way to penalize the model, but if you penalize it too much the model will underfit, producing worse performance because it can no longer learn the data properly.
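To see why, recall that the L2 penalty λ·Σw² is simply added to the data loss, so regularizing more layers (or using a larger λ) makes the penalty a bigger share of the total objective, pushing the weights toward zero. A small illustrative sketch with hypothetical numbers (not from the assignment):

```python
# Illustrative only: how the L2 penalty scales relative to a fixed data loss
weights = [0.5, -1.2, 0.8, 2.0]   # hypothetical weight values
data_loss = 0.3                   # hypothetical cross-entropy loss

def total_loss(lam):
    # L2 penalty: lam * sum of squared weights, added to the data loss
    l2_penalty = lam * sum(w * w for w in weights)
    return data_loss + l2_penalty

print(total_loss(0.01))  # small lambda: the penalty is a minor correction
print(total_loss(1.0))   # large lambda: the penalty dwarfs the data loss
```

With a large effective penalty, the optimizer spends its effort shrinking weights rather than fitting the data, which is exactly the underfitting you observed.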

I hope this helps