L2 regularization makes the weights small to reduce overfitting.
Dropout removes some neurons in the hidden layers. Why don't we use an L1 norm on the weights to do the dropout instead, similar to the Lasso, Ridge, and Elastic Net models? Any insights?
Thanks a lot.
Dropout doesn't always shut off the same neurons in a layer; it selects a few at random on each iteration.
If you use L1 instead, you'll permanently turn off a few connections from certain inputs over time, and there's no randomness involved.
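To make the randomness point concrete, here's a minimal sketch of inverted dropout (the common formulation) using NumPy. The drop probability `p` and the seed are arbitrary choices for illustration; a fresh mask is drawn on every forward pass, so different neurons get zeroed each time:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5  # drop probability (assumed value, typically 0.2-0.5)

activations = np.ones(8)  # toy hidden-layer activations

# A new random mask is sampled each iteration; surviving units are
# scaled by 1/(1-p) so the expected activation is unchanged (inverted dropout).
mask = (rng.random(activations.shape) >= p) / (1 - p)
dropped = activations * mask

# Each entry is either 0 (dropped) or 2.0 (kept and rescaled, since 1/(1-0.5) = 2)
print(dropped)
```

Sampling `mask` again on the next iteration would zero a different random subset, which is exactly what a fixed L1 penalty does not do.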
That said, there's nothing stopping you from specifying an L1 penalty on the weights/bias in addition to (or instead of) dropout.
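As a sketch of what that looks like, here is a toy linear model trained by subgradient descent with an L1 penalty added to the loss (a Lasso-style setup; the data, `lam`, and learning rate are all made up for illustration). Notice that the penalty drives the weights on irrelevant inputs toward zero deterministically, rather than dropping units at random:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem: only features 0 and 3 actually matter
X = rng.normal(size=(100, 5))
true_w = np.array([2.0, 0.0, 0.0, -1.0, 0.0])
y = X @ true_w + 0.1 * rng.normal(size=100)

lam = 0.1   # L1 penalty strength (assumed)
lr = 0.01   # learning rate (assumed)
w = np.zeros(5)

for _ in range(2000):
    # Gradient of MSE plus the subgradient of lam * ||w||_1
    grad = 2 * X.T @ (X @ w - y) / len(y) + lam * np.sign(w)
    w -= lr * grad

# Weights on the irrelevant features (1, 2, 4) are pushed toward zero,
# while the informative weights survive (slightly shrunk by the penalty).
print(np.round(w, 3))
```

This is the standard Lasso behavior the question alludes to: a fixed, data-driven sparsity pattern, as opposed to dropout's per-iteration random masking.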