Confused about systematic approach to regularization

Hi

I’m a little confused about the approach to regularization. Its purpose is pretty clear: reduce variance.

The confusion arises from a couple of hints/statements Dr. Andrew makes in the week 1 videos that do not seem to agree with the conclusion of assignment 2 of week 1.

As per my understanding (I might have misunderstood), Dr. Andrew makes the following points in the week 1 videos:
1- In the early days of machine learning, practitioners needed to be aware of the bias/variance tradeoff: trying to address one used to affect the other. However, this is less of a concern in the deep learning era.

2- Applying regularization and going deeper/bigger should not significantly affect the bias of the model.

3- Dr. Andrew proposes tackling bias and variance separately: his (personal) approach is to fix the bias issue first and then move on to fix the variance issue. He also finds early stopping a bit problematic when applying this approach, since it couples the two and you can no longer work on bias and variance independently. (I sketch my understanding of this recipe below.)
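
To make sure I haven’t misread point 3, here is a minimal sketch of how I currently picture that recipe. This is my own illustration, not code from the course; `train_and_evaluate()` and the config keys are hypothetical placeholders:

```python
# My own rough sketch (not from the course) of the "fix bias first,
# then variance" recipe. train_and_evaluate() is a hypothetical helper
# that trains a model for the given config and returns train/dev accuracy.

def basic_recipe(config, target_train_acc=0.95, max_rounds=10):
    for _ in range(max_rounds):
        train_acc, dev_acc = train_and_evaluate(config)  # hypothetical helper

        # High bias: training performance itself is not good enough yet,
        # so make the network bigger / train longer before touching regularization.
        if train_acc < target_train_acc:
            config["hidden_units"] *= 2
            config["num_iterations"] += 10000
            continue

        # High variance: training is fine but dev accuracy lags well behind,
        # so tighten regularization (or get more data).
        if train_acc - dev_acc > 0.05:
            config["lambd"] *= 3
            config["keep_prob"] = max(0.5, config["keep_prob"] - 0.1)
            continue

        break  # both bias and variance look acceptable

    return config
```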

However, when completing assignment 2 of week 1, we notice in the conclusion that regularization does affect the bias: the train accuracy goes down a little (to 93% from 95%).

Looking at that conclusion and the statements/hints provided by Dr. Andrew, how should we approach regularization?
1- Should I expect train performance to take a hit, just not a significant one?
2- If it takes a big hit, should I go deeper, or reduce the regularization, to improve the performance?
3- Is there a systematic way to know whether we need a bigger/deeper network or to loosen/tighten the regularization factors (lambda / keep_prob)? For example, would something like the rough sweep below be a reasonable way to decide?
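
Just to make question 3 concrete, this is roughly what I have in mind by "systematic". It is only an illustration; `model()` and `predict_accuracy()` are hypothetical stand-ins, not the exact assignment API:

```python
# Sweep lambda, watch the train/dev gap, and only consider a bigger network
# when the train accuracy itself is too low. model() and predict_accuracy()
# are hypothetical stand-ins for the assignment's training/prediction code.

lambdas = [0.0, 0.01, 0.03, 0.1, 0.3, 0.7]
for lambd in lambdas:
    parameters = model(train_X, train_Y, lambd=lambd)
    train_acc = predict_accuracy(train_X, train_Y, parameters)
    dev_acc = predict_accuracy(test_X, test_Y, parameters)
    print(f"lambda={lambd:<5} train={train_acc:.2%} dev={dev_acc:.2%} "
          f"gap={train_acc - dev_acc:.2%}")
```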

Thank you

Hi Mousa,

Welcome to the community.

Here’s a [thread](Week1 assignment2 ( L2 regularization ) - #2 by Elemento) that can help you out with your query. Thanks!