Course 2 Week 3 Removing the Bias term in Batch Norm


I hold the belief that removing the bias term b1 directly from the linear regression model y=a1x+b1 isn’t equivalent to training the model y=a2x. The distinction arises because a1 and a2 can be unequal (which is often the case). My question revolves around the differences that emerge when we remove the bias before the normalization step. Wouldn’t there be a potential bias in the weights if we eliminate the bias term directly, rather than training with the bias term and subsequently excluding it during the normalization step?


Hi @Overowser ,

I think your intuition is correct. I would start by training the model with the bias term, and after training I would inspect the learned bias. If it is very close to zero, I would probably remove it from the model. In other words: experiment with the model both with and without the bias. After training you can read off the bias like so:

from sklearn.linear_model import LinearRegression

# Fit the model (X_train and y_train are your training features and targets)
model = LinearRegression().fit(X_train, y_train)

# Get the bias term (intercept)
b = model.intercept_
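To see why the slope changes when the bias is dropped before training, here is a minimal sketch in plain Python using the closed-form least-squares solutions. The data (generated from y = 2x + 5) and the variable names are made up for illustration only; in scikit-learn the no-bias fit would correspond to LinearRegression(fit_intercept=False).

```python
# Synthetic data from y = 2x + 5 (values chosen only for illustration)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 5.0 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares WITH a bias: y = a1 * x + b1
a1 = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
b1 = mean_y - a1 * mean_x

# Least squares WITHOUT a bias: y = a2 * x
a2 = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print("with bias:    a1 =", a1, " b1 =", b1)
print("without bias: a2 =", a2)
```

On this data a1 comes out as 2 and b1 as 5, while a2 is roughly 3.36: the slope has to absorb the offset that the missing bias can no longer represent, which is exactly the a1 vs. a2 difference described in the question.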

And now that we know “b” we can judge its impact and decide how to proceed. But I would definitely always start with the bias. Let's see what others have to say.
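On the batch norm part of the question, the situation is a bit different from plain linear regression: the normalization step subtracts the mini-batch mean, so any constant added before it is cancelled out. Below is a minimal sketch in plain Python, assuming the usual formulation (subtract the batch mean, divide by the batch standard deviation); the batch_norm helper and the numbers are hypothetical and only for illustration.

```python
import math

def batch_norm(z, eps=1e-8):
    """Normalize one unit's pre-activations across a mini-batch (illustrative helper)."""
    mean = sum(z) / len(z)
    var = sum((v - mean) ** 2 for v in z) / len(z)
    return [(v - mean) / math.sqrt(var + eps) for v in z]

# Pre-activations w*x for a single unit over a mini-batch (made-up values)
z = [0.5, -1.2, 3.0, 0.7]
b = 4.0  # a constant bias added before normalization

with_bias = batch_norm([v + b for v in z])
without_bias = batch_norm(z)

# Largest element-wise difference between the two normalized outputs
max_diff = max(abs(p - q) for p, q in zip(with_bias, without_bias))
print("max difference:", max_diff)
```

The difference is zero up to floating-point error, so under this assumption a bias placed before the normalization cannot influence the output or the weights. The learnable shift applied after normalization (beta in the course notation) takes over the role of the bias.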