Why do we use b in the equation?

It is not entirely clear which equation you are referring to. I assume you mean y = a*x + b? In that case b is the intercept (bias) term, and you can think of it as the average target value.

If all features (X) are normalized to have zero mean, then in the average case all X's are zero. The only term left is b, so b is the prediction for an average input.
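To make this concrete, here is a small sketch using NumPy. The data, the coefficients, and the variable names are all made up for illustration, and it assumes an ordinary least-squares fit with mean-centered features. Under those assumptions the fitted b should match the mean of y.

```python
import numpy as np

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

# Center each feature so its mean is zero
X_centered = X - X.mean(axis=0)

# Append a column of ones so the last coefficient plays the role of b
A = np.column_stack([X_centered, np.ones(len(y))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a, b = coef[:-1], coef[-1]

print("fitted b:   ", b)
print("mean of y:  ", y.mean())
```

If you skip the centering step, b will generally differ from the mean of y, because it then describes the prediction at X = 0 rather than at the average X.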

Hello @vivek16pawar

Welcome to the community.

Here is the link to an article where we discuss the same topic.