Why do we include an intercept b in regression? It makes intuitive sense when it's a 2-d plot, but how do we make sense of it in multivariate regression? Also, is this b similar to the b in an ANN, since an ANN is nothing but regression if there is no activation function?
The equation of a line is given by y = wx + b, where w is the slope and b is the intercept. If you set b = 0, the equation becomes y = wx, and the line is then forced to pass through the origin: y = 0 whenever x = 0.
Suppose we want a line that does not pass through the origin. Let’s say the dataset shows y = 1 when x = 0. To model this, the line has to pass through y = 1 at x = 0. Going back to the equation of the line, y = wx + b, with x = 0 the wx term vanishes and we are left with y = b. Only by setting b = 1 can we get y = 1.
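To make this concrete, here is a minimal sketch using NumPy least squares. The toy data (y = 2x + 1) and the variable names are my own assumptions for illustration, not from any particular dataset.

```python
import numpy as np

# Toy data (assumed for illustration): y = 2x + 1, so y = 1 when x = 0
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Fit WITHOUT an intercept: y ~ w * x (line is forced through the origin)
w_no_b = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

# Fit WITH an intercept: y ~ w * x + b, done by adding a column of ones
A = np.column_stack([x, np.ones_like(x)])
w, b = np.linalg.lstsq(A, y, rcond=None)[0]

# Predictions at x = 0
pred_at_zero_no_b = w_no_b * 0.0    # always 0, whatever w_no_b is
pred_at_zero_with_b = w * 0.0 + b   # equals b

print("no intercept:   w =", w_no_b, " prediction at x=0:", pred_at_zero_no_b)
print("with intercept: w =", w, " b =", b, " prediction at x=0:", pred_at_zero_with_b)
```

Without b, the fit has to distort the slope to compensate, and it still predicts 0 at x = 0. With b, it can recover both the slope and the offset.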
Moving on from the 2-d case to the n-d case, the equation becomes: y = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b
In the case of y = 1 (say) when x_1, x_2, ..., x_n are all 0, every w_i x_i term vanishes, so we again need b (here b = 1) to be able to satisfy the equation.
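The same idea in the n-d case can be sketched as follows. The weights, the intercept, and the random inputs are made-up values for illustration only; the point is that the prediction at the all-zeros input is exactly b.

```python
import numpy as np

# Made-up example: 3 features, known weights and intercept, no noise
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
w_true = np.array([0.5, -2.0, 3.0])
b_true = 1.0
y = X @ w_true + b_true

# Append a column of ones so the last coefficient plays the role of b
A = np.column_stack([X, np.ones(len(X))])
coef = np.linalg.lstsq(A, y, rcond=None)[0]
w_hat, b_hat = coef[:-1], coef[-1]

# When all x_i are 0, every w_i * x_i term vanishes and only b is left
y_at_origin = np.zeros(3) @ w_hat + b_hat

print("estimated w:", w_hat)
print("estimated b:", b_hat)
print("prediction at x = 0:", y_at_origin)
```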
In a nutshell, the values of w control the angle/slope of the line (or of the hyperplane, in the n-d case), while the value of b shifts it up or down so that it no longer has to pass through the origin.
This b is the very same b that we come across in neural networks, where it is usually called the bias. Of course, a neural network is not a single logistic regression, but several logistic-regression-like units stacked up, each computing a weighted sum of its inputs plus its own b before the activation. Hence there will not be a single b, but several of them, one for each unit.
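As a rough sketch of "one b per unit", here is a single layer with 3 units and 2 inputs. The function names `sigmoid` and `dense_layer` and the numbers in W and b are hypothetical, chosen just for illustration; I am assuming a sigmoid activation so that each unit looks like a logistic regression.

```python
import numpy as np

def sigmoid(z):
    # Logistic function, applied element-wise
    return 1.0 / (1.0 + np.exp(-z))

def dense_layer(x, W, b):
    # Each row of W, together with the matching entry of b,
    # is one logistic-regression-like unit: sigmoid(w . x + b)
    return sigmoid(W @ x + b)

# Hypothetical layer: 3 units, 2 inputs, so 3 weight rows and 3 biases
W = np.array([[1.0, -1.0],
              [0.5,  2.0],
              [0.0,  0.0]])
b = np.array([0.0, 1.0, -1.0])

# With an all-zeros input, the weights contribute nothing,
# so each unit's output depends only on its own b
x = np.zeros(2)
out = dense_layer(x, W, b)
print(out)
```

If you drop the sigmoid, each unit is just w . x + b again, which is the linear-regression form from above.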
aaah it makes sense now thank you : )