Why is the formula for logistic regression not shifted along the x-axis by 0.5, as follows?
f(x)=1/(1+e^-(wx+b-0.5)) ?
The reason I think we should add that “-0.5” is so that the value of the sigmoid function and the value of the linear function wx+b are equal at y = 0.5 (since 0.5 is halfway between the two possible target values, 0 and 1, in the data set).
So when the linear function predicts a value less than 0.5 then the sigmoid function predicts a value less than 0.5 and vice versa.
The problem I’m seeing with the default sigmoid function is that when the linear function is predicting a value between 0 and 0.5 (hence “False” output), the sigmoid function will predict a value greater than 0.5 (hence “True” output).
What am I missing?
The decision boundary is at f(x) = 0.5, because that’s the midpoint of the sigmoid function (which ranges from 0 to 1).
It is not the value of the linear function that is used directly to make the decision: it is the output of sigmoid. So a linear output between 0 and 0.5 does not predict “False”. The point is that sigmoid(0) = 0.5 and sigmoid is monotonically increasing. So if the linear value is ≤ 0, then sigmoid will be ≤ 0.5 and we interpret that as predicting “False”. If the linear output is > 0, then sigmoid > 0.5 and we predict “True”. We are interpreting the sigmoid output as being the probability of “True”.
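To make that concrete, here is a small Python sketch (my own illustration, not code from the course; the function names and the sample values of z are made up) that compares the rule “sigmoid(z) > 0.5” with the rule “z > 0” for a few values of the linear output z = wx+b:

```python
import math

def sigmoid(z):
    """Standard logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_true(z):
    """Decision rule on the sigmoid output, read as P(True)."""
    return sigmoid(z) > 0.5

# Hypothetical linear outputs z = w*x + b, including one between 0 and 0.5.
sample_z = [-2.0, -0.25, 0.0, 0.25, 2.0]

for z in sample_z:
    # Thresholding sigmoid at 0.5 gives the same answer as thresholding z at 0.
    print(f"z = {z:+.2f}  sigmoid = {sigmoid(z):.3f}  "
          f"predict True: {predict_true(z)}  z > 0: {z > 0}")
```

For z = 0.25 the prediction is “True”, and that is the intended behavior: the threshold on the linear output is 0, not 0.5. Note also that a fixed offset like the “-0.5” in the question only relabels the bias (b' = b - 0.5), so training can learn either form equally well.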
The thing some people have suggested is “why don’t we use tanh instead of sigmoid and then use tanh(z) > 0 as True”. That is discussed on this thread.
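As a side note on that suggestion, here is a hedged sketch (again my own, not taken from the linked thread) showing that the two rules pick the same side of the boundary, since tanh(z) = 2·sigmoid(2z) - 1. The test values are arbitrary and deliberately avoid extremely small |z|, where floating-point rounding of sigmoid near 0.5 could make the comparison unreliable:

```python
import math

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def same_decision(z):
    """True when 'tanh(z) > 0' and 'sigmoid(z) > 0.5' agree for this z."""
    return (math.tanh(z) > 0) == (sigmoid(z) > 0.5)

def identity_gap(z):
    """Absolute difference between tanh(z) and 2*sigmoid(2z) - 1."""
    return abs(math.tanh(z) - (2.0 * sigmoid(2.0 * z) - 1.0))

# Arbitrary illustrative values of the linear output z.
test_z = [-3.0, -1.0, -0.1, 0.0, 0.1, 1.0, 3.0]

for z in test_z:
    print(f"z = {z:+.1f}  same decision: {same_decision(z)}  "
          f"identity gap: {identity_gap(z):.2e}")
```

So the choice between the two is about the output range (-1 to 1 versus 0 to 1, the latter being readable as a probability), not about where the decision boundary sits.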