Hi,

Looking at the example you give to justify why linear regression does not work for classification, it seems adjusting the decision threshold would help. What do you think?

Also, what about linear regression with polynomial features? Couldn’t we approximate the sigmoid with a polynomial function?

Thanks.

One problem is that you’d be trying to adjust the threshold while the weight values are still being learned: every update to the weights moves the fitted line, so the best threshold is a moving target.

Another problem is that new data might call for a different threshold, but you can’t modify the threshold once training is done.
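To make the threshold problem concrete, here’s a minimal numpy sketch on toy 1-D data of my own invention (not from the course): a single extreme but correctly labeled example drags the least-squares line, moving the point where the prediction crosses 0.5.

```python
import numpy as np

def lstsq_line(x, y):
    # closed-form least-squares fit of y ≈ w*x + b
    A = np.column_stack([x, np.ones_like(x)])
    w, b = np.linalg.lstsq(A, y, rcond=None)[0]
    return w, b

# toy 1-D data: label 0 = benign, 1 = malignant
x = np.array([1., 2., 3., 4., 6., 7., 8., 9.])
y = np.array([0., 0., 0., 0., 1., 1., 1., 1.])

w, b = lstsq_line(x, y)
cut = (0.5 - b) / w            # x where the fitted line crosses 0.5 (here: 5.0)

# add one extreme, but still clearly malignant, example
x2 = np.append(x, 50.)
y2 = np.append(y, 1.)
w2, b2 = lstsq_line(x2, y2)
cut2 = (0.5 - b2) / w2         # the 0.5 crossing shifts past some malignant points

print(cut, cut2)
```

With the outlier added, the 0.5 crossing moves from 5.0 to roughly 6.6, so the malignant example at x = 6 now falls on the wrong side, even though the extra data point was labeled correctly.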

Using the sigmoid squashes every output into the range (0, 1), so the threshold doesn’t need to move.
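A quick pure-Python sketch of that bounding property: no matter how extreme the input, the sigmoid’s output stays between 0 and 1, with 0.5 exactly at zero.

```python
import math

def sigmoid(z):
    # logistic sigmoid: maps any real z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# outputs remain in (0, 1) even for extreme inputs
for z in (-100, -1, 0, 1, 100):
    print(z, sigmoid(z))
```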

Yes, you could approximate the sigmoid with a polynomial, but that would likely cost more computation than just evaluating the sigmoid itself. A polynomial is also unbounded, so it can only match the sigmoid’s S shape on a finite interval; far outside that interval its values blow up.
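Here’s a numpy sketch of that limitation (the degree 7 and the interval [-5, 5] are arbitrary choices of mine): the polynomial fit tracks the sigmoid closely on the fitted range but diverges wildly just outside it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# least-squares polynomial fit of the sigmoid on [-5, 5]
z = np.linspace(-5, 5, 200)
coeffs = np.polyfit(z, sigmoid(z), deg=7)
poly = np.poly1d(coeffs)

inside = np.max(np.abs(poly(z) - sigmoid(z)))   # small on the fitted interval
outside = abs(poly(20.0) - sigmoid(20.0))       # huge just outside it

print(inside, outside)
```

So the approximation only works where the training data lives; a new input far from that range would get a nonsensical “probability.”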

Thanks for the quick answer.

I thought I would pick the threshold manually after comparing the learned line with the training set. The threshold would not be a learned parameter of the model. In any case, the sigmoid approach avoids having to pick any threshold. Great feedback. Thanks.

On second thought, we also have to pick a threshold when using the sigmoid. The choice is easier because the sigmoid’s S shape makes it more discriminating, and we can then move the threshold to adjust the model’s recall/precision balance so that, for instance, we don’t miss any malignant tumors.
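To illustrate that recall/precision trade-off, here’s a small numpy sketch on hypothetical predicted probabilities (the numbers are made up for illustration): lowering the threshold catches more of the true positives.

```python
import numpy as np

# hypothetical predicted malignancy probabilities and true labels
p = np.array([0.05, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90])
y = np.array([0,    0,    1,    0,    1,    1,    1])

def recall(threshold):
    # fraction of actual positives that the model flags at this threshold
    pred = p >= threshold
    return (pred & (y == 1)).sum() / (y == 1).sum()

print(recall(0.5), recall(0.3))  # 0.75 vs 1.0: the lower threshold misses no malignant case
```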

There is rarely a good reason to use a threshold other than 0.5: that’s the point of symmetry between 0 and 1, and the natural default for a well-calibrated model. The main exception is when misclassification costs are asymmetric, as in the malignant-tumor example above.