Can logistic regression be replaced with ordinary linear regression?

Yes, correct @vasyl.delta!

This would be a One-vs-Rest approach; see also the minimal sketch below.
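
Just as an illustration (not from the original thread; the iris dataset and the max_iter value are placeholders I chose), One-vs-Rest wraps one binary logistic regression per class:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)  # placeholder multiclass dataset

# One binary logistic regression is fitted per class (that class vs.
# all the others); prediction picks the class with the highest score.
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X, y)
print(ovr.predict(X[:5]))
```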

This would also be an option:

OneVsOneClassifier constructs one classifier per pair of classes. At prediction time, the class which received the most votes is selected. In the event of a tie (among two classes with an equal number of votes), it selects the class with the highest aggregate classification confidence by summing over the pair-wise classification confidence levels computed by the underlying binary classifiers.
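
Along the same lines, a hedged sketch of One-vs-One (again with the same placeholder data):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier

X, y = load_iris(return_X_y=True)  # placeholder multiclass dataset

# One binary classifier is fitted for every pair of classes; the final
# prediction is the class that collects the most pairwise votes.
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000))
ovo.fit(X, y)
print(ovo.predict(X[:5]))
```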

See also: 1.12. Multiclass and multioutput algorithms — scikit-learn 1.3.2 documentation

Best regards
Christian

Thank you very much!
You also kindly mentioned, with regard to multiclass regression:

  • or alternatively if the loss minimised is the multinomial loss fit across the entire probability distribution

Could you please give a short hint about what this stands for?

This is an approach in scikit-learn, and it is described here:

multi_class {‘auto’, ‘ovr’, ‘multinomial’}, default=’auto’
… For ‘multinomial’ the loss minimised is the multinomial loss fit across the entire probability distribution

Note that in this concept, linear predictor functions are used. I would suggest reading through this source, where the concept and the math are explained, to see how all calculated probabilities sum up to 1.
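
As a small, non-authoritative illustration of the option quoted above (the iris data is again just a placeholder; multi_class is the parameter from the 1.3.2 docs quoted above):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # placeholder multiclass dataset

# 'multinomial' fits a single model whose softmax output covers all
# classes at once, so each row of predicted probabilities sums to 1.
clf = LogisticRegression(multi_class="multinomial", max_iter=1000)
clf.fit(X, y)

proba = clf.predict_proba(X[:3])
print(proba)
print(proba.sum(axis=1))  # each row sums to 1.0
```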

Best regards
Christian

Thank you so much, a very important tutorial!


I would like to share some thoughts regarding the cost function for logistic regression (with the sigmoid activation function).
In the lecture, Prof. Andrew said that the cost function for logistic regression (based on binary cross-entropy) is convex. I tried to check this on several examples and found that it is not so clear (see the example on the plot).
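
For reference, here is the cost in question written out in LaTeX (this is the standard binary cross-entropy form; the notation is mine, with m training examples and the model f(x) = sigma(wx + b)):

```latex
J(w, b) = -\frac{1}{m} \sum_{i=1}^{m}
  \left[ y^{(i)} \log \sigma\!\left(w x^{(i)} + b\right)
       + \left(1 - y^{(i)}\right) \log\!\left(1 - \sigma\!\left(w x^{(i)} + b\right)\right) \right],
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```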

Also, in some artificial cases (e.g. x = [0 1 2 3 4 5 6 7 8 9]; y = [0 0 0 0 0 1 1 1 1 1]) the cost function does not have a minimum at all: it decreases asymptotically to zero as the weight w goes to infinity. A sketch reproducing this is below.
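
A minimal sketch to reproduce this (the grid of w values and the choice to fix the decision boundary at x = 4.5, i.e. b = -4.5 * w, are just one illustrative path through parameter space):

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)

def cost(w, b):
    # Binary cross-entropy for the 1-D logistic model p = sigma(w*x + b)
    p = expit(w * x + b)
    eps = 1e-12  # guard against log(0) when p saturates at 0 or 1
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Keep the decision boundary fixed at x = 4.5 and let w grow:
for w in [1.0, 5.0, 10.0, 50.0]:
    print(f"w = {w:5.1f}   J = {cost(w, -4.5 * w):.6f}")
# J keeps shrinking toward 0 as w -> infinity: the data are perfectly
# separable, so no finite (w, b) minimises the cost.
```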