Mathematical Inquiry

Hi folks!

  • Week 3 Lab.

  • I have a basic question about something I can’t quite wrap my head around. Why do we use the logarithm when calculating the loss in logistic regression? Wouldn’t it be simpler to just calculate the difference between the predicted and actual values to get the same result? It seems like the log is doing something similar, but in a different form.

    Also, why do we use e (the natural exponential) in the sigmoid function? Why not just apply a simple threshold like in linear regression, where values above a certain threshold return 1 and values below return 0?

  • Could you recommend some math books to help build a strong grasp of these fundamentals?

Hey Abdullah,

In a linear regression model you assume there’s a linear relationship between the predictor(s) x and some numeric / continuous outcome y. Your model says: for each one-unit increase in x, y increases by w on average. That’s why the loss is mean squared error; you’re literally trying to minimize the squared distance between the predicted and actual numeric values.
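
To make that concrete, here’s a tiny NumPy sketch (my own toy example, not from the lab) of the mean squared error that linear regression minimizes:

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean squared error: the average squared distance between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)

# Toy data (made up for illustration): a one-feature linear model y_hat = w * x + b
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])
w, b = 2.0, 0.0

y_pred = w * x + b
print(mse_loss(y, y_pred))  # a small value means the line is close to the numeric targets
```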

In logistic regression, the y you are trying to predict isn’t continuous; it is binary (0 or 1). Because of that, you can’t model y directly as a linear function of x: a linear function can output any real number, but a probability has to stay between 0 and 1. So instead you model the log odds of y=1 (something happening) as a linear function of x. After modeling the log odds, you convert them back into a probability with the sigmoid function, and that’s exactly where e shows up.
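
Here’s a small sketch of that chain (again just my own illustration, with made-up weights and data): the model produces log odds as a linear function of x, the sigmoid maps the log odds to a probability, and the log loss penalizes confident wrong answers:

```python
import numpy as np

def sigmoid(z):
    """Map log odds (any real number) to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y_true, p):
    """Binary cross-entropy: -log(p) when y = 1, -log(1 - p) when y = 0."""
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Toy weights and data (made up for illustration)
w, b = 1.5, -2.0
x = np.array([0.5, 1.0, 2.0, 3.0])
y = np.array([0, 0, 1, 1])

log_odds = w * x + b      # linear in x, can be any real number
p = sigmoid(log_odds)     # squashed into (0, 1), so it can be read as P(y = 1)
print(p)
print(log_loss(y, p))
```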

I hope that helps!

A great way to learn stats fast? Check out Josh Starmer’s YouTube channel, StatQuest.

3 Likes

The two types of regression have different goals.

  • Linear regression attempts to create a model that mimics the data, and the output range is all real numbers.

  • Logistic regression attempts to create a boundary that splits the data into two regions, and its output range is limited to 0.0 (False) to 1.0 (True); see the sketch below.
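
A minimal sketch of that boundary idea (hypothetical weights, just for illustration): the model outputs a probability between 0 and 1, and the 0.5 line is what splits the feature space into the two predicted classes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical fitted weights for a one-feature logistic model
w, b = 2.0, -4.0

x = np.linspace(0.0, 4.0, 9)
p = sigmoid(w * x + b)            # always between 0.0 and 1.0
labels = (p >= 0.5).astype(int)   # the 0.5 cut corresponds to w * x + b = 0, i.e. x = 2

for xi, pi, li in zip(x, p, labels):
    print(f"x={xi:.1f}  p={pi:.3f}  predicted class={li}")
```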

1 Like

Thanks for your response! Here’s what I noticed: I initially wondered why we can’t just take the difference between the predicted and actual values, but then I considered the case where the actual value is 1. If we use a linear error function like (1−x) * constant, the error never blows up: at a predicted value of 0 it just intersects the y-axis at that constant, whereas the log loss −log(x) goes to infinity there.

On the other hand, the gradient of this linear error function stays constant, whereas the logarithmic error function used in logistic regression behaves differently: its gradient starts out steep when the prediction is far from the actual value and gradually becomes smaller as we approach the optimal solution. This makes the log function more suitable, because it allows for larger updates initially (when the model is far from the optimum) and smaller, more refined updates as the model converges, which helps it approach the minimum error more effectively.
Also, thanks for recommending the YouTube channel, I really appreciate it. I’m excited to learn from it.
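
To see that difference numerically (my own sketch, with an arbitrary constant): for a linear error like (1−p) * c the slope with respect to p is the constant −c everywhere, while the slope of −log(p) is −1/p, which is huge when the prediction is badly wrong (p near 0 with a true label of 1) and small as p approaches 1.

```python
import numpy as np

# Predicted probabilities for examples whose true label is 1
p = np.array([0.01, 0.1, 0.5, 0.9, 0.99])

# Linear error (1 - p) * c: its gradient w.r.t. p is the constant -c everywhere
c = 5.0  # arbitrary scaling constant, just for illustration
linear_grad = np.full_like(p, -c)

# Log loss -log(p): its gradient w.r.t. p is -1/p, steep when the prediction is far off
log_grad = -1.0 / p

for pi, lg, gg in zip(p, linear_grad, log_grad):
    print(f"p={pi:.2f}  linear grad={lg:7.1f}  log-loss grad={gg:8.1f}")
```
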

2 Likes