Logistic Regression using the sigmoid function

The equation z = w \cdot x + b does not represent a probability directly. Instead, it is an intermediate value that is transformed by the sigmoid function.

This function’s output is always strictly between 0 and 1, so it can be interpreted as a probability. The decision boundary occurs at z = 0, where:

\sigma(0) = \frac{1}{1 + e^{0}} = \frac{1}{2}

When z = 0, the model predicts a probability of 0.5.

The placement of the decision boundary is determined by the choice of w and b, not by manually shifting the function by subtracting constants like 0.5.
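To make the role of w and b concrete, here is a minimal sketch for the single-feature case. The (w, b) pairs are made up purely for illustration, and `boundary_x` is a hypothetical helper name, not something from the course. It only shows that the point where the sigmoid outputs 0.5 moves when w and b change.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def boundary_x(w, b):
    """Single-feature decision boundary: the x at which w * x + b = 0."""
    return -b / w

# Hypothetical (w, b) pairs, chosen only for illustration.
params = [(1.0, -2.0), (1.0, -3.0), (2.0, -3.0)]

for w, b in params:
    x_star = boundary_x(w, b)
    p = sigmoid(w * x_star + b)  # z is 0 at the boundary, so p is 0.5
    print(f"w={w}, b={b}: boundary at x={x_star}, sigmoid there = {p}")
```

In each case the boundary sits at a different x, but the sigmoid value at the boundary is the same, because z is 0 there.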

Can you tell me why you want to do this? Please explain mathematically using the terms and concepts presented in the course.

What do you mean by “…an intermediate value…” and “…transformed…”? Please use only mathematics and terms used in the course.

The problem is that z = 0 produces a probability of 0.5, but z = w \cdot x + b = 0 does not produce a probability of 0.5, because w \cdot x + b = 0 occurs where the line w \cdot x + b crosses the horizontal tumor-size axis, and there the data set has a probability of 0.

By “intermediate value,” I mean that the expression z = w \cdot x + b is not itself a probability.

The term “transformed” means that this value z is passed through the sigmoid function, which maps z to a value between 0 and 1. The transformation is essential because the output of the linear expression is not constrained to the [0, 1] range required for a probability.
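As a rough illustration of that point, the sketch below prints z next to sigmoid(z) for a few inputs. The values of w, b, and the tumor sizes are assumptions picked only for the example; they do not come from the course data.

```python
import math

def sigmoid(z):
    """Logistic function applied to the linear value z."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters and tumor sizes, for illustration only.
w, b = 1.5, -4.0
sizes = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

zs = [w * x + b for x in sizes]      # linear values: can be negative or exceed 1
probs = [sigmoid(z) for z in zs]     # after the sigmoid: always between 0 and 1

for x, z, p in zip(sizes, zs, probs):
    print(f"x={x:.1f}  z={z:+.2f}  sigmoid(z)={p:.3f}")
```

The z column runs below 0 and above 1, so it cannot be read as a probability, while the sigmoid column stays inside (0, 1).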

I have already done that.

It might be helpful to rewatch the course to gain a clearer understanding of why and where these values are used.

When z = w \cdot x + b = 0, this does indeed result in the sigmoid output \sigma(0) = 0.5 (which represents a 50% probability). Mathematically, z = 0 is the point where the decision boundary lies. At this boundary the model is equally likely to classify a data point as belonging to either class, so the probability of each class is 0.5.
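Written out step by step, using the course's definition of the sigmoid and reading \sigma(z) as the probability of class 1 (so 1 - \sigma(z) is the probability of class 0), the two classes are equally likely exactly when:

```latex
\begin{aligned}
\sigma(z) = 1 - \sigma(z)
  &\iff \sigma(z) = \frac{1}{2} \\
  &\iff \frac{1}{1 + e^{-z}} = \frac{1}{2} \\
  &\iff 1 + e^{-z} = 2 \\
  &\iff e^{-z} = 1 \\
  &\iff z = 0 \\
  &\iff w \cdot x + b = 0
\end{aligned}
```

So the points where w \cdot x + b = 0 are the points where the predicted probability is 0.5. The value of z itself is not a probability, so z being 0 does not mean the probability is 0.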

I don’t think you understand the mathematics of logistic regression using the sigmoid function with an input computed from linear regression of the data set.

I’m not sure I understand what you’re expecting to get as an answer. Could you clarify what you’re looking for?

Which model? The logistic regression model or the linear regression model?

The logistic regression model. We’re not talking about linear regression, are we?

Please read my previous posts including the original one.

Linear regression is used to compute z.

Yes, linear regression is used to compute z, but the overall model is logistic regression because we use the sigmoid to map the output to a probability.

OK, so can you tell me more about what the linear regression model w \cdot x + b looks like when plotted on the graph of tumor size against probability?

I’m not sure what you mean by “…do this…”. What are you referring to?

@ai_is_cool, I think the confusion is coming from thinking of the linear regression as a separate step that we are trying to optimize independently. This is logistic regression, where we are trying to optimize using the function \sigma(z), where z is w \cdot x + b. As you point out, z = 0 corresponds to a probability of 0.5, and this happens when z = w \cdot x + b = 0. That’s what we’re trying to solve for.

See this screenshot from Prof. Ng’s logistic regression Gradient Descent Implementation video that highlights the difference in the function we’re using for logistic regression:

Hello Wendy,

Thank you so much for your reply.

Isn’t linear regression used to determine w and b from the dataset of tumor size and diagnosis of benign (0) or malignant (1)?

And then the linear regression model w \cdot x + b is used in logistic regression by setting its value of z equal to the linear regression prediction model?

I can see you have labelled a mathematical expression from the course as “looks like linear regression” but this is exactly linear regression.

Stephen.

Hi Stephen,

That label in the screenshot that says “looks like linear regression” is actually from Prof. Ng’s video. He is showing how the general idea for solving with logistic regression looks like linear regression. In both cases, you are solving for w and b, but for logistic regression, the function you’re using is the logistic regression function shown in the screenshot.

It’s important to keep in mind that we’re solving for w and b in training, so for logistic regression we need to use the full logistic function, so that the sigmoid function is involved in determining w and b.
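As a rough sketch of what that means in practice, here is a plain-Python gradient descent loop for the single-feature case. The data set, learning rate, and iteration count are assumptions made up for this example, and `gradient_step` is a hypothetical helper name. The point to notice is that the prediction inside the update is sigmoid(w * x + b), not w * x + b on its own.

```python
import math

def sigmoid(z):
    """Logistic function applied to the linear value z."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical training data: tumor sizes and labels (0 = benign, 1 = malignant).
x_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [0, 0, 0, 1, 1, 1]

def gradient_step(w, b, xs, ys, alpha):
    """One gradient descent update for single-feature logistic regression."""
    m = len(xs)
    dw = 0.0
    db = 0.0
    for x, y in zip(xs, ys):
        err = sigmoid(w * x + b) - y  # the sigmoid is part of the prediction
        dw += err * x
        db += err
    return w - alpha * dw / m, b - alpha * db / m

# Assumed starting point, learning rate, and iteration count.
w, b = 0.0, 0.0
for _ in range(10000):
    w, b = gradient_step(w, b, x_train, y_train, alpha=0.1)

print(f"w={w:.3f}, b={b:.3f}, decision boundary at x={-b / w:.3f}")
```

The update has the same shape as the one for linear regression, which is why the slide says it "looks like" linear regression, but the error term is computed from the sigmoid output, so w and b are fitted to the logistic model rather than to a straight line through the 0/1 labels.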

Hi Wendy,

Thanks for your reply.

So how are w and b updated and determined in logistic regression if linear regression is not used?

Stephen.