Week 2, logistic regression: 2 questions

Igor_Goldberg · July 24, 2023, 9:01am

Hello everyone, I am new here. Please help me with 2 issues (I am not sure whether I need to separate them into 2 different topics).
Question 1

Let’s assume that I have a linear regression and that, as I understand it, its predictions should be translated into logistic regression predictions via the sigmoid function.
My question is this:
Case 1)
Do I understand correctly that if I were doing only linear regression, then the best-fit line could be, let’s say, some y = a1*x + b1?
Case 2)
But if I intend to do logistic regression with the same dataset, then the best-fit straight line (before the sigmoid function is applied) is not the same line as in case 1, but some other straight line y = a2*x + b2.
Am I right?
----------
Question 2:
Can logistic regression also work on a dataset that is predicted by nonlinear regression? I mean, can a sigmoid function be applied to the output of a nonlinear function?
--------
Thank you very much

rmwkwok · July 24, 2023, 9:14am

Hello @Igor_Goldberg,

I think your understanding of question 1 is correct, but you can verify it by actually fitting a linear regression model and a logistic regression model on the same dataset with binary labels.
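One way to run that check is sketched below in plain Python. The eight data points are made up for illustration, and the names `fit_linear` and `fit_logistic` are hypothetical helpers, not course code. The linear fit uses the closed-form least-squares line; the logistic fit uses plain gradient descent on the cross-entropy loss, with a learning rate and step count chosen by hand for this toy data.

```python
import math

def fit_linear(xs, ys):
    """Closed-form least-squares line y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    return a, my - a * mx

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=20000):
    """Gradient descent on the mean cross-entropy loss of sigmoid(a*x + b)."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errs = [sigmoid(a * x + b) - y for x, y in zip(xs, ys)]
        a -= lr * sum(e * x for e, x in zip(errs, xs)) / n
        b -= lr * sum(errs) / n
    return a, b

# Made-up toy data with binary labels (not from the course).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

a1, b1 = fit_linear(xs, ys)      # case 1: line fitted directly to the labels
a2, b2 = fit_logistic(xs, ys)    # case 2: line inside the sigmoid

print(f"linear:   y = {a1:.3f}*x + {b1:.3f}")
print(f"logistic: z = {a2:.3f}*x + {b2:.3f}")
```

On this toy data the least-squares line works out to a1 = 1/6 and b1 = -0.25, while the line inside the sigmoid comes out noticeably steeper, so the two cases do give different (a, b) pairs.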

For question 2, if you have a trained nonlinear function, you use it to convert all of your samples to outputs, and then you take those outputs as inputs to train a logistic regression model, then yes, you can absolutely do that. In the case I described, however, we are not training the two models at the same time, but one after another. If you want to train both at the same time as one model, that is also possible, but the result may no longer be a logistic regression model.
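A minimal sketch of the "one after another" case, under loud assumptions: the function `g` below is only a stand-in for a nonlinear function that has already been trained and is now fixed, and the seven data points are invented for illustration. Only `w` and `b` of the logistic regression are trained in the second stage.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def g(x):
    """Stage 1: stand-in for an already-trained, fixed nonlinear function (assumed)."""
    return -(x - 4.0) ** 2

# Made-up toy data: label 1 near x = 4, label 0 far from it.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 1, 1, 0, 0]

# Convert every sample with the fixed nonlinear function first.
gs = [g(x) for x in xs]

# Stage 2: ordinary logistic regression on the converted values.
w, b = 0.0, 0.0
lr, steps = 0.05, 10000
for _ in range(steps):
    errs = [sigmoid(w * gi + b) - y for gi, y in zip(gs, ys)]
    w -= lr * sum(e * gi for e, gi in zip(errs, gs)) / len(gs)
    b -= lr * sum(errs) / len(gs)

preds = [1 if sigmoid(w * gi + b) >= 0.5 else 0 for gi in gs]
print(preds)
```

The raw x values cannot be split by a single threshold here, but the converted values g(x) can, which is why the second stage is able to label every point correctly.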

Cheers,
Raymond

Igor_Goldberg · July 24, 2023, 9:44am

Thank you very much.
Regarding your suggestion for question 1, that is actually what I did in Excel to check myself.

For now I do not yet know how to automate the best fitting of the a and b coefficients; I guess it has to do with the next topics and back-prop.

rmwkwok · July 24, 2023, 9:58am

Alright! That is going to be a fun thing to verify later on!

Raymond

Igor_Goldberg · July 25, 2023, 4:03am

Regarding question 2, I thought about it a bit, and maybe I was not clear. I will formulate it again, if I may:
Sometimes, to fit the data better, we do not use a straight line (y = ax + b) but a polynomial of higher degree. In a video about logistic regression we see an example where the output of a first-degree linear regression (inner shell) is fed into a sigmoid function (outer shell).
A)
My question was not as advanced as you may have thought. I was just asking whether, in theory, the concept of logistic regression still works if I feed the output of a nonlinear regression into the logistic function. (I didn’t mean training things separately, like transfer learning or something.)

B) Assuming that the answer to A is “Yes”, here is another question:
I have m=10 data points, each with only n=1 predictor. Let’s say they go up and then, after some peak, go back down. It is quite obvious that for NON-logistic regression the best fit here will be some kind of 2nd-degree polynomial (say, parabola style). Now, if I DO want a logistic regression here, I will feed the inner function’s output into the sigmoid, but the weights that change during back-propagation are the weights of the first function (not the sigmoid).
Thus, after the best weights are fitted, will I always find that the inner shell remains a 2nd-degree polynomial, or may it have changed to, say, 3rd degree or 1st degree, even though the data was visually arranged in a parabola style?

rmwkwok · July 25, 2023, 9:35am

Hello @Igor_Goldberg,

Let me put it in the way we usually do it, and please see if you can adjust your current way of thinking into how we do it?

Let’s say you have 3 data points and each point has one feature only, and it is a binary problem:

x_1    y
3      0
4      1
5      0

Now you suspect that you need some higher-degree polynomial terms of x_1 for this logistic regression problem, but you have no idea what degree of polynomial you need. Then the first thing is to “engineer” some polynomial features:

x_1    x_2 = x_1^2    x_3 = x_1^3    y
3      9              27             0
4      16             64             1
5      25             125            0

Now, I said you have no idea what degree you need, so how can you decide that you need x_2 = x_1^2 and x_3 = x_1^3? The answer is: you need to guess it. It’s OK that you don’t know, but you can guess. You need to try different degrees to find the one that gives a good fit. In my case above, I am trying up to the third degree.

Then you normalize this new dataset and fit it to a logistic regression model like this:

y = sigmoid(w_1x_1 + w_2x_2 + w_3x_3 +b)

Now, x_2 and x_3 are outputs of some non-linear functions because they are x_2 = x_1^2 and x_3 = x_1^3. You can also do something like x_4 = x_1^2 + \sin(x_1) if you want to. The thing is, you have to know how to do those conversions in advance so that you can create the new dataset like I did. If you CANNOT create the new dataset, then you might be thinking about something different.
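The steps described above (engineer the features, normalize, fit) can be sketched in plain Python as follows. This is only an illustration on the three data points from the table: the helper names `standardize` and `mean_loss`, the choice of zero-mean/unit-variance normalization, and the learning rate and step count are all assumptions, not something prescribed by the course.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def standardize(column):
    """Rescale one feature column to zero mean and unit standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

def mean_loss(rows, ys, w, b):
    """Mean cross-entropy loss of sigmoid(w_1*x_1 + w_2*x_2 + w_3*x_3 + b)."""
    total = 0.0
    for row, y in zip(rows, ys):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b)
        p = min(max(p, 1e-12), 1 - 1e-12)  # keep log() away from 0
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(rows)

x1 = [3.0, 4.0, 5.0]
y = [0, 1, 0]

# Engineer the polynomial features by hand: x_2 = x_1^2, x_3 = x_1^3.
x2 = [v ** 2 for v in x1]
x3 = [v ** 3 for v in x1]

# Normalize each column, then regroup into one row of three features per sample.
rows = list(zip(standardize(x1), standardize(x2), standardize(x3)))

# Fit three weights and one bias with gradient descent.
w, b = [0.0, 0.0, 0.0], 0.0
loss_before = mean_loss(rows, y, w, b)
lr, steps = 0.5, 5000
for _ in range(steps):
    errs = [sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - yi
            for row, yi in zip(rows, y)]
    w = [wj - lr * sum(e * row[j] for e, row in zip(errs, rows)) / len(rows)
         for j, wj in enumerate(w)]
    b -= lr * sum(errs) / len(rows)
loss_after = mean_loss(rows, y, w, b)

print(x2, x3)
print(loss_before, loss_after)
```

Three points are far too few for a meaningful fit; the point of the sketch is only the order of operations. The model never sees anything but the three precomputed columns, so the nonlinearity lives entirely in the dataset, not in the model.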

Cheers,
Raymond

Igor_Goldberg · July 25, 2023, 2:30pm

Thank you! I guess I have got what I missed; that is probably why my questions didn’t make sense to you.

Please tell me if I now understand the following correctly:
If I want to train a logistic regression model, then as the dataset’s ground truth I do not actually get anything except 0 or 1? I mean, I do not get (as I thought before) any continuous ground-truth data, right?
Another thing, when you say:

The answer is: you need to guess it

You did not actually mean guessing; it is more about fitting the right weights, and also the right number of weights, automatically during the optimization and back-propagation process, right?

rmwkwok · July 25, 2023, 2:39pm

Hello @Igor_Goldberg,

That’s right. We expect ground truth labels for logistic regression to be either 0 or 1 only.
You guess/decide what polynomial terms are needed. You engineer those terms to create the new dataset. Each of those terms (x_1, x_2, x_3) has an associated trainable weight (w_1, w_2, w_3). The training process decides the best values for those weights.
Training process = optimisation process = a process that uses gradient descent. Gradient descent uses back propagation.
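For the three-feature model in this thread, the quantities that gradient descent works with can be written out as below. The symbols m (number of training examples), \alpha (learning rate), J (cost) and a^{(i)} (the model output for example i) are assumed notation added here for illustration; only w_1, w_2, w_3, b and the sigmoid come from the posts above.

```latex
\begin{aligned}
z^{(i)} &= w_1 x_1^{(i)} + w_2 x_2^{(i)} + w_3 x_3^{(i)} + b,
\qquad a^{(i)} = \mathrm{sigmoid}\big(z^{(i)}\big) \\
J &= -\frac{1}{m} \sum_{i=1}^{m} \Big[ y^{(i)} \log a^{(i)} + \big(1 - y^{(i)}\big) \log\big(1 - a^{(i)}\big) \Big] \\
\frac{\partial J}{\partial w_j} &= \frac{1}{m} \sum_{i=1}^{m} \big(a^{(i)} - y^{(i)}\big)\, x_j^{(i)},
\qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \big(a^{(i)} - y^{(i)}\big) \\
w_j &:= w_j - \alpha \, \frac{\partial J}{\partial w_j},
\qquad
b := b - \alpha \, \frac{\partial J}{\partial b}
\end{aligned}
```

The number of weights is fixed by the number of feature columns chosen beforehand; the updates only change the values of w_1, w_2, w_3 and b.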

Is this clear?

Cheers,
Raymond

Igor_Goldberg · July 25, 2023, 2:52pm

When you say:

You guess/decide what polynomial terms are needed

Is this decision about choosing the network architecture? In particular, how many neurons and layers are needed? (I have read that the more neurons there are in a hidden layer, the more weights are adjusted, and more weights are equivalent to more degrees of the polynomial.) Right?

rmwkwok · July 25, 2023, 4:07pm

For example, I decided to use the degree 2 polynomial term (x_1^2) and the degree 3 polynomial term (x_1^3) in my previous reply.

Can you see the meaning of polynomial terms? From your last reply, I am not sure if you understand what polynomial terms mean.

You take the square of x_1 to get the second-degree polynomial term. x_1 to the power of three is the third-degree polynomial term. That’s it. Don’t overcomplicate things.

It is just simple algebra, and we are talking about logistic regression, not a multi-layer neural network.

rmwkwok · July 25, 2023, 4:22pm

y = sigmoid(w_1x_1 + w_2x_2 + w_3x_3 +b)

The above is how we can formulate a logistic regression. There are no multiple layers in such a formulation.

If you are actually interested in a multilayer neural network, then logistic regression is not it.

Logistic regression is not the name for a multilayer neural network.

Week 2 isn’t yet fully about a multilayer neural network and you probably want to go through Week 3 and Week 4 first.

Igor_Goldberg · July 25, 2023, 4:24pm

Oh! Finally you got what I am asking.
But the expression y=sigmoid(w1x1+w2x2+w3x3+b) means that there is 1 layer with 3 neurons.
Thus my question was: do I understand you correctly that if I choose a 3rd-degree polynomial, it will appear in this example as 1 hidden layer with 3 neurons? Right?

rmwkwok · July 25, 2023, 4:27pm

That is wrong. There is no hidden layer. It is one output layer with one neuron. That neuron accepts three input features and computes one output. That neuron has three weights and one bias. That neuron has a sigmoid function as its activation function.
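As a small illustration of that single neuron, here is a sketch in plain Python. The function name `neuron` and the weight and bias values are hypothetical and untrained; they are there only to show that three features go in, one number comes out, and there are four parameters in total.

```python
import math

def neuron(features, weights, bias):
    """One output neuron: weighted sum of the inputs plus bias, then sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return z, 1.0 / (1.0 + math.exp(-z))

x1 = 4.0
features = [x1, x1 ** 2, x1 ** 3]   # three input features built from one raw value
weights = [0.5, -0.1, 0.01]         # arbitrary illustrative values, not trained
bias = -1.0

z, out = neuron(features, weights, bias)
n_params = len(weights) + 1         # three weights + one bias

print(z, out, n_params)
```

Three inputs do not mean three neurons: all three features feed the same weighted sum, and the neuron returns a single probability.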

Igor_Goldberg · July 25, 2023, 4:29pm

Weren’t we talking about a case with 1 independent variable?

rmwkwok · July 25, 2023, 4:31pm

y = sigmoid(w_1x_1 + w_2x_2 + w_3x_3 +b)

Out of the above symbols, can you tell me which one is that independent variable?

Igor_Goldberg · July 25, 2023, 4:33pm

Yes. I want to predict from a person’s salary whether they are going to purchase my product or not.
salary = independent variable
purchase or not = dependent variable
btw, thank you for your patience

Igor_Goldberg · July 25, 2023, 4:35pm

out of those symbols
x1,x2,x3 are 3 independent vars

rmwkwok · July 25, 2023, 4:41pm

@Igor_Goldberg,

The story is this:

1. You start off with x_1 only, and y of course.
2. You think some non-linear term helps, therefore I suggested x_2 = x_1^2 and x_3 = x_1^3 as examples.
3. You use a calculator to compute x_2 = x_1^2.
4. You use a calculator to compute x_3 = x_1^3.
5. Now we have x_1, x_2, x_3.
6. We have three input features.
7. We have three weights and one bias.
8. We have y = sigmoid(w_1x_1 + w_2x_2 + w_3x_3 +b)
9. That is logistic regression.
10. If you want to discuss y = sigmoid(w_1x_1 + w_2x_2 + w_3x_3 +b) in the language of neural networks, it’s OK. It is one neuron that accepts three FEATURES: x_1, plus the x_2 and x_3 that are generated from x_1. They are THREE features.
11. That neuron is in a layer called the “Output layer”, because it produces the output. It is NOT a hidden layer. A hidden layer is NOT an output layer. These are names.
12. That neuron has a sigmoid as its activation.
13. Points 10, 11, and 12 apply when you want to discuss a logistic regression formulation in the vocabulary of neural networks.

Is the flow clear?

rmwkwok · July 25, 2023, 4:48pm

If you are looking for a different flow that accepts only x_1 as a feature, with a model that learns the non-linear features out of just x_1, then there need to be hidden layers, but this is NOT called logistic regression.
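For contrast, a forward pass of such a different flow might look like the sketch below. Everything in it is an assumption made for illustration: the function name `one_hidden_layer_net`, the three hidden units, the tanh hidden activation, and the weight values, which are arbitrary and untrained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def one_hidden_layer_net(x1, hidden_w, hidden_b, out_w, out_b):
    """A hidden layer of tanh units fed only x_1, followed by one sigmoid output neuron."""
    hidden = [math.tanh(w * x1 + b) for w, b in zip(hidden_w, hidden_b)]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

# Arbitrary, untrained numbers used only to show the shapes involved.
hidden_w, hidden_b = [1.0, -0.5, 0.3], [-3.0, 2.0, 0.0]
out_w, out_b = [1.0, 1.0, -1.0], 0.0

p = one_hidden_layer_net(4.0, hidden_w, hidden_b, out_w, out_b)

# 3 hidden weights + 3 hidden biases + 3 output weights + 1 output bias
n_params = len(hidden_w) + len(hidden_b) + len(out_w) + 1

print(p, n_params)
```

Here the only input is x_1, and the hidden units play the role that the hand-made columns x_1^2 and x_1^3 play in the logistic regression flow, except that their weights would be learned during training.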

Logistic regression DOES NOT HAVE multiple layers. You will not see multiple layers in a logistic regression problem in Course 1 Week 2.

If you want multiple layers, Course 1 Week 3 and Week 4 will cover that, but they are not called logistic regression.

rmwkwok · July 25, 2023, 4:57pm

I have been discussing your question assuming you want to model it with the logistic regression setting. If you indeed want to do it in a logistic regression, then this post is the flow. If you actually are not looking for that flow, then that means we should not be discussing it as a logistic regression problem, and I recommend that you go through Week 3 and Week 4 first.
