Visualizing Squared Error Cost function for Logistic regression in 2D

Nourullah_Ozturk · February 15, 2024, 12:17pm

Hi,

I’m trying to understand why the squared error cost function shouldn’t be used for logistic regression.

I understand that the non-convex nature of the cost function can cause gradient descent to get stuck at local minima instead of converging to the global minimum.

But I don’t quite understand why or how the squared error cost function for logistic regression is non-convex. To understand the situation better, I decided to plot the squared error cost function for logistic regression using a simplified model (f=wx), as we did for linear regression in the lecture C1_W1_12-“Cost function intuition”.

Note: I know that in the lab “C1_W3_Lab04_LogisticLoss_Soln.ipynb” we plot the squared error cost function for logistic regression, but that was a 3D plot. I want to see it in a 2D plot, like the one drawn in the bottom-right corner below:

Here is what I did so far:
logistic_regresyon.ipynb (73.1 KB)

In the end I plotted the cost function as shown below, but it still seems to be a convex function. Shouldn’t there be lots of local minima, like in the picture above?

TMosh · February 15, 2024, 7:22pm

There’s a very long and complicated mathematical proof online somewhere that shows that if you use the sigmoid() activation with the squared-error cost function, the second partial derivative of the cost is not always non-negative. A non-negative second derivative is a requirement for convex functions.

You may be able to find the proof online. I don’t have a link to it handy.
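For the one-parameter case the sign change can be checked by hand. The following is a sketch for a single training example (x, y), assuming the model is sigmoid(wx) with no bias term and the per-example cost is half the squared error; the symbols s and J are notation introduced here, not taken from the course materials.

```latex
s = \sigma(wx), \qquad \frac{ds}{dw} = x\,s(1-s), \qquad J(w) = \tfrac{1}{2}\,(s - y)^2

\frac{dJ}{dw} = (s - y)\,s(1-s)\,x

\frac{d^2J}{dw^2} = x^2\,s(1-s)\,\bigl[\,s(1-s) + (s-y)(1-2s)\,\bigr]

y = 1:\quad \frac{d^2J}{dw^2} = x^2\,s\,(1-s)^2\,(3s-1) \;<\; 0 \ \text{ when } s < \tfrac{1}{3}

y = 0:\quad \frac{d^2J}{dw^2} = x^2\,s^2\,(1-s)\,(2-3s) \;<\; 0 \ \text{ when } s > \tfrac{2}{3}
```

So under these assumptions the curvature goes negative whenever the prediction is badly wrong (below 1/3 for a positive example, above 2/3 for a negative one). That is enough to break convexity, even if the curve still has only one visible minimum.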

rmwkwok · February 17, 2024, 3:46am

There is a little bump.
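A bump like this can also be found numerically rather than by eye. Below is a minimal sketch, assuming a one-parameter model sigmoid(w*x) with no bias and a made-up toy dataset (the values in `xs` and `ys` and the helper names are hypothetical, not taken from the attached notebook). It evaluates the squared error cost on a grid of w values and looks for points where the second finite difference is negative, which cannot happen for a convex function.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def squared_error_cost(w, xs, ys):
    """Mean of 0.5 * (sigmoid(w*x) - y)^2 over the data (no bias term)."""
    total = sum(0.5 * (sigmoid(w * x) - y) ** 2 for x, y in zip(xs, ys))
    return total / len(xs)

# Hypothetical toy data, chosen only for illustration.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0.0, 0.0, 1.0, 1.0]

# Evaluate the cost on a grid of w values from -6 to 6.
step = 0.1
ws = [-6.0 + step * i for i in range(121)]
costs = [squared_error_cost(w, xs, ys) for w in ws]

# Central second difference approximates the second derivative of J(w).
curvature = [
    (costs[i - 1] - 2.0 * costs[i] + costs[i + 1]) / step ** 2
    for i in range(1, len(ws) - 1)
]

# Grid points where the curve bends downward (negative curvature).
concave_ws = [ws[i + 1] for i, c in enumerate(curvature) if c < 0]
has_concave_region = len(concave_ws) > 0

print("non-convex on this grid:", has_concave_region)
if has_concave_region:
    print("negative curvature between w =", min(concave_ws), "and w =", max(concave_ws))
```

On this toy data the cost keeps decreasing as w grows, so there is no second local minimum, yet the curvature is negative for part of the negative-w range. A 2D plot can therefore look roughly bowl-shaped and still fail the convexity test. To see it visually, the same `ws` and `costs` lists can be passed to any plotting library.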

TMosh · February 17, 2024, 5:37am

Good eye!
