Visualizing the Squared Error Cost Function for Logistic Regression in 2D


I’m trying to understand why the squared error cost function shouldn’t be used for logistic regression.

I understand that the non-convex nature of the cost function can cause gradient descent to get stuck in a local minimum instead of converging to the global minimum.

But I don’t quite understand why or how the squared error cost function for logistic regression is non-convex. To understand this better, I decided to plot the squared error cost function for logistic regression using a simplified model (f=wx), as we did for linear regression in the lecture C1_W1_12-“Cost function intuition”.
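A minimal sketch of the kind of 1D cost curve described here, in case it helps to compare against. It assumes the simplified model is f = sigmoid(w*x) with no bias term and uses the 1/(2m) squared error convention. The data points `xs`/`ys` and the helper names are made up for illustration and are not taken from the attached notebook.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

def squared_error_cost(w, xs, ys):
    """Squared error cost J(w) = 1/(2m) * sum((sigmoid(w*x) - y)^2).

    Assumes a single weight w and no bias (hypothetical simplified model).
    """
    m = len(xs)
    return sum((sigmoid(w * x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Hypothetical toy data set, chosen only for illustration.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0, 0, 0, 1, 1, 1]

# Evaluate the cost on a grid of w values from -10 to 10.
ws = [i * 0.1 for i in range(-100, 101)]
costs = [squared_error_cost(w, xs, ys) for w in ws]

# (ws, costs) can be passed to any plotting library to draw J(w) in 2D.
```

Plotting `costs` against `ws` gives the 2D curve. Depending on the data, the non-convexity may show up only as flat regions or a small bump rather than as many distinct local minima.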

Note: I know that the lab “C1_W3_Lab04_LogisticLoss_Soln.ipynb” plots the squared error cost function for logistic regression, but that is a 3D plot. I want to see it as a 2D plot, like the one drawn in the lower right corner below:

Here is what I did so far:
logistic_regresyon.ipynb (73.1 KB)

In the end I plotted the cost function as shown below, but it still looks like a convex function. Shouldn’t there be lots of local minima, like in the picture above?

There’s a very long and complicated mathematical proof online somewhere showing that if you use the sigmoid() activation with the squared-error cost function, the second partial derivative of the cost is not always non-negative. A second derivative that is non-negative everywhere is a requirement for a convex function.

You may be able to find the proof online. I don’t have a link to it handy.
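Short of the full proof, the sign of the second derivative can be checked numerically. The sketch below is only an illustration under assumptions: a single hypothetical training example (x = 1, y = 1), the model f = sigmoid(w*x) with no bias, and a central finite difference in place of the analytic derivative.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, x=1.0, y=1.0):
    """Squared error cost for one hypothetical example: 0.5 * (sigmoid(w*x) - y)^2."""
    return 0.5 * (sigmoid(w * x) - y) ** 2

def second_derivative(w, h=1e-3):
    """Central finite-difference estimate of d^2 J / dw^2 at w."""
    return (cost(w + h) - 2.0 * cost(w) + cost(w - h)) / (h * h)

# A convex function needs a non-negative second derivative everywhere,
# so a single negative value is enough to rule out convexity.
for w in (-3.0, -1.0, 0.0, 2.0):
    print(f"w = {w:5.1f}   J''(w) ~ {second_derivative(w):+.5f}")
```

For this one-example case the estimate is negative at w = -3 and positive at w = 0, so the curve bends downward in one region and upward in another, which a convex function cannot do.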


There is a little bump.

Good eye!