Hi,
I’m trying to understand why the squared error cost function shouldn’t be used for logistic regression.
I understand that the non-convex nature of the cost function causes gradient descent to get stuck in local minima instead of converging to the global minimum.
But I don’t quite understand why or how the squared error cost function for logistic regression is non-convex. To better understand the situation, I decided to plot the squared error cost function for logistic regression using a simplified model (f = wx), as we did in the lecture C1_W1_12 “Cost function intuition” for linear regression.
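To be concrete, this is the cost I have in mind (my assumption: the sigmoid is applied on top of the simplified model, i.e. f_w(x) = g(wx), with the usual 1/(2m) scaling):

$$
J(w) = \frac{1}{2m}\sum_{i=1}^{m}\Big(g\big(w\,x^{(i)}\big) - y^{(i)}\Big)^{2},
\qquad g(z) = \frac{1}{1+e^{-z}}
$$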
Note: I know that in the lab “C1_W3_Lab04_LogisticLoss_Soln.ipynb” we plot the squared error cost function for logistic regression, but that was a 3D plot; I want to see it as a 2D plot like the one drawn in the bottom-right corner below:
Here is what I have done so far:
logistic_regresyon.ipynb (73.1 KB)
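Roughly, the idea is the following (a simplified sketch, not the exact code in the attached notebook; the toy data points here are just made up for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy binary-classification data (made up just for this sketch)
x_train = np.array([-3.0, -1.0, 0.5, 1.0, 2.0, 4.0])
y_train = np.array([0, 0, 1, 0, 1, 1])

def squared_error_cost(w, x, y):
    """Squared error cost with the sigmoid model f_w(x) = g(w*x)."""
    f = sigmoid(w * x)
    return np.mean((f - y) ** 2) / 2

# Sweep w over a range and plot J(w) as a 2D curve
w_values = np.linspace(-10, 10, 400)
J_values = [squared_error_cost(w, x_train, y_train) for w in w_values]

plt.plot(w_values, J_values)
plt.xlabel("w")
plt.ylabel("J(w)")
plt.title("Squared error cost for f = sigmoid(w*x)")
plt.show()
```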
In the end I plotted the cost function as shown below, but it still seems to be a convex function. Shouldn’t there be lots of local minima, like in the picture above?