Hi
I noticed that the squared error cost function always leads to a J(w) that has no local minima other than the global minimum. Why is that so?
And why do other cost functions (such as those used in neural nets, which are not squared error) have multiple minima?
Thanks