Yes, as Tom says, the usual cost functions for linear regression (MSE) and logistic regression (BCE) are convex in those cases. But note that once you use those same cost functions with multi-layer neural networks (multi-layer perceptrons), they are no longer convex. The point is that the cost function is the complete function that takes all the parameters of the network (the weights and bias values at every layer) as input and maps them to the cost, based on the training data. We are also working in very high dimensions, since a typical neural network has thousands or even millions of parameters. The resulting cost surfaces are very complex and essentially impossible to visualize with human brains that evolved to perceive only 3 spatial dimensions.
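Here is a minimal sketch (just NumPy with made-up numbers, not anything from the course) of one way to see the non-convexity. Because you can swap two hidden units without changing the function the network computes, two different parameter vectors give the exact same cost; if the cost were convex in the parameters, the midpoint between them could not have a higher cost, but it does:

```python
import numpy as np

# Toy 1-2-1 tanh network with MSE cost, written as a function of ALL parameters.
def mlp_cost(params, X, y):
    W1, b1, W2, b2 = params
    hidden = np.tanh(X @ W1 + b1)     # hidden layer (2 units)
    preds = hidden @ W2 + b2          # linear output layer
    return np.mean((preds - y) ** 2)

# Arbitrary parameter setting theta_a (values made up for illustration)
W1 = np.array([[1.5, -0.7]]); b1 = np.array([0.2, -0.3])
W2 = np.array([[0.8], [-1.1]]); b2 = np.array([0.1])
theta_a = (W1, b1, W2, b2)

# Generate the training targets from theta_a itself, so its cost is exactly 0.
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y = np.tanh(X @ W1 + b1) @ W2 + b2

# theta_b: the same network with the two hidden units swapped.
# It computes the identical function (cost 0), but it is a different
# point in parameter space.
theta_b = (W1[:, ::-1], b1[::-1], W2[::-1, :], b2)

# Midpoint of the straight line between theta_a and theta_b.
theta_mid = tuple((a + b) / 2.0 for a, b in zip(theta_a, theta_b))

print(mlp_cost(theta_a, X, y))    # 0.0
print(mlp_cost(theta_b, X, y))    # 0.0
print(mlp_cost(theta_mid, X, y))  # > 0, which a convex cost could never do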
There is a lot of math going on here; these questions have been studied extensively and, as Tom also pointed out, the experts have figured out how to make gradient descent work well in a lot of cases. If you take DLS, for example, you’ll learn about more sophisticated optimization techniques such as Adam and RMSprop that help you get efficient convergence. Here’s a thread which discusses the general point in more detail and also links to some other information about this, including a paper by Yann LeCun’s group showing that for sufficiently complex networks, there are good solutions that gradient descent has a good probability of finding.
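Just to give a flavor of what those techniques look like, here is a minimal sketch of the Adam update rule with its standard default hyperparameters, applied to a simple quadratic. The function and variable names here are mine for illustration, not from any particular assignment:

```python
import numpy as np

def adam_minimize(grad_fn, w, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    m = np.zeros_like(w)              # 1st-moment (mean) estimate of the gradients
    v = np.zeros_like(w)              # 2nd-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g          # exponentially weighted mean
        v = beta2 * v + (1 - beta2) * g ** 2     # exponentially weighted square
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return w

# Example: minimize f(w) = (w0 - 3)^2 + 10 * (w1 + 2)^2,
# whose gradient is [2 * (w0 - 3), 20 * (w1 + 2)].
grad = lambda w: np.array([2 * (w[0] - 3), 20 * (w[1] + 2)])
print(adam_minimize(grad, np.zeros(2)))   # approaches [3, -2]
```

The point of the per-parameter scaling by the second-moment estimate is that steep directions get smaller steps and shallow directions get larger ones, which is part of why these methods converge more efficiently than plain gradient descent on complicated cost surfaces.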