Why don't we use the derivative of the cost function and set it to zero to find the local minimum?

paulinpaloalto · October 24, 2023, 3:38am

As Tom says, there is no “closed form” solution in any other case but that one. It’s a perfectly reasonable question to ask why we don’t just “set the derivative to zero and solve” in order to find the minimum, but when there is no closed form solution, that approach makes things more complicated, not less. Setting the derivative to zero just gives you another equation, and you still have to solve that equation with some type of iterative approximation method. You’d probably use the multidimensional analog of Newton-Raphson to find the zeros of the first derivative. But think about what that means: now you need the second derivatives of the cost in order to find the zeros of the first derivative. That’s more work and doesn’t really give you any advantage.
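To make the comparison concrete, here are the two update rules side by side. The notation is my own assumption, not something from the course: J is the cost, θ the parameter vector, α the learning rate, and H the Hessian (the matrix of second partial derivatives of J).

```latex
% Newton-Raphson applied to the equation \nabla J(\theta) = 0:
\theta_{k+1} = \theta_k - H(\theta_k)^{-1} \, \nabla J(\theta_k),
\qquad
H(\theta)_{ij} = \frac{\partial^2 J}{\partial \theta_i \, \partial \theta_j}

% Gradient Descent applied directly to J:
\theta_{k+1} = \theta_k - \alpha \, \nabla J(\theta_k)
```

Both are iterative. The Newton step needs everything the Gradient Descent step needs, plus an n × n matrix of second derivatives and a linear solve at every iteration, where n is the number of parameters.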

In the cases we are dealing with, it just turns out to be simpler and computationally more efficient to apply Gradient Descent directly to the cost (loss) function itself.
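If it helps to see this in code, here is a minimal sketch on a toy problem. Everything in it is an assumption made up for illustration (the data in `Z`, the function names, the learning rate and step counts): a logistic regression cost with a single weight `w`, where setting the derivative to zero gives an equation with no closed form solution. It runs Gradient Descent on the cost and Newton-Raphson on the derivative, and both land on the same `w`.

```python
import math

# Hypothetical toy data: z_i = y_i * x_i for a one-weight logistic regression.
# The mix of signs keeps the minimum at a finite value of w.
Z = [0.5, -1.0, 1.5, 2.0, -1.0]


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def cost(w):
    # J(w) = mean over i of log(1 + exp(-z_i * w))
    return sum(math.log(1.0 + math.exp(-z * w)) for z in Z) / len(Z)


def grad(w):
    # First derivative dJ/dw
    return sum(-z * sigmoid(-z * w) for z in Z) / len(Z)


def hess(w):
    # Second derivative d^2J/dw^2, needed only by Newton-Raphson
    return sum(z * z * sigmoid(z * w) * sigmoid(-z * w) for z in Z) / len(Z)


def gradient_descent(w=0.0, lr=0.5, steps=500):
    # Uses only the first derivative of the cost.
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def newton_on_gradient(w=0.0, steps=20):
    # Solves grad(w) = 0 iteratively; needs the second derivative as well.
    for _ in range(steps):
        w -= grad(w) / hess(w)
    return w


w_gd = gradient_descent()
w_newton = newton_on_gradient()
print(w_gd, w_newton, grad(w_gd))
```

With one weight, Newton-Raphson takes fewer steps, and the second derivative is cheap. The trade-off changes with n parameters: `hess` becomes an n × n matrix that has to be computed and solved against at every step, which is where the extra work mentioned above comes from.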
