Why don't we get the minimum of a function mathematically instead of running gradient descent?

Suppose our cost function is J(w, b).

Why can’t we just set J’(w, b) = 0?

That is, set the derivative of the cost function to zero, find all the values that satisfy the equation, and choose the one that gives the smallest cost.

If it is a low-degree polynomial, that might be easy to do, but if it is a complex multidimensional function it is very hard. Also, finding the minimum that way, i.e. solving the resulting equations exactly, can be more computationally expensive than iteratively descending the cost with gradient descent.
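To illustrate the "easy" case, here is a minimal NumPy sketch (the data and variable names are made up for illustration): for one-feature linear regression with the squared-error cost, setting both partial derivatives to zero gives a 2x2 linear system that can be solved directly.

```python
import numpy as np

# Toy data (hypothetical): y is roughly 3*x + 2 plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=50)
m = x.size

# Setting dJ/dw = 0 and dJ/db = 0 for the squared-error cost
# J(w, b) = (1/(2m)) * sum((w*x_i + b - y_i)^2) gives a 2x2 linear system:
#   w*sum(x^2) + b*sum(x) = sum(x*y)
#   w*sum(x)   + b*m      = sum(y)
A = np.array([[np.sum(x**2), np.sum(x)],
              [np.sum(x),    m        ]])
c = np.array([np.sum(x * y), np.sum(y)])

w, b = np.linalg.solve(A, c)
print(f"w = {w:.3f}, b = {b:.3f}")  # close to 3 and 2
```

For a nonlinear model or a non-convex cost, the equations you get from setting the gradient to zero generally have no closed-form solution like this, which is where iterative methods come in.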

For linear regression, that closed-form method is called the normal equation.

It is only practical for small data sets, due to the computational complexity: forming and solving the system costs roughly O(m n^2 + n^3) for m examples and n features.
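As a rough sketch (generic NumPy, not the course's own code; the data here is made up), the normal equation solves (X^T X) theta = X^T y in one step:

```python
import numpy as np

# Hypothetical design matrix X (m examples, n features) and targets y
rng = np.random.default_rng(1)
m, n = 100, 3
X = rng.normal(size=(m, n))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 4.0 + rng.normal(0, 0.1, size=m)

# Append a column of ones so the bias b is learned as an extra weight
X_aug = np.hstack([X, np.ones((m, 1))])

# Normal equation: solve (X^T X) theta = X^T y.
# Forming and solving this system is roughly O(m*n^2 + n^3), which is why
# it stops being practical as the problem grows.
theta = np.linalg.solve(X_aug.T @ X_aug, X_aug.T @ y)
w, b = theta[:-1], theta[-1]
print("w =", w, " b =", b)  # close to true_w and 4.0
```

Using np.linalg.solve rather than explicitly inverting X^T X is both faster and numerically safer.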


You might find this article useful:


A great article. Thank you!