Gradient descent formula

I’m not a mathematician and I’m trying to get an intuitive sense of the math here.
Is this a correct alternative explanation to the formula?
Imagine a blue dot that marks our position.
Alpha (the learning rate) represents how far you move that blue dot along the horizontal axis (w),
and the derivative of the cost function represents the slope from the blue dot's prior position to its new position? (And because it's a derivative, it ensures that our blue dot stays on the curve instead of ending up somewhere else.)
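For reference, this is the update rule I'm asking about, written the way I understand it, with parameters w and b and learning rate α:

$$
w := w - \alpha \frac{\partial J(w,b)}{\partial w}, \qquad
b := b - \alpha \frac{\partial J(w,b)}{\partial b}
$$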


Yes, that seems like a good analogy.

Is it the case that the derivative can't be calculated until we update b (or w) in the gradient descent formula?

No, the derivative is calculated based on the current ‘b’ and ‘w’ values.
Then you use the derivative to compute new ‘b’ and ‘w’ values.
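For concreteness, here is a minimal sketch of one step, assuming a squared-error cost for simple linear regression (my own illustrative example, not the exact course code):

```python
import numpy as np

def gradient_descent_step(x, y, w, b, alpha):
    """One gradient descent step for a squared-error cost (illustrative sketch)."""
    m = len(x)
    y_hat = w * x + b                     # predictions using the CURRENT w and b
    dj_dw = np.sum((y_hat - y) * x) / m   # dJ/dw evaluated at the current w, b
    dj_db = np.sum(y_hat - y) / m         # dJ/db evaluated at the current w, b
    # Only after both derivatives are computed do we update the parameters.
    w_new = w - alpha * dj_dw
    b_new = b - alpha * dj_db
    return w_new, b_new

# Tiny made-up dataset where the true relationship is y = 2x
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

w, b = 0.0, 0.0
for _ in range(1000):
    w, b = gradient_descent_step(x, y, w, b, alpha=0.1)
print(w, b)  # converges toward roughly w = 2, b = 0
```

Note that both derivatives are evaluated at the current w and b before either parameter is changed, i.e. it is a simultaneous update.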

OK. But a derivative implies a change, and there is no change in b if I use the current value of b.
Sorry about all of my questions. I’ll study derivatives tonight until I have a handle on this.

It’s been years since I had calculus, so I watched these videos and found them helpful for understanding derivatives in general, and then derivatives as used in the gradient descent algorithm. Perhaps others will find them useful. :slightly_smiling_face:

about derivatives: The paradox of the derivative | Chapter 2, Essence of calculus - YouTube
about derivatives as used in gradient descent: 3.5: Mathematics of Gradient Descent - Intelligence and Learning - YouTube

The equation for the derivatives doesn’t require a change in any of the parameters.
The “change” that dJ/db represents is already accounted for when the equation is derived using calculus.
Once you have the equation, you just plug in the current values and compute the result.
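For example, if the cost is the usual squared-error cost for linear regression (an assumption on my part, but I believe it matches the notation here):

$$
J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(wx^{(i)} + b - y^{(i)}\bigr)^2
\quad\Rightarrow\quad
\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\bigl(wx^{(i)} + b - y^{(i)}\bigr)
$$

Nothing on the right-hand side requires b to change; you simply plug in the current w and b and evaluate the sum.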

Hello @JPGuittard!

Consider reading this article of mine, in which I explain how we move from one point (the blue dot, in your case) to another point.

Thank you for taking the time to recommend this article. I will study it today!