Gradient descent formula

I’m not a mathematician and I’m trying to get an intuitive sense of the math here.
Is this a correct alternative explanation to the formula?
Imagine a blue dot that marks our position.
Alpha (the learning rate) represents how far you move that blue dot along the horizontal axis (w),
and the derivative of the cost function represents the slope from the blue dot's prior position to its new position? (And because it's a derivative, it ensures that our blue dot stays on the curve instead of ending up somewhere else.)
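For reference, this is the update rule I'm asking about, written the way I understand it, with parameters w and b and learning rate α:

$$
w := w - \alpha \frac{\partial J(w,b)}{\partial w}, \qquad
b := b - \alpha \frac{\partial J(w,b)}{\partial b}
$$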


Yes, that seems like a good analogy.

Is it the case that the derivative can't be calculated until we update b (or w) in the gradient descent formula?

No, the derivative is calculated based on the current ‘b’ and ‘w’ values.
Then you use the derivative to compute new ‘b’ and ‘w’ values.
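For concreteness, here is a minimal sketch of one step, assuming a squared-error cost for simple linear regression (my own illustrative example, not the exact course code):

```python
import numpy as np

def gradient_descent_step(x, y, w, b, alpha):
    """One gradient descent step for a squared-error cost (illustrative sketch)."""
    m = len(x)
    y_hat = w * x + b                     # predictions using the CURRENT w and b
    dj_dw = np.sum((y_hat - y) * x) / m   # dJ/dw evaluated at the current w, b
    dj_db = np.sum(y_hat - y) / m         # dJ/db evaluated at the current w, b
    # Only after both derivatives are computed do we update the parameters.
    w_new = w - alpha * dj_dw
    b_new = b - alpha * dj_db
    return w_new, b_new

# Tiny made-up dataset where the true relationship is y = 2x
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

w, b = 0.0, 0.0
for _ in range(1000):
    w, b = gradient_descent_step(x, y, w, b, alpha=0.1)
print(w, b)  # converges toward roughly w = 2, b = 0
```

Note that both derivatives are evaluated at the current w and b before either parameter is changed, i.e. it is a simultaneous update.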

OK. But a derivative implies a change, and there is no change in b if I use the current value of b.
Sorry about all of my questions. I’ll study derivatives tonight until I have a handle on this.

It’s been years since I had calculus, so I watched these videos and found them helpful for understanding derivatives in general, and then derivatives as used in the gradient descent algorithm. Perhaps others will find them useful. :slightly_smiling_face:

about derivatives: The paradox of the derivative | Chapter 2, Essence of calculus - YouTube
about derivatives as used in gradient descent: 3.5: Mathematics of Gradient Descent - Intelligence and Learning - YouTube

The equation for the derivatives doesn’t require a change in any of the parameters.
The “change” that dJ/db represents is already accounted for when the equation is derived using calculus.
Once you have the equation, you just plug in the current values and compute the result.
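For example, if the cost is the usual squared-error cost for linear regression (an assumption on my part, but I believe it matches the notation here):

$$
J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(wx^{(i)} + b - y^{(i)}\bigr)^2
\quad\Rightarrow\quad
\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\bigl(wx^{(i)} + b - y^{(i)}\bigr)
$$

Nothing on the right-hand side requires b to change; you simply plug in the current w and b and evaluate the sum.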

Hello @JPGuittard!

Consider reading this article of mine, in which I explain how we move from one point (the blue dot, in your case) to another point.

Thank you for taking the time to recommend this article. I will study it today!