From my understanding, gradient descent is expected to follow a path perpendicular to the contour lines of the cost function. I understand that the learning rate will influence the path taken, e.g., too high a learning rate will lead to a zigzag path, whereas a small learning rate will lead to a smoother path towards the minimum of the cost function. But the vector from one step to the next should still be perpendicular to the contour lines. However, in the optional lab 4 on gradient descent, the visualization of the path gradient descent takes is not perpendicular to the contour lines:
Is the visualization function incorrect and the gradient descent path in fact perpendicular to the contour lines, or, if the visualization is correct, why is the path not perpendicular to the contour lines?
Hi @ptschanz, great question!
In general, the path that gradient descent follows towards the minimum of the cost function is expected to be perpendicular to the contour lines of the cost function. This is because the negative gradient of the cost function at a particular point is the direction of steepest descent, and the gradient (and therefore its negative) is perpendicular to the contour line through that point.
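To make that concrete, here is a tiny numerical sketch (an illustrative quadratic cost I made up, not the lab's code): a small step along the contour direction leaves the cost essentially unchanged, while the same-sized step along the negative gradient produces the largest drop.

```python
import numpy as np

# Illustrative elliptical cost J(w, b) with its minimum at (0, 0)
def J(w, b):
    return 2.0 * w**2 + 0.5 * b**2

def grad_J(w, b):
    return np.array([4.0 * w, 1.0 * b])    # [dJ/dw, dJ/db]

w0, b0 = 3.0, 2.0
g = grad_J(w0, b0)

# A unit vector tangent to the contour line is perpendicular to the gradient
t = np.array([-g[1], g[0]]) / np.linalg.norm(g)
d = -g / np.linalg.norm(g)                  # unit vector of steepest descent

eps = 1e-4
print(J(w0 + eps * t[0], b0 + eps * t[1]) - J(w0, b0))  # ~0: moving along the contour
print(J(w0 + eps * d[0], b0 + eps * d[1]) - J(w0, b0))  # negative: the steepest drop
```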
However, it’s important to note that the path that gradient descent follows may not always look exactly perpendicular to the contour lines, especially if the learning rate is not well-tuned. If the learning rate is too large, gradient descent may overshoot the minimum and oscillate around it, resulting in a zigzag path whose long segments cut across the contour lines rather than meeting each of them at a right angle. On the other hand, if the learning rate is too small, gradient descent takes many tiny steps and converges very slowly; each of those steps is still along the local negative gradient, so the path stays close to the perpendicular direction.
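For example, here is a rough sketch (made-up cost and learning rates, not the lab's values) showing how the same update rule produces a zigzag path for a large learning rate and a smooth one for a small learning rate:

```python
import numpy as np

def grad_J(p):
    w, b = p
    return np.array([4.0 * w, 1.0 * b])     # gradient of J = 2*w^2 + 0.5*b^2

def run_gd(alpha, steps=15, start=(3.0, 2.0)):
    p = np.array(start)
    path = [p.copy()]
    for _ in range(steps):
        p = p - alpha * grad_J(p)           # plain gradient descent update
        path.append(p.copy())
    return np.array(path)

zigzag = run_gd(alpha=0.45)   # near the stability limit in w: overshoots and flips sign
smooth = run_gd(alpha=0.05)   # small steps: follows the negative-gradient direction closely

print(zigzag[:5, 0])          # w alternates in sign -> zigzag across the valley
print(smooth[:5, 0])          # w shrinks monotonically -> smooth descent
```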
As for the visualization you mentioned, it’s possible that the visualization function is not showing the path taken by gradient descent exactly as it occurred, but rather as a smoothed version of the path for the purpose of visualization. In any case, the key takeaway is that the path taken by gradient descent should generally be in the direction of steepest descent, which is perpendicular to the contour lines at each point.
In addition: depending on the solver, momentum (along with other solver characteristics) can also influence the course of gradient descent, e.g. in the case of Adam. See also: Why not always use Adam optimizer - #4 by Christian_Simonis
The graph is just a simplified sketch. Your intuition about the expected path is correct - or at least sufficiently correct for Week 1 of an introduction to Machine Learning.
In addition to what the other mentors said: gradient descent is the basic convergence algorithm for tuning the weights, and its behavior also depends on the learning rate. If the contour plot of the cost versus the weights is elliptical rather than circular, or if the gradient oscillates around the global minimum of the cost, other optimization algorithms such as Adam or momentum can be more powerful. If you want to learn more about these algorithms, I advise you, after completing this specialization, to start the Deep Learning Specialization, which will teach you much more about these fantastic algorithms.
Thank you very much for all your replies. I understood the part about overshooting, but I expected that when I decreased the learning rate, the path would more closely follow a path perpendicular to the contour lines. But this was not the case when I ran gradient descent with different learning rates using the provided plotting function, so I started doubting my intuition.
In this case, I’ll settle for general intuition at this time, but will keep this in mind for later when I look more deeply into various optimizers and when taking the Deep Learning Specialization.
If you use the zoomed-in version of the contour plot with the gradient descent path, then you can see the path is perpendicular to the contour lines. Not sure why the first plot (not zoomed in) doesn’t show the path being perpendicular to the contour lines.
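One possible explanation (an assumption on my part; I haven't checked the lab's plotting code) is that the full-range contour plot uses very different scales on the two axes, so an angle that is 90° in (w, b) coordinates does not look like 90° on screen, while zooming in happens to make the two scales more similar. A small sketch with a made-up cost illustrates the effect:

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up elliptical cost, not the lab's data or plotting code. The short segment
# drawn at (w0, b0) points along the negative gradient, so it is exactly
# perpendicular to the contour through that point in (w, b) coordinates --
# but it only *looks* perpendicular when both axes use the same scale.
w = np.linspace(-40, 40, 200)
b = np.linspace(-4, 4, 200)
W, B = np.meshgrid(w, b)
J = (W / 10.0)**2 + B**2

w0, b0 = 30.0, 3.0
g = np.array([2 * w0 / 100.0, 2 * b0])      # gradient of J at (w0, b0)
seg = -0.5 * g                              # a short step along the negative gradient

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
for ax, title in [(ax1, "default (auto) axis scaling"),
                  (ax2, "equal axis scaling")]:
    ax.contour(W, B, J, levels=12)
    ax.plot([w0, w0 + seg[0]], [b0, b0 + seg[1]], lw=2)
    ax.set_title(title)
ax2.set_aspect("equal")                     # 90° in data coordinates now looks like 90°
plt.show()
```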