Hi

I am revising this section after completing Parts I and II, and I have more or less understood that the MSE cost function looks like an inverted bell curve. I understand how the gradient descent formula can find its minimum.

However, on a 3D hill, how does the gradient descent formula know which *direction* out of the 360 degrees to take? From what I could visualize, it just takes any direction and keeps going down.

Can you please explain this section of the video, from 4:58 to 5:24?

Thank You,

Venkat S

Time: 4:58 - 5:24

Gradient descent | Coursera

Your goal is to start up here and get

to the bottom of one of

these valleys as efficiently as possible.

What the gradient descent algorithm does is,

you’re going to spin around 360 degrees

and look around and ask yourself,

if I were to take

a tiny little baby step in one direction,

and I want to go downhill as quickly

as possible to one of these valleys.

What direction do I choose to take that baby step?

Well, if you want to walk down

this hill as efficiently as possible,

it turns out that if you’re standing

at this point in the hill and you look around,

you will notice that the best direction to take

your next step downhill is roughly that direction.

Mathematically, this is

the direction of steepest descent.

It means that when you take a tiny baby little step,

this takes you downhill faster than

a tiny little baby step you could

have taken in any other direction.
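To make the "spin around 360 degrees" idea concrete, here is a minimal sketch, not taken from the course. It assumes a made-up bowl-shaped surface `cost(w, b) = w**2 + 3 * b**2` in place of the real cost function, and an arbitrary starting point and step length. It tries a tiny step in each of 360 directions, then compares the best one with the direction given by the negative gradient (the partial derivatives with the sign flipped). The two should agree to within a degree, which suggests the formula does not need to look around at all: the derivatives already encode the steepest downhill direction.

```python
import math

def cost(w, b):
    # Hypothetical bowl-shaped surface standing in for the cost J(w, b)
    return w**2 + 3 * b**2

def gradient(w, b):
    # Partial derivatives of the hypothetical cost above: dJ/dw and dJ/db
    return 2 * w, 6 * b

w, b = 2.0, 1.0   # assumed starting point on the hill
step = 1e-4       # length of the "tiny baby step"

# Brute force: take a step of the same length in each of 360 directions
# and keep the direction that ends at the lowest cost.
best_angle = min(
    range(360),
    key=lambda deg: cost(w + step * math.cos(math.radians(deg)),
                         b + step * math.sin(math.radians(deg))),
)

# Direction of the negative gradient, expressed as an angle in [0, 360)
dw, db = gradient(w, b)
descent_angle = math.degrees(math.atan2(-db, -dw)) % 360

print("best direction by looking around:", best_angle)
print("direction of negative gradient:  ", round(descent_angle, 2))

# One gradient descent update with an assumed learning rate alpha
alpha = 0.1
w_new, b_new = w - alpha * dw, b - alpha * db
print("cost before:", cost(w, b), "cost after:", cost(w_new, b_new))
```

The update `w - alpha * dw`, `b - alpha * db` moves both parameters at once, so the direction of the step is fixed by the relative sizes of the two partial derivatives rather than chosen at random.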

After taking this first step,

you’re now at this point on the hill over here