Learning Rate - C1_W2_Lab03

Alaa_Elshorbagy · February 15, 2023, 4:21pm

Hi, I have a question regarding the example in the 3rd optional lab in week 2.
For 𝛼 = 9e-7, the value of w_0 oscillates around the minimum, and the cost function didn’t get below 40000 until the 6th iteration.
For 𝛼 = 1e-7, the value of w_0 didn’t oscillate, and the cost function decays much faster.
However, the notes state something different: "On the left, you see that cost is decreasing as it should. On the right you can see that 𝑤_0 is decreasing without crossing the minimum. Note above that dj_w0 is negative throughout the run. This solution will also converge, though not quite as quickly as the previous example."
So I am confused now. Could you explain this further?
A.

AbdElRhaman_Fakhry · February 15, 2023, 5:10pm

Hi @Alaa_Elshorbagy
Welcome to the community!

This lab shows how a good learning rate 𝛼 behaves: what happens to the cost, and how much the parameters change at each iteration.
First, what is the learning rate? It is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function.
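For reference, the gradient descent update in which 𝛼 sets the step size can be written as below. The notation (weights w_j, bias b, cost J) is assumed to follow the course's usual convention and is not quoted from the lab.

```latex
w_j := w_j - \alpha \, \frac{\partial J(\mathbf{w}, b)}{\partial w_j},
\qquad
b := b - \alpha \, \frac{\partial J(\mathbf{w}, b)}{\partial b}
```

A larger 𝛼 moves each parameter further along the negative gradient per iteration; a smaller 𝛼 moves it less.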
In the image below, the learning rate is 𝛼 = 9e-7:

The change value (the derivative of the cost with respect to the weight w0) oscillates, positive in one iteration and negative in the next, as I highlighted. It still converges, and quickly. The learning rate here is very good because:

The cost decreases in every iteration
The speed of convergence is good
It won’t diverge after some iterations

But in the image below the learning rate is 𝛼 = 1e-7, a somewhat smaller value than the previous one. Although the cost still decreases in every iteration, the speed of convergence isn’t as good as in the image above.

So, briefly, we are looking for a learning rate 𝛼 that:

Decreases the cost in every iteration
Gives the highest speed of convergence
Won’t diverge after some iterations

Note that there is a useful technique called learning rate decay (or a learning rate schedule): after some iterations the learning rate starts to decrease, to match the smaller changes we want to make to the weights.
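A minimal Python sketch of one common decay schedule, inverse-time decay, is shown here as an illustration only. The function name `decayed_learning_rate` and the `decay_rate` value are hypothetical and are not taken from the lab, which uses a fixed 𝛼.

```python
def decayed_learning_rate(alpha0, decay_rate, iteration):
    """Inverse-time decay: alpha shrinks as the iteration count grows.

    alpha0 is the starting learning rate; decay_rate (an assumed
    hyperparameter) controls how quickly it shrinks.
    """
    return alpha0 / (1.0 + decay_rate * iteration)


if __name__ == "__main__":
    # Illustrative values only: start from the thread's 9e-7.
    for it in (0, 10, 100):
        print(it, decayed_learning_rate(9e-7, 0.1, it))
```

With this schedule the early iterations take the largest steps and later iterations take progressively smaller ones.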

Cheers,
Abdelrahman

Alaa_Elshorbagy · February 16, 2023, 2:50pm

Thank you Abdelrahman for your reply. How do you know if the speed/rate of convergence is good?

As you can see below, the learning curve is flattening by the 60th iteration:

While here, the learning curve flattens by the 10th iteration, which means it is faster! What am I getting wrong?

Cheers
Alaa

Kranthi_Kiran_Jalaka · April 18, 2023, 1:30am

I am confused about this as well. Were you able to figure out the problem? Because that statement did not make sense to me when I read it. (“though not as quickly as the previous example”).

Riya_Parikh17 · April 18, 2023, 7:54pm

When alpha was larger, the cost decreased rapidly at first, but it didn’t reach the minimum quickly: the steps were so large that they kept overshooting it, so it had to oscillate around the same place many times. With a smaller alpha, each step decreases the cost more slowly, but it approaches the minimum directly without overshooting, so it took comparatively less time.
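The effect described above can be reproduced on a toy problem. The sketch below assumes a one-dimensional quadratic cost J(w) = 0.5 * a * w**2 rather than the lab's housing data, and the names `gradient_descent_1d` and `cost` and the two alpha values are chosen for illustration only. On this cost each update multiplies w by (1 - alpha * a), so alpha = 1.9 flips the sign of w every step and shrinks it by a factor of 0.9, while alpha = 0.2 keeps the sign and shrinks it by a factor of 0.8.

```python
def cost(w, a=1.0):
    """Toy quadratic cost with its minimum at w = 0."""
    return 0.5 * a * w ** 2


def gradient_descent_1d(alpha, a=1.0, w_init=10.0, num_iters=10):
    """Run plain gradient descent on the toy cost and return every w visited."""
    w = w_init
    history = [w]
    for _ in range(num_iters):
        dj_dw = a * w          # derivative of 0.5 * a * w**2
        w = w - alpha * dj_dw  # gradient descent update
        history.append(w)
    return history


large = gradient_descent_1d(alpha=1.9)  # overshoots: w changes sign each step
small = gradient_descent_1d(alpha=0.2)  # no overshoot: w stays positive

if __name__ == "__main__":
    for name, hist in (("alpha=1.9", large), ("alpha=0.2", small)):
        print(name, "final w:", hist[-1], "final cost:", cost(hist[-1]))
```

In this toy setting the larger alpha takes bigger steps yet ends with the higher cost after the same number of iterations, because much of each step is spent jumping across the minimum.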

Amaia_Elizaran · April 26, 2023, 10:58am

Hi!
I was also confused by the notes on the optional lab, but I understood what they mean after reading this reply, so thank you.
If I got it correctly, the rate of decrease is given by djdw0 in this example, and when it oscillates (alpha=9e-7), this value is larger at every iteration than when it does not oscillate (alpha=1e-7).
However, we are not only looking for the quickest rate of decrease, but also for the fastest convergence. Therefore, although alpha=9e-7 gives us a faster rate of decrease, alpha=1e-7 converges faster to the minimum.
Am I right?
If that is the case, I think the sentence “This solution will also converge, though not quite as quickly as the previous example.” in the notes is quite confusing.
Thank you!

Riya_Parikh17 · April 26, 2023, 11:15am

Yes, it may seem a bit confusing at first, but you have understood correctly.
