{quiz solution removed by mentor as we can’t share it here}
Option D would also clearly be correct in the first question if alpha were zero, so it would be better to specify the alpha value when asking this question.
Hello @Nitish_Satya_Sai_Ged,
When we train a neural network, we need a positive learning rate to make it work. The gradient, however, can be positive, negative, or zero.
Raymond
Hi Raymond,
Thanks for your reply. I understand now. The alpha should be between 0 and 1, but it can’t be either 0 or 1.
Alpha can be any positive number, including values larger than 1, but you need to verify which alpha value actually benefits the training process. Setting it to zero won’t do any updating, right? Even if the program does not complain about a zero learning rate (\alpha), we wouldn’t do that at all.
w := w - \alpha\frac{\partial{J}}{\partial{w}}
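To make that concrete, here is a minimal sketch; the toy cost J(w) = (w - 3)^2 and the starting point are my own assumptions, not from the course. It shows that a zero learning rate leaves w unchanged:

```python
# Minimal sketch: one gradient-descent update on the toy cost
# J(w) = (w - 3)^2, whose gradient is dJ/dw = 2 * (w - 3).

def grad(w):
    return 2.0 * (w - 3.0)

w = 10.0
for alpha in [0.0, 0.1]:
    w_new = w - alpha * grad(w)  # w := w - alpha * dJ/dw
    print(f"alpha={alpha}: w goes from {w} to {w_new}")

# alpha=0.0 leaves w at 10.0 (no update at all);
# alpha=0.1 moves w to 8.6, toward the minimum at w = 3.
```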
Hi Raymond,
But Andrew said in the class that it should range from 0 to 1. Of course, it shouldn’t be zero.
Can you share the source of that? At which time in which video? I think it is important to provide the source when quoting.
Thanks,
Raymond
Hello @Nitish_Satya_Sai_Ged!
Initially, it is recommended to choose a small value of alpha, between 0 and 1, and then increase or decrease it accordingly; as @rmwkwok also mentioned, alpha can be greater than one. Prof. Andrew explains this in the video below.
https://www.coursera.org/learn/machine-learning/lecture/10ZVv/choosing-the-learning-rate
Thank you @saifkhanengr!
@Nitish_Satya_Sai_Ged, if we look at the slide from the video @saifkhanengr has kindly shared with us, we see that Andrew only suggests trying those values, each roughly 3× the previous one.
With proper feature normalization (covered in Course 1 Week 2), and for the fully connected neural networks covered in this MLS, you can usually find a good learning rate in this range; if you can’t, you will still need to explore outside of it.
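Here is a hedged sketch of what such a “3×” sweep could look like in code. The linear model, the synthetic data, and the helper `run_gd` are my own illustrative assumptions, not course code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=100)          # a single, already-normalized feature
y = 4.0 * X + 1.0                 # linear target: w = 4, b = 1

def run_gd(alpha, steps=50):
    """Run gradient descent on a squared-error cost; return the final cost."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * X + b - y                # prediction error
        w -= alpha * np.mean(err * X)      # dJ/dw for J = (1/2m) * sum(err^2)
        b -= alpha * np.mean(err)          # dJ/db
    return 0.5 * np.mean((w * X + b - y) ** 2)

# Andrew's suggestion: each candidate is roughly 3x the previous one.
for alpha in [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0]:
    print(f"alpha={alpha:<6} final cost = {run_gd(alpha):.6f}")
```

Whichever alpha drives the cost down fastest without oscillating or blowing up is a reasonable pick.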
If this is not the video you were watching and you would like us to discuss another one, please feel free to share it with us. It is most effective to clear up a question where it first showed up.
Cheers,
Raymond
Dear mentors,
I want to thank both of you, and I apologize for not getting back to you sooner. I was asking precisely about this particular slide. Thank you so much for your support.

I have two follow-up questions. First, as @saifkhanengr said, alpha could be greater than 1. In what situations would a learning rate greater than one be useful?

Second, about contour plots: I have trouble with my intuition. Prof. Andrew said we slide from the top of the mountain down to the valley to reach the global minimum. But on a contour plot, the minimum cost lies on the smallest circle, and on a topographic map the smallest circle typically portrays the top of a mountain, where we would start sliding. This seems to contradict the mountain picture; please help me understand it a bit better.
Hello @Nitish_Satya_Sai_Ged,
I think the purpose of showing the mountain is just to convey the idea of contours. Can you see the similarity in the concept of a contour between a mountain and a cost surface?
In the case of a cost surface, we want to get to the lowest point. In the case of hiking a mountain, we want to get to the highest point.
The idea of “contour” is the same. The target is not the same.
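If it helps, here is a small illustrative sketch that plots the contours of a cost surface; the bowl-shaped cost J(w, b) = w² + b² is just an example I made up. Note that here the innermost circle is the lowest point, the minimum, whereas on a mountain’s topographic map the innermost contour is the peak:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy convex cost with its minimum at (w, b) = (0, 0).
w, b = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
J = w**2 + b**2

cs = plt.contour(w, b, J, levels=10)
plt.clabel(cs)                    # label each contour with its cost value
plt.xlabel("w")
plt.ylabel("b")
plt.title("Contours of J(w, b): the smallest circle is the minimum")
plt.show()
```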
I don’t have an example for it at hand. If we want one, we usually try it out ourselves with some datasets and some architectures.
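That said, a tiny numerical experiment (entirely my own toy setup, not course material) can show when a rate above 1 is fine: on J(w) = c·w², the update is w ← w·(1 − 2αc), which converges whenever |1 − 2αc| < 1, so a flat cost (small c) tolerates a large alpha:

```python
def final_w(alpha, c, steps=100, w=1.0):
    """Gradient descent on J(w) = c * w**2; dJ/dw = 2 * c * w."""
    for _ in range(steps):
        w -= alpha * 2.0 * c * w
    return w

for c in [0.1, 1.0]:                    # curvature of the toy cost
    for alpha in [0.5, 1.5, 4.0]:
        print(f"c={c}, alpha={alpha}: w after 100 steps = {final_w(alpha, c):.3g}")

# With c=0.1 even alpha=4.0 converges, while with c=1.0 alpha=1.5 already
# diverges: rates above 1 can work when gradients are small relative to
# the parameter scale, which is also why feature scaling matters.
```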
Cheers,
Raymond
@rmwkwok , I understood. Thanks for your reply.