Gradient descent learning rate

I have a doubt about gradient descent in Week 2. How do we choose the learning rate? In the video "Optimization using Gradient Descent in one variable - Part 2", the professor assumes a learning rate of 0.005. In practice, when solving real problems, how do we decide what the learning rate should be? I'm kind of stuck here.


The learning rate determines the size of the steps taken during optimization, and it plays a significant role in the convergence and performance of the model. It’s essential to note that the optimal learning rate can depend on the specific problem, architecture, and dataset. Therefore, it’s often a good idea to experiment with different approaches and choose the one that works best for your particular scenario.

Some common strategies for selecting a learning rate, which you can read more about online, are:

Adaptive Learning Rate Methods:

  • Adaptive learning rate methods, such as Adam, RMSprop, and Adagrad, dynamically adjust the learning rate during training based on past gradients. These methods can adapt to the geometry of the loss landscape.
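For intuition, here is a minimal sketch (my own illustration, not from the course) of an Adagrad-style update on a toy one-variable function f(x) = x², with assumed values for the starting point and base learning rate. The effective step size shrinks as squared gradients accumulate:

```python
import math

def f_grad(x):
    # Gradient of the toy cost f(x) = x^2
    return 2 * x

x = 5.0            # starting point (assumed)
base_lr = 0.5      # base learning rate (assumed)
grad_sq_sum = 0.0  # running sum of squared gradients
eps = 1e-8         # small constant to avoid division by zero

for step in range(50):
    g = f_grad(x)
    grad_sq_sum += g ** 2
    # Adagrad: the effective learning rate is scaled down by the
    # square root of the accumulated squared gradients.
    x -= base_lr / (math.sqrt(grad_sq_sum) + eps) * g

print(x)  # x has moved toward the minimum at x = 0 (slowly, since the step size keeps shrinking)
```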

Learning Rate Schedules:

  • Instead of using a fixed learning rate throughout training, you can use learning rate schedules. This involves starting with a higher learning rate and gradually reducing it as training progresses.
  • Common learning rate schedules include step decay, exponential decay, and 1/t decay, where t is the iteration or epoch number.
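As a rough sketch of those three schedules (the exact forms and constants vary by implementation; the values below are assumed):

```python
import math

def step_decay(lr0, epoch, drop=0.5, epochs_per_drop=10):
    # Cut the learning rate by `drop` every `epochs_per_drop` epochs
    return lr0 * (drop ** (epoch // epochs_per_drop))

def exponential_decay(lr0, epoch, k=0.05):
    # Smooth exponential decay with decay constant k
    return lr0 * math.exp(-k * epoch)

def one_over_t_decay(lr0, t, k=0.01):
    # 1/t decay: the learning rate shrinks as the iteration count t grows
    return lr0 / (1 + k * t)

lr0 = 0.1
print(step_decay(lr0, 25), exponential_decay(lr0, 25), one_over_t_decay(lr0, 25))
```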

Grid Search:

  • You can perform a grid search over a range of learning rates. Train your model with different learning rates and evaluate their performance. Choose the learning rate that gives the best results.
  • Typically, you might try values like 0.1, 0.01, 0.001, etc., and observe how the model performs.
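A minimal sketch of that idea, again on a toy one-variable cost f(x) = x² (my own example, not from the course):

```python
def run_gradient_descent(lr, steps=100, x0=5.0):
    # Run plain gradient descent on f(x) = x^2 and return the final cost
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x          # gradient of x^2 is 2x
    return x ** 2

candidate_lrs = [0.1, 0.01, 0.001]   # values suggested above
results = {lr: run_gradient_descent(lr) for lr in candidate_lrs}
best_lr = min(results, key=results.get)
print(results, "best:", best_lr)
```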

Happy learning,

Rosa


The learning rate is determined by experiment.

  • If it’s too large, the cost will diverge toward infinity.
  • If it’s too small, convergence will take too long.
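Both failure modes are easy to see on a toy cost f(x) = x² (a sketch I am adding for illustration; the learning rates are assumed values):

```python
def gradient_descent(lr, steps=20, x0=5.0):
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x          # gradient of f(x) = x^2 is 2x
    return x

print(gradient_descent(lr=1.5))    # too large: x blows up toward infinity
print(gradient_descent(lr=0.001))  # too small: x has barely moved after 20 steps
print(gradient_descent(lr=0.3))    # reasonable: x is already close to the minimum at 0
```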

Thank you, guys! 🙂
