Has anyone ever considered making the learning rate alpha adaptive to optimise convergence of the cost function to a global minimum?
It's occurred to me that the initial values of the parameter vector w could be determined by first finding the w that makes the cost function a maximum, and then setting alpha to a high value like 0.8. As the cost function converges quickly towards the global minimum with the large alpha, compute the gradient vector at each step and reduce alpha as the magnitude of the gradient gets smaller. This way, the cost function should reach its global minimum as quickly as possible, without overshooting the minimum or taking too many iterations.
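To make the idea concrete, here is a minimal sketch of what I have in mind: the step size shrinks in proportion to how much the gradient norm has shrunk since the start. The function names, the `alpha_min` floor, and the quadratic example are purely illustrative, not from any library.

```python
import numpy as np

def adaptive_gradient_descent(grad, w0, alpha0=0.8, alpha_min=1e-3,
                              tol=1e-6, max_iter=10000):
    """Toy gradient descent that scales alpha down as the gradient
    magnitude gets smaller (the scheme described above)."""
    w = np.asarray(w0, dtype=float)
    g0_norm = np.linalg.norm(grad(w))        # gradient norm at the starting point
    for _ in range(max_iter):
        g = grad(w)
        g_norm = np.linalg.norm(g)
        if g_norm < tol:                      # close enough to the minimum
            break
        # reduce alpha in proportion to how much the gradient has shrunk,
        # but keep a small floor so the updates never stall completely
        alpha = max(alpha_min, alpha0 * g_norm / g0_norm)
        w = w - alpha * g
    return w

# Example: minimise the convex cost J(w) = ||w - 3||^2, whose gradient is 2*(w - 3)
w_opt = adaptive_gradient_descent(lambda w: 2 * (w - 3.0), w0=np.zeros(2))
print(w_opt)  # should end up close to [3., 3.]
```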
Most APIs like scikit-learn that implement linear/logistic regression allow you to set an initial learning rate and to tweak the learning rate "schedule" for gradient descent. Your intuition is correct for convex loss functions with a distinct optimal solution: initially the weights are far away from the optimal weights, so the weight updates can take us closer to the optimum if the learning rate is set to a higher value (as long as you don't overshoot the optimum after the first weight update). However, the intuition is not valid if the loss (as a function of the weights) behaves differently. There are other techniques, like momentum, that perform better on not-so-well-behaved loss functions.
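For instance, scikit-learn's `SGDRegressor` exposes `eta0` (the initial learning rate) and a `learning_rate` schedule parameter: `'invscaling'` decays the rate over time, while `'adaptive'` keeps it constant until the training loss stops improving and then divides it by 5, which is close in spirit to what you describe. A minimal sketch, using made-up synthetic data for illustration:

```python
from sklearn.linear_model import SGDRegressor
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler

# synthetic regression data; SGD is sensitive to feature scale, so standardise
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X = StandardScaler().fit_transform(X)

# start with a relatively large learning rate and let the 'adaptive'
# schedule cut it down whenever the training loss stops improving
model = SGDRegressor(learning_rate="adaptive", eta0=0.1, max_iter=1000, tol=1e-4)
model.fit(X, y)
print(model.coef_)
```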