Gradient descent with Momentum

Hi Mentor,

If beta is very large, say close to 1, then we smooth out the updates very heavily. If so, what will happen in terms of optimization? Can you please help me understand?

Hello @Anbu,
Thanks for asking your question on the Discourse community. I am a mentor, and I will do my best to answer it.

Beta is the coefficient of the exponentially weighted moving average of past gradients. It appears in gradient descent with momentum and, as beta_1, in optimizers such as Adam, and it controls how much the optimizer "remembers" its previous movements. If beta is very large and close to 1, each update is dominated by the accumulated history and reacts only weakly to the current gradient, so the updates are smoothed out heavily and learning slows down. In terms of optimization, this means the algorithm responds sluggishly to changes in the loss surface and takes longer to converge; the accumulated velocity also takes many steps to decay, so it can overshoot or settle near a suboptimal point.
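To make the smoothing effect concrete, here is a minimal sketch of the momentum accumulator v = beta * v + (1 - beta) * g. The helper `momentum_updates` is a hypothetical name used only for illustration, not part of any course code or library:

```python
def momentum_updates(grads, beta):
    """Return the sequence of velocity values v_t = beta * v_{t-1} + (1 - beta) * g_t
    for a stream of gradients, starting from v_0 = 0."""
    v = 0.0
    history = []
    for g in grads:
        v = beta * v + (1 - beta) * g  # high beta -> v changes very slowly
        history.append(v)
    return history

# Feed in a constant gradient of 1.0: the velocity should approach 1.0,
# but with beta = 0.99 the very first update is only about 0.01,
# while with beta = 0.5 it is already 0.5.
slow = momentum_updates([1.0] * 5, beta=0.99)
fast = momentum_updates([1.0] * 5, beta=0.5)
print(slow[0], fast[0])
```

You can see why a beta very close to 1 slows learning: with v initialized to zero, the effective step size is scaled down by roughly (1 - beta) for many iterations, which is also why Adam applies bias correction to its moving averages.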

I hope my answer resolves your question. Please feel free to ask a follow-up question if you have additional questions related to beta.