Momentum without mini-batches

It is clear that when you use mini-batches, it's better to use momentum to "smooth" the parameter updates. But how does it work if you don't use mini-batches (not much data, so the whole dataset is one batch)? I can see that momentum clearly does something, because in practice I get different results when I train the model with the RMSProp and Adam optimizers. :thinking:
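
For concreteness, here is a minimal sketch (plain NumPy, on a hypothetical quadratic loss, using the `v = beta*v + (1-beta)*g` form of momentum) of full-batch gradient descent with and without momentum. The gradient is identical every epoch, yet the two runs trace different paths, because the velocity term is an exponential moving average of past gradients:

```python
import numpy as np

# Gradient of a hypothetical ill-conditioned quadratic loss:
# L(w) = 0.5 * (10*w0**2 + w1**2)
def grad(w):
    return np.array([10.0, 1.0]) * w

def train(use_momentum, lr=0.18, beta=0.9, epochs=30):
    w = np.array([1.0, 1.0])  # starting point
    v = np.zeros_like(w)      # velocity: moving average of past gradients
    for _ in range(epochs):
        g = grad(w)           # "full batch": the gradient is deterministic every epoch
        if use_momentum:
            v = beta * v + (1 - beta) * g
            w = w - lr * v
        else:
            w = w - lr * g
    return w

print("plain GD:     ", train(use_momentum=False))
print("with momentum:", train(use_momentum=True))
```

The two runs end at different points after the same number of epochs, so momentum changes the trajectory even when there is no mini-batch noise to smooth out.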

If you plot the two training curves (loss vs epochs) on the same graph, how do they compare?
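
Something like this would work (a minimal sketch; the loss lists below are hypothetical placeholders for the per-epoch values you recorded):

```python
import matplotlib.pyplot as plt

# Hypothetical placeholders: replace with the per-epoch losses from your two runs.
rmsprop_losses = [0.9, 0.5, 0.3, 0.2, 0.15]
adam_losses = [0.9, 0.6, 0.4, 0.25, 0.2]

plt.plot(rmsprop_losses, label="RMSProp")
plt.plot(adam_losses, label="Adam")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```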

Cheers,
Raymond