From what I understood in the lessons, mini-batch gradient descent is intended to speed up the learning process when we have a very large amount of data (on the order of millions of examples).
However, in that week's practice exercise we used it on a dataset of m = 300, and it still helped us get better results faster.
Should we then consider using mini-batch gradient descent (with momentum, RMSprop, or Adam) regardless of the size of our training set?
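To make the question concrete, here is a rough sketch of what I mean by using mini-batches even on m = 300, written in plain NumPy. The batch size of 32, the synthetic data, and the simple logistic-regression model are just my own assumptions for illustration, not taken from the assignment:

```python
import numpy as np

# Synthetic data: m = 300 examples, 2 features (assumed for illustration)
rng = np.random.default_rng(0)
m = 300
X = rng.normal(size=(m, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Simple logistic-regression parameters
w = np.zeros(2)
b = 0.0
learning_rate = 0.1
batch_size = 32  # assumed batch size

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(100):
    # Shuffle the data at the start of each epoch
    perm = rng.permutation(m)
    X_shuffled, y_shuffled = X[perm], y[perm]

    # Loop over mini-batches instead of the full training set
    for start in range(0, m, batch_size):
        X_batch = X_shuffled[start:start + batch_size]
        y_batch = y_shuffled[start:start + batch_size]

        # Forward pass and gradients computed on this mini-batch only
        y_hat = sigmoid(X_batch @ w + b)
        dw = X_batch.T @ (y_hat - y_batch) / len(y_batch)
        db = np.mean(y_hat - y_batch)

        # Parameters are updated once per mini-batch, so we take
        # many more gradient steps per epoch than full-batch GD
        w -= learning_rate * dw
        b -= learning_rate * db
```

Even with only 300 examples, the parameters get updated roughly ten times per epoch instead of once, which seems to be why it converges faster in practice.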