The best mini-batch size is usually not 1 and not m, but instead something in between

paulinpaloalto · March 10, 2024, 4:11pm

Yes, if the title is the statement you are asking about, then I think it is true. Prof Ng does discuss this in the lectures on minibatch gradient descent. There is also the famous Yann LeCun quote on this subject: “Friends don’t let friends use minibatch sizes larger than 32”.
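To make the tradeoff concrete, here is a minimal sketch of how the batch size changes the number of parameter updates you get per pass over the data. The helper name `iterate_minibatches`, the toy data, and the columns-as-examples layout are my own assumptions for illustration, not code from the course notebooks. With a size of 1 you get m noisy updates and lose the benefit of vectorization, with a size of m you get a single update per pass, and something in between gives you both vectorized computation and many updates.

```python
import numpy as np

def iterate_minibatches(X, Y, batch_size, seed=0):
    """Yield shuffled (X_batch, Y_batch) pairs.

    Assumes examples are stored as columns, so X has shape (n_x, m).
    """
    m = X.shape[1]
    rng = np.random.default_rng(seed)
    perm = rng.permutation(m)  # shuffle the example indices once per epoch
    for start in range(0, m, batch_size):
        idx = perm[start:start + batch_size]  # last batch may be smaller
        yield X[:, idx], Y[:, idx]

# Toy data: m = 1000 examples with 5 features each (purely illustrative).
m = 1000
X = np.random.default_rng(1).normal(size=(5, m))
Y = np.zeros((1, m))

# Count how many gradient updates one epoch would give for each batch size.
updates_per_epoch = {
    bs: sum(1 for _ in iterate_minibatches(X, Y, bs)) for bs in (1, 32, m)
}
print(updates_per_epoch)  # {1: 1000, 32: 32, 1000: 1}
```

Note that 1000 is not a multiple of 32, so the last minibatch in that case holds only 8 examples.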

Here’s another thread which discusses this point in a bit more detail.

