The best mini-batch size is usually not 1 and not m, but instead something in between

Ahmad_Khalid1 · March 10, 2024, 3:00pm

Topic: mini-batch gradient descent

paulinpaloalto · March 10, 2024, 4:11pm

Yes, if you are making the statement that you put into the title here, then I think that is a true statement. Prof Ng does discuss this in the lectures on minibatch gradient descent. There is also the famous Yann LeCun quote on this subject: “Friends don’t let friends use minibatch sizes larger than 32”.

Here’s another thread which discusses this point in a bit more detail.
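To make the trade-off concrete, here is a minimal sketch of how the mini-batch size changes the number of parameter updates per epoch. The function name `minibatch_indices` and the training-set size `m = 1000` are made up for illustration and are not taken from the course notebooks.

```python
import random


def minibatch_indices(m, batch_size, seed=0):
    """Shuffle the example indices 0..m-1 and split them into mini-batches.

    Hypothetical helper for illustration only.
    """
    if not 1 <= batch_size <= m:
        raise ValueError("batch_size must be between 1 and m")
    rng = random.Random(seed)
    order = list(range(m))
    rng.shuffle(order)
    # The last mini-batch is smaller when batch_size does not divide m.
    return [order[i:i + batch_size] for i in range(0, m, batch_size)]


m = 1000  # assumed training-set size
for size in (1, 32, m):
    batches = minibatch_indices(m, size)
    print(f"batch_size={size}: {len(batches)} updates per epoch, "
          f"last batch has {len(batches[-1])} examples")
```

With size 1 (stochastic gradient descent) you get m noisy updates per epoch and lose the benefit of vectorization. With size m (batch gradient descent) you get a single update per pass over the data. A value in between, such as 32, gives many updates per epoch while each update still runs as one vectorized step.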

ai_curious · March 11, 2024, 11:52am

Seems to me a tautology…if it’s size 1 it’s not a batch, and if it’s size m it’s not mini. Ergo, 1 < best < m
