The best mini-batch size is usually not 1 and not m, but something in between

Topic: mini-batch gradient descent

Yes, the statement in your title is true as a general rule. Prof Ng discusses this in the lectures on minibatch gradient descent. There is also the famous Yann LeCun quote on this subject: “Friends don’t let friends use minibatch sizes larger than 32”.

Here’s another thread which discusses this point in a bit more detail.
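
If it helps to see the three regimes side by side, here is a minimal sketch on a toy linear regression problem. It is not taken from the lectures: the helper names `minibatches` and `train_linear`, the synthetic data, and the learning rate are all made up for illustration. It only shows how the batch size changes the number of parameter updates per pass over the data.

```python
import numpy as np

def minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs that cover the data once."""
    m = X.shape[0]
    order = rng.permutation(m)
    for start in range(0, m, batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

def train_linear(X, y, batch_size, epochs=20, lr=0.05, seed=0):
    """Fit y ~ X @ w with mini-batch gradient descent on mean squared error."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    updates = 0
    for _ in range(epochs):
        for Xb, yb in minibatches(X, y, batch_size, rng):
            # Gradient of the mean squared error over this mini-batch.
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)
            w -= lr * grad
            updates += 1
    return w, updates

# Synthetic data (an assumption for this sketch): m examples, 3 features.
data_rng = np.random.default_rng(42)
m = 256
X = data_rng.normal(size=(m, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * data_rng.normal(size=m)

# batch_size = 1 is stochastic GD, batch_size = m is full-batch GD,
# and 32 is one "in between" choice.
for batch_size in (1, 32, m):
    w, updates = train_linear(X, y, batch_size)
    print(f"batch_size={batch_size:4d}  updates={updates:5d}  w={np.round(w, 3)}")
```

With the same number of epochs, size 1 makes m updates per epoch but each one is noisy and cannot use vectorization, while size m makes only one update per epoch. An in-between size gets many reasonably accurate updates per epoch, which is the trade-off the title is pointing at. Which size is actually best depends on the problem and the hardware, so treat 32 here as just an example value.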


Seems to me a tautology…if it’s size 1 it’s not a batch, and if it’s size m it’s not mini. Ergo, 1 < best < m

:nerd_face:
