Confusion Regarding Week 2 Video - 'Understanding Mini-batch Gradient Descent'

During the lecture, while explaining ‘Choosing your mini-batch size’, Sir Andrew mentions that when the mini-batch size is equal to ‘m’ (the number of training examples), it is equivalent to batch gradient descent, and when the mini-batch size is equal to 1, it is equivalent to stochastic gradient descent. I am a bit confused by this, because the explanation that follows seems to imply the opposite: that a mini-batch size of ‘m’ should correspond to stochastic gradient descent, and a mini-batch size of 1 to gradient descent. Can you please check and clarify whether this is an error in the course, or whether I am not understanding it correctly?

Hi Zeeshan,

Welcome to the community. Your initial understanding is correct:

  • Mini-batch size = 1: equivalent to Stochastic Gradient Descent
  • Mini-batch size = m: equivalent to Batch Gradient Descent
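To make the two extremes concrete, here is a rough sketch in plain NumPy (my own illustration, not code from the course; the helper name `make_mini_batches` is made up):

```python
import numpy as np

def make_mini_batches(X, Y, mini_batch_size):
    """Split (X, Y) into mini-batches of `mini_batch_size` examples each.
    X has shape (n_features, m) and Y has shape (1, m), as in the videos."""
    m = X.shape[1]                      # number of training examples
    batches = []
    for start in range(0, m, mini_batch_size):
        end = min(start + mini_batch_size, m)
        batches.append((X[:, start:end], Y[:, start:end]))
    return batches

# Toy data: m = 8 examples, 3 features each
X = np.random.randn(3, 8)
Y = np.random.randn(1, 8)

print(len(make_mini_batches(X, Y, 1)))   # 8 mini-batches of 1 example -> SGD
print(len(make_mini_batches(X, Y, 8)))   # 1 mini-batch of all m examples -> batch GD
print(len(make_mini_batches(X, Y, 4)))   # 2 mini-batches -> ordinary mini-batch GD
```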

Can you elaborate on which part of the video led to the confusion?

I was having some trouble understanding what exactly the term ‘mini-batch size’ represents. I was confused about whether it refers to the number of training examples in one mini-batch or to the total number of mini-batches that will be formed. It’s all clear now: the term refers to the number of training examples in one mini-batch. Thank you! :smiley:

Hi Somesh.
I understand that when size = m we have batch gradient descent.
When size = 1, we take only one example to compute the gradient. Once done, we repeat with the next example, and so on. We are not choosing examples randomly: we simply take the next example in order.
So I am confused by the terminology ‘stochastic’.
If you can help. Regards

Hi Stephane,

That’s an excellent question, and unfortunately I don’t have a complete answer. Based on what I know, in mini-batch gradient descent most libraries (like tf.keras) will shuffle the data by default before selecting the mini-batches (you can explicitly turn this off). This is similar to choosing samples at random without replacement.
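As a sketch of what that shuffling amounts to (my own NumPy illustration, not code from the assignment), each epoch permutes the examples once and then walks through them in that order; in tf.keras, `model.fit` does this for you because its `shuffle` argument defaults to `True`:

```python
import numpy as np

def shuffled_mini_batches(X, Y, mini_batch_size, seed=0):
    """Shuffle the examples (columns) once per epoch, then split into mini-batches."""
    m = X.shape[1]
    rng = np.random.default_rng(seed)
    perm = rng.permutation(m)            # random order of the m examples
    X_shuf, Y_shuf = X[:, perm], Y[:, perm]
    return [(X_shuf[:, k:k + mini_batch_size], Y_shuf[:, k:k + mini_batch_size])
            for k in range(0, m, mini_batch_size)]

# With mini_batch_size = 1 this is the usual "stochastic" setup:
# each epoch still visits every example exactly once, but in a fresh random order.
```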

I know this isn’t very convincing and I apologize. Let me check with other mentors for a better response.

Thanks

I believe it is true that it is normal practice to randomly shuffle the order of the samples before each epoch when doing any flavor of minibatch GD (SGD or not). But the real point is that the stochastic behavior does not arise from randomly choosing the sample on a given iteration: it arises because the samples themselves come from a statistical distribution.

When the minibatch size is > 1, you average the gradients over the samples in each minibatch, so there is some statistical smoothing from that averaging. With batch size 1, the gradients can jump all over the place, because you get no smoothing from averaging across more than one sample; every sample is different. The point of minibatch gradient descent is that you update the weights after every minibatch, so in the case of SGD the updates to the weights are more stochastic because you lose the averaging effect.
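A tiny numerical experiment makes that smoothing effect visible. The toy model, data, and numbers below are just my own illustration, not anything from the lectures:

```python
import numpy as np

# Toy 1-D linear model y ~ w * x with squared-error loss.
rng = np.random.default_rng(0)
m = 10_000
x = rng.normal(size=m)
y = 3.0 * x + rng.normal(scale=0.5, size=m)   # "true" w = 3, plus noise
w = 0.0                                        # current (bad) parameter value

per_example_grads = 2 * x * (w * x - y)        # dL/dw for each example separately

# Gradient estimates with mini-batch size 1 vs. 64:
sgd_grads = per_example_grads                                  # size-1 "mini-batches"
mb_grads = per_example_grads[: (m // 64) * 64].reshape(-1, 64).mean(axis=1)

print("std of size-1 gradient estimates :", sgd_grads.std())
print("std of size-64 gradient estimates:", mb_grads.std())
# The size-64 averages are much less noisy (std shrinks roughly by sqrt(64)),
# which is the smoothing effect of averaging over a mini-batch; with size 1
# each update sees the full sample-to-sample variation.
```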


Thanks, Paul, for the informative reply.