Week 2 final lab

Why is it always batch number 157 that gets executed, and not the other batches? Shouldn't 40 out of the 157 batches be randomly selected during the 40 runs?

Hi @flyunicorn,

Each epoch completes after all 157 batches have been processed. The number displayed corresponds to the final batch.
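If this is the usual Keras-style progress counter (an assumption on my part), it runs 1/157, 2/157, ..., 157/157 within each epoch, so the last number you see is simply the total batch count. That count is the training-set size divided by the batch size, rounded up. A minimal sketch of that arithmetic; the dataset size and batch size below are illustrative, not values taken from the lab:

```python
import math

num_examples = 5000   # assumed training-set size, not from the lab
batch_size = 32       # assumed batch size

# One epoch visits every batch once, so the progress counter
# always ends at the total number of batches per epoch.
batches_per_epoch = math.ceil(num_examples / batch_size)
print(batches_per_epoch)  # 157 with these assumed numbers
```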


Then what is the benefit of splitting the set into 157 parts if all examples will be used in one epoch anyway?

In full-batch gradient descent, the model processes the entire training set before updating the weights, resulting in accurate but slow updates and high memory usage. In contrast, mini-batch gradient descent splits the data into smaller batches (157 in this lab), allowing the model to update weights more frequently (once per batch) using less memory and making faster progress. Each mini-batch gives a slightly noisy estimate of the true gradient, which can act as a form of regularization.
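Here is a minimal NumPy sketch of the contrast for plain linear regression; every shape, learning rate, and variable name below is illustrative and not taken from the lab. With these assumed shapes the inner loop performs 157 weight updates per epoch, which is what the progress counter reflects:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))                 # toy data, not the lab's dataset
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=5000)

w = np.zeros(10)
lr, batch_size = 0.01, 32
n_batches = int(np.ceil(len(X) / batch_size))   # 157 with these assumed sizes

for epoch in range(3):
    idx = rng.permutation(len(X))               # reshuffle examples each epoch
    for b in range(n_batches):                  # one weight update per mini-batch
        batch = idx[b * batch_size:(b + 1) * batch_size]
        Xb, yb = X[batch], y[batch]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(Xb)   # noisy gradient estimate
        w -= lr * grad

# Full-batch gradient descent would instead compute one gradient over all
# 5000 rows and update w only once per epoch.
```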