[Course 5 - Week 2 - Assignment 2] Why not "shuffle" the training data in SGD?

In Week 2's assignment 2, the model doesn't shuffle the data during training:

def model(X, Y, word_to_vec_map, learning_rate = 0.01, num_iterations = 400):
    """Model to train word vector representations in numpy."""

Is it good to have the training data shuffled for each iteration of SGD?

Also, in Week 1's assignment #2 (Dinosaur Island - Character-Level Language Modeling), the model does contain code for shuffling the dataset, but this is done once, before entering the optimization loop:

# Shuffle list of all dinosaur names # [NOTE]
  # Optimization loop
  for j in range(num_iterations):

Does it have any effect in this case?
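For concreteness, the two placements being compared can be sketched like this (a toy example with my own variable names, not code from the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(10)  # toy "dataset" of 10 examples

# Option A (Dinosaur Island style): shuffle once, before the loop.
order_once = rng.permutation(len(X))
for epoch in range(3):
    epoch_order = order_once  # identical example order every epoch
    # ... take mini-batches from X[epoch_order] ...

# Option B: reshuffle at the start of every epoch.
for epoch in range(3):
    epoch_order = rng.permutation(len(X))  # a fresh order each epoch
    # ... mini-batches now differ from epoch to epoch ...
```

Option A still removes any ordering that happened to exist in the raw data file; Option B additionally changes which examples get grouped into the same mini-batch from epoch to epoch.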


In short, I think it is hard to say whether we have to keep shuffling. Let's think it through: if we reshuffle every epoch, the mini-batches differ from one epoch to the next; otherwise, every epoch sees the same set of mini-batches. One of those two sequences of mini-batches might lead us more quickly toward a better set of parameters, but we have no way of knowing in advance which one. That is why I say it is hard to tell.

Moreover, with or without shuffling, each epoch already contains many mini-batches, so their combined effect is even harder to predict.

Therefore, if we really want to know, we just have to try it out ourselves. One run might show a difference, and the next might not.
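If you do want to try it, a tiny self-contained experiment along these lines is enough (this is my own toy least-squares setup, not the assignment's model, so the numbers say nothing definitive):

```python
import numpy as np

def sgd_linear(reshuffle, seed=0, epochs=20, lr=0.1, batch=4):
    """Toy mini-batch SGD on a 1-D least-squares problem, used only to
    compare shuffle-once vs. reshuffle-every-epoch."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(32, 1))
    Y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=32)  # true slope = 3.0
    w = 0.0
    order = rng.permutation(len(X))  # shuffled once up front
    for _ in range(epochs):
        if reshuffle:
            order = rng.permutation(len(X))  # fresh mini-batches each epoch
        for start in range(0, len(X), batch):
            idx = order[start:start + batch]
            grad = np.mean((w * X[idx, 0] - Y[idx]) * X[idx, 0])
            w -= lr * grad
    return w

# Both settings should land near the true slope (3.0); which one is
# marginally closer on a given run says little in general.
print(sgd_linear(reshuffle=False), sgd_linear(reshuffle=True))
```

On a problem this simple both variants converge, which is exactly the point: you only see a meaningful difference (if any) by running the comparison on your actual model and data.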