C3W3_Assignment: clarification required on NUM_BATCHES

Hello,

Can you clarify if the assignment is expecting to get

  • NUM_BATCHES batches out of each of the train and validation datasets, or
  • train and validation datasets each with a batch size of NUM_BATCHES?

If it is the first option, then how do you expect to get 1125 and 125 as len(train_dataset) and len(validation_dataset)? Shouldn’t they be 128 and 128 batches respectively?

If it is the second option, then the requirement would be clearer if it were reworded to ask for batches of size BATCH_SIZE. Currently it is worded as:

Turn the dataset into a batched dataset with num_batches batches
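
For what it’s worth, here is a minimal sketch of how the two readings differ, using a toy tf.data.Dataset rather than the assignment’s actual data (the toy size of 20 and NUM_BATCHES = 4 are illustrative stand-ins only):

```python
import tensorflow as tf

NUM_BATCHES = 4  # toy stand-in for the assignment's 128

toy = tf.data.Dataset.from_tensor_slices(tf.range(20))

# Reading 1: NUM_BATCHES is the number of batches to produce,
# so the batch size has to be derived from the dataset length.
as_num_batches = toy.batch(len(toy) // NUM_BATCHES)
print(len(as_num_batches))  # 4 batches of 5 elements each

# Reading 2: NUM_BATCHES is the batch size.
as_batch_size = toy.batch(NUM_BATCHES)
print(len(as_batch_size))   # 5 batches of 4 elements each
```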

From the markdown, it’s clear that the dataset for this assignment contains 160_000 data points. With a train/test split of 0.9, the train split will contain 160_000 * 0.9 = 144_000 data points and the test split will contain 160_000 * 0.1 = 16_000 data points.

With NUM_BATCHES = 128, each batch contains 128 data points. That puts the number of batches in the train split at 144_000 / 128 = 1125 and in the test split at 16_000 / 128 = 125.
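
In tf.data terms, a quick sketch reproducing that arithmetic (the integer placeholder data and the take/skip split below are just for illustration, not the assignment’s actual loading code):

```python
import tensorflow as tf

NUM_BATCHES = 128  # used here as the batch size, per what the tests expect
full = tf.data.Dataset.from_tensor_slices(tf.range(160_000))

train_size = int(0.9 * 160_000)               # 144_000
train_dataset = full.take(train_size).batch(NUM_BATCHES)
validation_dataset = full.skip(train_size).batch(NUM_BATCHES)

print(len(train_dataset))       # 1125
print(len(validation_dataset))  # 125
```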


It’s this one: the total number of batches created out of the datasets!


Thanks for the responses, @balaji.ambresh and @gent.spah.

Throughout the lab exercises, the variable BATCH_SIZE has consistently been used to set the batch size.

Therefore, in the assignment, the variable NUM_BATCHES together with the instruction “Turn the dataset into a batched dataset with num_batches batches” makes the direction clear: it is not about the batch size, but about creating that number of batches.

It appears that the unit test implementation is inconsistent with that requirement, so the submitted implementation is forced to follow the unit test’s interpretation through the whole assignment.

It’s up to the DLAI team to decide whether they want to update the wording and clear up the confusion.

Thanks for bringing this up.
The staff have been asked to rename NUM_BATCHES to BATCH_SIZE.

I guess the ideal statement could be:

Convert the dataset into batches using NUM_BATCHES