I was doing the graded assignment for week #1 (Programming Assignment: Deep N-grams) when I stumbled upon a rather glaring omission by the instructors on the topic of how datasets handle batching operations.
Concretely, while implementing the create_batch_dataset function, it was very hard for me to unearth an important detail about batching. In this function we are asked to batch our input sequences twice: 1) to make sequences of words and 2) to make batches of those sequences. Between these two operations, we also convert each sequence (after the 1st batching) into a tuple ([:-1], [1:]). Doing this changes the shape of each entry of the dataset from [ (17,), (17,), … ] to [ ((16,), (16,)), ((16,), (16,)), … ].
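To make those shapes concrete, here is a minimal sketch of the two steps (the token ids and the sequence length of 17 are placeholder values I made up, not the assignment's actual data):

```python
import tensorflow as tf

# Placeholder token ids standing in for a tokenized corpus.
token_ids = tf.data.Dataset.from_tensor_slices(tf.range(34))

# 1st batching: group tokens into sequences of length 17.
sequences = token_ids.batch(17, drop_remainder=True)

# Convert each sequence into an (input, target) tuple via ([:-1], [1:]).
pairs = sequences.map(lambda seq: (seq[:-1], seq[1:]))

# Each element is now a tuple of two length-16 tensors.
print(pairs.element_spec)
```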
Now, intuitively anyone would think that when the next batching is done, it will just collect these tuples into groups of batch size, and that's how the dataset generator pattern would work.
But that isn't so. During the second batching operation, TensorFlow unpacks these tuples, collecting all the first elements together and all the second elements together. This means the shape isn't what you would expect.
To take an example, let's say our second batch size is 3. Then you would hope that running .take(1) (get 1 batch) yields something like a sequence of 3 tuples: [ ((16,), (16,)), ((16,), (16,)), ((16,), (16,)) ]. But that is not the case. Confusingly enough, .take(1) returns a tuple of 2 tensors, where the first tensor accumulates all 3 first entries and the second tensor accumulates all 3 second entries, each with shape (3, 16).
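A small sketch of this behavior (again with made-up token ids; only the shapes matter):

```python
import tensorflow as tf

# Build 3 (input, target) pairs of length 16, as after the 1st batching + map.
pairs = (tf.data.Dataset.from_tensor_slices(tf.range(51))
         .batch(17, drop_remainder=True)
         .map(lambda s: (s[:-1], s[1:])))

# 2nd batching with batch size 3: tuple components are batched independently.
batched = pairs.batch(3, drop_remainder=True)

for inputs, targets in batched.take(1):
    # Not a list of 3 tuples, but one tuple of two (3, 16) tensors.
    print(inputs.shape, targets.shape)  # (3, 16) (3, 16)
```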
It took me several hours to figure this out, and I think the instructors should have explained it very clearly to avoid wasted effort here. Here's a ChatGPT explanation of this behavior.
====
You're absolutely correct in your observation that tf.data.Dataset.batch() does more than simply group elements together. When a dataset contains tuples, tf.data.Dataset.batch() treats each component of the tuple separately and batches them independently. This behavior is deliberate and built into TensorFlow's batch() function to support common machine learning use cases like batching input-target pairs.
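The general rule can be seen on a tiny hand-made dataset (these values are illustrative, not from the assignment):

```python
import tensorflow as tf

# A dataset of (input, target) pairs.
ds = tf.data.Dataset.from_tensor_slices(
    (tf.constant([1, 2, 3, 4]), tf.constant([10, 20, 30, 40])))

# batch() preserves the tuple structure and batches each component
# independently, yielding exactly the (x, y) batches that training
# loops such as Keras model.fit expect.
for x, y in ds.batch(2):
    print(x.numpy(), y.numpy())
# [1 2] [10 20]
# [3 4] [30 40]
```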