Is data generation the same as in the last assignment?

I want to ask: is everything we do to generate data in the assignment C3_W3_Assignment.ipynb the same as what we will do in the last assignment of the attention course, C4_W4_Assignment?

# trax allows us to use combinators to generate our data pipeline
data_pipeline = trax.data.Serial(
    # randomize the stream
    trax.data.Shuffle(),
    # tokenize the data
    trax.data.Tokenize(vocab_dir=VOCAB_DIR, vocab_file=VOCAB_FILE),
    # filter too long sequences
    trax.data.FilterByLength(max_length=2048),
    # bucket by length
    trax.data.BucketByLength(boundaries=[128, 256, 512, 1024],
                             batch_sizes=[16, 8, 4, 2, 1]),
    # add loss weights, but do not add them to the padding tokens (i.e. 0)
    trax.data.AddLossWeights(id_to_mask=0)
)

# apply the data pipeline to our train and eval sets
train_stream = data_pipeline(stream(train_data))
eval_stream = data_pipeline(stream(eval_data))
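By the way, the combinator style above is just function composition over Python generators, so it is easy to see why it can replace a hand-written generator. A minimal sketch of the idea (a toy `Serial` and toy transforms of my own, not trax's actual implementation):

```python
def Serial(*layers):
    """Compose generator transforms left-to-right, like trax.data.Serial does."""
    def pipeline(stream):
        for layer in layers:
            stream = layer(stream)
        return stream
    return pipeline

# Hypothetical toy transforms standing in for Shuffle/Tokenize/FilterByLength.
def double(stream):
    for x in stream:
        yield 2 * x

def keep_small(stream):
    for x in stream:
        if x < 10:
            yield x

toy_pipeline = Serial(double, keep_small)
print(list(toy_pipeline(iter([1, 4, 7]))))  # [2, 8]
```

Each "layer" takes a stream and returns a new stream, which is exactly how `data_pipeline(stream(train_data))` works above.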

So, can we delete that huge function data_generator(batch_size, x, y, pad, shuffle=False, verbose=False) from C3_W3_Assignment.ipynb if we use the code above instead, for example?

I think we can just delete it, but maybe it is still useful in some cases too.
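For comparison, the core of what data_generator produces by hand is the same thing the pipeline produces: padded batches plus loss weights that are 0 on padding tokens. A rough pure-Python sketch of that padding/weighting step (hypothetical helper, not the assignment's exact code):

```python
import numpy as np

def pad_batch(sequences, pad_id=0):
    """Pad a list of token-id lists to the batch's max length.

    Returns (batch, weights) where weights are 1.0 on real tokens and
    0.0 on padding, so padded positions do not contribute to the loss.
    """
    max_len = max(len(seq) for seq in sequences)
    batch = np.full((len(sequences), max_len), pad_id, dtype=np.int64)
    weights = np.zeros((len(sequences), max_len), dtype=np.float32)
    for i, seq in enumerate(sequences):
        batch[i, :len(seq)] = seq
        weights[i, :len(seq)] = 1.0
    return batch, weights

batch, weights = pad_batch([[5, 6, 7], [8, 9]])
print(batch.tolist())    # [[5, 6, 7], [8, 9, 0]]
print(weights.tolist())  # [[1.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
```

So keeping the manual generator around is mostly useful for understanding what `BucketByLength` and `AddLossWeights(id_to_mask=0)` do under the hood.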

This topic was mentioned here, so I'm adding you folks.


Happy mentoring.