I don't really understand how padded_batch() works

Legends123 · May 18, 2022, 1:26pm

Hi everyone,

Could someone please explain to me how padded_batch() works?

I’m currently at this point in the Week 2 workbook and I don’t really understand how padded_batch() works. The notebook says that it is used to “Batch and pad the datasets to the maximum length of the sequences”, but how does this work? How does the maximum length of the sentences equal this “64”, and what is the unit of this “64”?

I understood padding up until the last lesson, where padding adds zeros as part of tokenization, but I just don’t know what effect padded_batch() has. My guess is that it makes each sequence have 64 values?

Thank you!

balaji.ambresh · May 18, 2022, 2:20pm

Have you seen this link on padded_batch?

Here’s one way to inspect train_dataset:

it = iter(train_dataset)
# look at 5 batches
for i in range(5):
  x, y = next(it)
  print(x.shape, y.shape)
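
To make the shapes printed above easier to read, here is a small pure-Python sketch of what padded_batch(batch_size) does when no padded_shapes are given: it groups consecutive examples into batches and pads each batch with zeros up to the longest sequence in that batch. This is only a model of the behavior, not TensorFlow's implementation, and the function name, the pad_value parameter, and the token ids are made up for illustration. If the “64” in the notebook is the batch_size argument (an assumption, since the call itself isn't quoted in this thread), then it counts examples per batch, not values per sequence.

```python
# Pure-Python sketch of the default padded_batch(batch_size) behavior.
# Illustrative only: this is not TensorFlow code, and all names are hypothetical.

def padded_batch(sequences, batch_size, pad_value=0):
    """Group sequences into batches; pad each batch to its own longest sequence."""
    batches = []
    for start in range(0, len(sequences), batch_size):
        group = sequences[start:start + batch_size]
        longest = max(len(seq) for seq in group)
        batches.append([seq + [pad_value] * (longest - len(seq)) for seq in group])
    return batches

# Five tokenized "sentences" of different lengths (made-up token ids).
sequences = [[5, 8], [3, 1, 4, 1], [9], [2, 7, 1, 8, 2, 8], [6, 6, 6]]

batches = padded_batch(sequences, batch_size=2)
for batch in batches:
    # (examples in this batch, padded length of this batch)
    print((len(batch), len(batch[0])), batch)
# (2, 4) [[5, 8, 0, 0], [3, 1, 4, 1]]
# (2, 6) [[9, 0, 0, 0, 0, 0], [2, 7, 1, 8, 2, 8]]
# (1, 3) [[6, 6, 6]]
```

Notice that the first number in each shape is the batch size (except for the last, smaller batch), while the second number changes from batch to batch because each batch is padded only to its own longest sequence.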
