C3_W3_Lab_2_multiple_layer_LSTM

For the dataset that is downloaded in this lab, why do we use padded_batch rather than pad_sequences?

When using padded_batch, the sequences are only padded to the maximum length within each batch (not the maximum length of the whole dataset), which means individual batches can end up with different sequence lengths.
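Here is a minimal sketch (with toy sequences rather than the lab's dataset) showing that each batch is padded only to its own longest sequence:

```python
import tensorflow as tf

# Toy variable-length sequences and labels, just to illustrate the padding behaviour.
sequences = [[1, 2], [3, 4, 5], [6], [7, 8, 9, 10, 11]]
labels = [0, 1, 0, 1]

dataset = tf.data.Dataset.from_generator(
    lambda: zip(sequences, labels),
    output_signature=(
        tf.TensorSpec(shape=(None,), dtype=tf.int32),
        tf.TensorSpec(shape=(), dtype=tf.int32),
    ),
)

# padded_batch pads each batch of 2 to that batch's maximum length only.
for batch_seqs, batch_labels in dataset.padded_batch(2):
    print(batch_seqs.shape)  # (2, 3) for the first batch, (2, 5) for the second
```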

When using a Bidirectional LSTM for text classification, doesn't it matter how long each individual sequence is? i.e. isn't the number of iterations of the LSTM cell something that's important?

When using pad_sequences, you waste computation on short sentences: every sentence is padded to the same global maximum length, and the LSTM spends the extra timesteps on padding, where it quickly figures out the trivial EOS -> EOS (padding -> padding) token mapping from the current timestep to the next.
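For comparison, here is a minimal sketch of pad_sequences padding every sentence to the length of the longest one in the whole list (the same toy sequences as above):

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

sequences = [[1, 2], [3, 4, 5], [6], [7, 8, 9, 10, 11]]

# Every row is padded to the global maximum length of 5.
padded = pad_sequences(sequences, padding='post')
print(padded.shape)  # (4, 5)
print(padded)
```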

To get around this problem, it is sufficient to run the LSTM only up to the length of the longest sentence in each batch, which is exactly what padded_batch gives you.

For example, if you have one sentence of length 30 while the rest of the dataset (say 10,000 sentences) has a maximum length of 6, training will be faster with padded_batch, since only the batch containing the long sentence is padded to 30.
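A rough back-of-the-envelope count for that scenario (batch size 32 and full batches are my own assumptions; the numbers are illustrative, not measured):

```python
batch_size = 32
num_short = 10_000      # sentences of length <= 6
longest_short = 6
outlier_len = 30        # the single long sentence

# pad_sequences: every sentence is padded to 30 timesteps.
timesteps_global = (num_short + 1) * outlier_len

# padded_batch: only the batch containing the outlier runs 30 timesteps;
# every other batch runs at most 6 (assuming full batches for simplicity).
num_batches = (num_short + 1 + batch_size - 1) // batch_size
timesteps_per_batch = (num_batches - 1) * batch_size * longest_short + batch_size * outlier_len

print(timesteps_global, timesteps_per_batch)  # roughly 300,030 vs 60,864 padded timesteps
```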