For the downloaded dataset, why do we use padded_batch rather than pad_sequences?
When using padded_batch, the sequences are only padded to the maximum length within each batch (not across the whole dataset), which means different batches can end up with different sequence lengths.
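A minimal sketch of that difference, using a hypothetical set of toy tokenized sequences (the data and batch size here are just for illustration):

```python
import tensorflow as tf

# Hypothetical tokenized sequences of varying lengths.
sequences = [[1, 2], [3, 4, 5], [6], [7, 8, 9, 10]]

dataset = tf.data.Dataset.from_generator(
    lambda: iter(sequences),
    output_signature=tf.TensorSpec(shape=(None,), dtype=tf.int32),
)

# padded_batch pads each batch only to that batch's longest sequence,
# so the batches can have different widths.
for batch in dataset.padded_batch(2):
    print(batch.shape)  # (2, 3) for the first batch, (2, 4) for the second

# pad_sequences, by contrast, pads everything to one global length up front.
padded = tf.keras.preprocessing.sequence.pad_sequences(sequences, padding="post")
print(padded.shape)  # (4, 4) -- every row padded to the overall maximum
```

So with padded_batch you avoid padding short sequences all the way out to the longest sequence in the entire dataset, which wastes less computation on padding tokens.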
When using a Bidirectional LSTM for text classification, doesn't the length of each individual sequence matter? i.e., isn't the number of time steps the LSTM cell runs for something that's important?