In this assignment, the input data x has shape (n_x, m, T_x), with n_x being the size of each input vector (e.g. the vocabulary size for one-hot encoded words), m being the batch size, and T_x being the number of timesteps.
Since T_x is a fixed number, do we assume that all inputs have the same number of timesteps (e.g. if the inputs are sentences, that all sentences have the same length)? If so, how do we perform forward propagation on inputs of different lengths?
Thanks in advance for your answer.
The exercise instructions tell you this:
- Tx will denote the number of timesteps in the longest sequence.
Shorter sentences are padded to the correct length.
Or, instead of padding shorter sentences, you can use a code value that indicates the end of the sentence.
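As a concrete illustration of the padding option, here is a minimal sketch, assuming each sentence has already been converted to a list of word indices. The helper name `pad_one_hot` and the boolean `mask` are my own additions for illustration, not part of the assignment; padded positions are simply left as all-zero vectors.

```python
import numpy as np

def pad_one_hot(sequences, n_x):
    """Pack variable-length index sequences into a one-hot array of shape (n_x, m, T_x).

    Hypothetical helper: positions past the end of a sequence stay all-zero.
    """
    m = len(sequences)
    T_x = max(len(s) for s in sequences)    # the longest sequence sets T_x
    x = np.zeros((n_x, m, T_x))
    mask = np.zeros((m, T_x), dtype=bool)   # True where a real token is present
    for i, seq in enumerate(sequences):
        for t, idx in enumerate(seq):
            x[idx, i, t] = 1.0
            mask[i, t] = True
    return x, mask

# Two "sentences" of different lengths over a 5-word vocabulary.
sequences = [[1, 2, 3], [4, 0]]
x, mask = pad_one_hot(sequences, n_x=5)
print(x.shape)           # (5, 2, 3)
print(mask.sum(axis=1))  # real tokens per example
```

The mask is there so that a later step (for example the loss) can ignore the padded timesteps if needed.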
I understand the idea of padding, but how does adding an indicator work? Suppose we add an encoded version of “\n” to the end of each sentence. The sentences still have different numbers of timesteps, so they cannot be stored in a single 3D array. I would appreciate a concrete example.
To use an end marker, you’d probably have to re-structure the code so the examples can be different lengths.
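One possible way to restructure it, as a rough sketch only: keep the examples in a plain Python list instead of a 3D array and run the forward pass on each one until the end marker is reached. The names `EOS` and `rnn_forward_one`, the choice of index 0 for the marker, and the basic tanh cell are all assumptions for illustration, not the assignment's code.

```python
import numpy as np

EOS = 0  # hypothetical index reserved for the end-of-sentence marker

def rnn_forward_one(seq, Wax, Waa, ba):
    """Run a basic tanh RNN over one index sequence, stopping after EOS."""
    n_a, n_x = Wax.shape
    a = np.zeros((n_a, 1))          # initial hidden state
    steps = 0
    for idx in seq:
        x_t = np.zeros((n_x, 1))    # one-hot input for this timestep
        x_t[idx] = 1.0
        a = np.tanh(Wax @ x_t + Waa @ a + ba)
        steps += 1
        if idx == EOS:              # the marker ends this example
            break
    return a, steps

rng = np.random.default_rng(0)
n_a, n_x = 4, 5
Wax = rng.standard_normal((n_a, n_x))
Waa = rng.standard_normal((n_a, n_a))
ba = np.zeros((n_a, 1))

# Examples of different lengths live in a list, not a 3D array.
sentences = [[1, 2, 3, EOS], [4, EOS]]
results = [rnn_forward_one(s, Wax, Waa, ba) for s in sentences]
print([steps for _, steps in results])
```

Each timestep is still a matrix operation, but the batch dimension m is replaced by a Python loop over examples, so this is typically slower than the padded version.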
Does that mean vectorization is no longer applicable? Also, I couldn’t find any blog posts or papers that address this problem. Could you point me to something I can read about it?
Sorry, I don’t have any references for this.
That’s okay, thank you for your time!