In this assignment, the input data x has shape (n_x, m, T_x), where n_x is the size of each input vector (the vocabulary size for one-hot inputs), m is the batch size, and T_x is the number of timesteps.

Since T_x is a fixed number, do we assume that all inputs have the same number of timesteps (e.g. if the inputs are sentences, that all sentences have the same length)? If so, how do we perform forward propagation on inputs of different lengths?

I understand the idea of padding, but how does adding an indicator work? Even if we append an encoded version of “\n” to the end of each sentence, the sentences still have different numbers of timesteps and cannot be stored in a single 3D array. I would appreciate a concrete example.
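To make the question concrete, here is a minimal NumPy sketch of how I currently picture padding combined with a mask. It is not taken from the assignment: the toy token ids, the vocabulary size of 6, the all-zero vectors used for padded steps, and the random per-step losses are all assumptions for illustration.

```python
import numpy as np

# Hypothetical toy data: three sequences of token ids with different lengths.
sequences = [[1, 2, 3], [4, 5], [1]]

n_x = 6                                  # assumed vocabulary size (one-hot dimension)
m = len(sequences)                       # batch size
T_x = max(len(s) for s in sequences)     # pad every sequence to the longest one

x = np.zeros((n_x, m, T_x))              # padded steps stay as all-zero vectors
mask = np.zeros((m, T_x))                # 1 for real timesteps, 0 for padding

for i, seq in enumerate(sequences):
    for t, token in enumerate(seq):
        x[token, i, t] = 1.0             # one-hot encode the real token
        mask[i, t] = 1.0                 # mark this timestep as real

# Stand-in for per-timestep losses from a vectorized forward pass (random here).
rng = np.random.default_rng(0)
step_loss = rng.random((m, T_x))

# Average only over real timesteps, so padded positions do not affect the loss.
masked_loss = (step_loss * mask).sum() / mask.sum()
```

If this picture is right, the batch still fits in one (n_x, m, T_x) array and the forward pass stays vectorized, with the mask zeroing out the contribution of the padded steps. What I still don't see is how an end-of-sentence indicator fits into this.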

Does that mean vectorization is no longer applicable? Also, I couldn’t find any blog posts or papers that address this problem. Could you please point me to something I can read?