C5W1A1: training sequences with different lengths

Hello. At the start of the A1 notebook assignment, it says:
In this notebook, Tx will denote the number of timesteps in the longest sequence.
So if, for example, my training dataset contains 2 sequences, the first with Tx = 5 and the second with Tx = 10, then:
The 3-dimensional tensor x of shape (nx, m=2, Tx=10) represents the input x that is fed into the RNN.

But for the first sequence, when slicing along the m axis, there are only 5 one-hot vectors; the other 5 positions are filled with zero vectors (padding). Is that correct?
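
Here is a minimal numpy sketch of what I mean. The vocabulary size nx and the token indices are made up for illustration:

```python
import numpy as np

n_x = 4                  # hypothetical vocabulary size
T_x = 10                 # length of the longest sequence
seq_lengths = [5, 10]
m = len(seq_lengths)

# Hypothetical sequences: lists of token indices
sequences = [np.random.randint(0, n_x, size=L) for L in seq_lengths]

# Build the input tensor, zero-padded along the time axis
x = np.zeros((n_x, m, T_x))
for i, seq in enumerate(sequences):
    for t, token in enumerate(seq):
        x[token, i, t] = 1.0   # one-hot encode each real timestep

# The first sequence has one-hot vectors only at t = 0..4;
# t = 5..9 stay all-zero (padding).
print(x[:, 0, :].sum(axis=0))  # -> [1. 1. 1. 1. 1. 0. 0. 0. 0. 0.]
```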

Thanks in advance.


Yes, I think that must be what is happening. It’s been a couple of years since I watched the lectures, but I’m trying to go through and see if Prof Ng specifically addresses this situation. I have not yet found it, but I think it must work the way you describe. What we can see is that things are specifically set up to vectorize forward propagation across the “samples” dimension. In the logic in the C5W1A1 assignment, I don’t see anything checking for all zeros in the x^{<t>} entries, so it must just run the computations and let the chips fall where they may.
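
To make the "let the chips fall where they may" point concrete, here is a minimal numpy sketch of a single RNN cell step on an all-zero padded input. The parameter names follow the assignment's rnn_cell_forward conventions (Wax, Waa, ba), but the values here are random and purely illustrative:

```python
import numpy as np

np.random.seed(0)
n_x, n_a, m = 4, 3, 2

# Hypothetical parameters with the shapes used in rnn_cell_forward
Wax = np.random.randn(n_a, n_x)
Waa = np.random.randn(n_a, n_a)
ba = np.random.randn(n_a, 1)

a_prev = np.random.randn(n_a, m)
x_t = np.zeros((n_x, m))   # an all-zero (padded) timestep

# One forward step: nothing special-cases the zero input
a_next = np.tanh(Waa @ a_prev + Wax @ x_t + ba)

# Even with x_t = 0, a_next differs from a_prev: the Waa term and the
# bias still apply, so padded timesteps do update the hidden state.
print(a_next)
```

So padded positions are not skipped; the recurrence simply runs through them with a zero input.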

I will continue watching the lectures in Week 1 and let you know if I find anything more concrete about how it works in this scenario.

Thank you very much.