[Assignment: Jazz Improvisation with LSTM] Why should Y get reordered?

The assignment states: "Notice that the data in Y is reordered to be dimension (Ty, m, 90), where Ty = Tx. This format makes it more convenient to feed into the LSTM later." How does reordering Y into a different shape make training more convenient, and is the transformation necessary? From my understanding, we need to calculate the cost for each training sample in each batch, which essentially means computing something like (yhat - y). During training, input X gets passed into LSTM_Cell, and the output should have the same shape as X, except that it is shifted left by one time step. If the output from the LSTM (yhat) has the same shape as input X, shouldn’t we keep the training Y in the same shape as X too? Thank you!

Hi @SumSum

It comes from how the djmodel function is implemented. In Step 2, we manually loop over the sequence length and append the out of each iteration to the list of outputs for the whole sequence.
In each iteration, we slice input X and take a single time step, so the input to each iteration has shape (m, 90). As you said, the per-step output has the same shape as the per-step input, so out in each iteration has shape (m, 90) as well. In Step 2.E, we append out to outputs, so after the loop outputs holds Tx entries of shape (m, 90), i.e. an overall shape of (Tx, m, 90), with time as the first axis rather than the batch. That’s why the label data is reordered to shape (Ty, m, 90): entry t of Y then lines up directly with entry t of outputs.
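To make the shapes concrete, here is a minimal NumPy sketch of that loop. It is not the assignment's code: fake_cell is a hypothetical stand-in for the LSTM cell plus the softmax layer, the sizes m and Tx are made up, and I am assuming the reordering is a plain axis swap such as np.swapaxes.

```python
import numpy as np

# Hypothetical sizes for illustration; 90 is the number of note values.
m, Tx, n_values = 4, 30, 90
rng = np.random.default_rng(0)

X = rng.random((m, Tx, n_values))                   # inputs, batch-first: (m, Tx, 90)
labels = rng.integers(0, n_values, size=(m, Tx))
Y_batch_first = np.eye(n_values)[labels]            # one-hot labels: (m, Tx, 90)


def fake_cell(x_t):
    """Stand-in for LSTM cell + softmax layer: maps (m, 90) -> (m, 90)."""
    e = np.exp(x_t - x_t.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


outputs = []
for t in range(Tx):
    x_t = X[:, t, :]          # one time step for the whole batch: (m, 90)
    out = fake_cell(x_t)      # per-step prediction: (m, 90)
    outputs.append(out)       # list grows along the time axis

outputs_arr = np.stack(outputs)           # (Tx, m, 90): time first
Y = np.swapaxes(Y_batch_first, 0, 1)      # reorder labels to (Ty, m, 90)

# With time first on both sides, step t of the labels pairs with step t
# of the outputs, and the per-step cross-entropy is a simple zip.
step_losses = [
    -np.mean(np.sum(y_t * np.log(out_t), axis=-1))
    for y_t, out_t in zip(Y, outputs)
]
print(outputs_arr.shape, Y.shape, len(step_losses))
```

So the reordering does not change what is being compared. It only puts Y in the same time-first layout that the loop naturally produces, which lets each per-step output be matched with its label without any extra slicing.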
