We use 32 as the number of hidden units (n_a) for the pre-attention LSTM. Shouldn't this be Tx? Since the pre-attention LSTM is basically a decoder, shouldn't the number of units in it be the same as Tx?
Please note that the number of hidden units used in the Bi-LSTM layer doesn't matter. What counts is the 1st dimension after the batch dimension, which is Tx: the layer emits one hidden state per input timestep, so the attention mechanism still sees Tx positions regardless of n_a. With this reply in mind, kindly read the markdown again.
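To make the shape argument concrete, here is a minimal numpy sketch of a bidirectional LSTM forward pass (not the assignment's Keras code; the values m=4, Tx=30, n_features=37, n_a=32 are illustrative). The output has shape (m, Tx, 2*n_a): the time dimension is Tx no matter what n_a is, which is why the two numbers are independent.

```python
import numpy as np

def lstm_layer(x, n_a, seed):
    """Run a plain LSTM over x of shape (m, Tx, n_features);
    return all hidden states, shape (m, Tx, n_a)."""
    rng = np.random.default_rng(seed)
    m, Tx, nf = x.shape
    Wx = rng.standard_normal((nf, 4 * n_a)) * 0.1   # input weights for i, f, o, g gates
    Wh = rng.standard_normal((n_a, 4 * n_a)) * 0.1  # recurrent weights
    b = np.zeros(4 * n_a)
    h = np.zeros((m, n_a))
    c = np.zeros((m, n_a))
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    outs = []
    for t in range(Tx):
        z = x[:, t] @ Wx + h @ Wh + b
        i, f, o, g = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outs.append(h)
    return np.stack(outs, axis=1)  # (m, Tx, n_a): one state per timestep

def bi_lstm(x, n_a):
    """Concatenate forward and backward passes along the feature axis."""
    fwd = lstm_layer(x, n_a, seed=0)
    bwd = lstm_layer(x[:, ::-1], n_a, seed=1)[:, ::-1]  # reverse time, run, reverse back
    return np.concatenate([fwd, bwd], axis=-1)  # (m, Tx, 2*n_a)

m, Tx, n_features, n_a = 4, 30, 37, 32
x = np.random.default_rng(2).standard_normal((m, Tx, n_features))
a = bi_lstm(x, n_a)
print(a.shape)  # (4, 30, 64): axis 1 is Tx; axis 2 is 2*n_a, independent of Tx
```

Changing n_a only widens the last axis; the attention weights are still computed over the Tx positions in axis 1.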