Number of LSTM units in Trax

For the sake of completeness, here are my own calculations checking the inner workings of this week's C3_W3 assignment. Maybe someone will find them useful.

An example of the calculations:


You can compare the values for the word “of” at step t=1 and step t=17. Note that the inputs (the embeddings) are the same, but because step t=17 receives the states c_16 and h_16 from the preceding step, the output is different.
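To make this concrete, here is a minimal sketch of a single LSTM step in plain Python. It is not the Trax implementation: the gate ordering (input, forget, output, candidate), the weight layout, the tiny sizes, and all numeric values (the "embedding" for “of” and the stand-ins for h_16 and c_16) are assumptions made up for illustration. It only shows that the same input combined with different incoming states gives a different output.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step for a single example.

    W has 4 * n_units rows, each of length len(x) + len(h_prev).
    Assumed gate order in W and b: input, forget, output, candidate.
    """
    n = len(h_prev)
    z = x + h_prev  # concatenation [x_t, h_{t-1}]
    pre = [sum(w * v for w, v in zip(row, z)) + b_k for row, b_k in zip(W, b)]
    i = [sigmoid(p) for p in pre[0:n]]            # input gate
    f = [sigmoid(p) for p in pre[n:2 * n]]        # forget gate
    o = [sigmoid(p) for p in pre[2 * n:3 * n]]    # output gate
    g = [math.tanh(p) for p in pre[3 * n:4 * n]]  # candidate cell values
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

random.seed(0)
n_in, n_units = 3, 2  # toy sizes, far smaller than in the assignment
W = [[random.uniform(-1, 1) for _ in range(n_in + n_units)]
     for _ in range(4 * n_units)]
b = [0.0] * (4 * n_units)

x_of = [0.5, -0.2, 0.1]  # made-up embedding for the word "of"

# Step t=1: the incoming states are the initial (zero) states.
h_a, c_a = lstm_step(x_of, [0.0] * n_units, [0.0] * n_units, W, b)

# Step t=17: same embedding, but made-up stand-ins for h_16 and c_16.
h_b, c_b = lstm_step(x_of, [0.3, -0.4], [0.7, -0.1], W, b)

print("t=1 :", h_a)
print("t=17:", h_b)
```

Both outputs have length n_units, so the number of LSTM units sets the size of h_t and c_t at every step.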

The example output of LSTM for the first sentence:

The output of the model:
