For the sake of completeness, I can share my own calculations to check the inner workings of this week's C3_W3 assignment. Maybe someone will find it useful.
-
The example of a batch:
-
The example of the Embedding weights:
-
The first sentence embedded example:
Note that the same words have the same embeddings (highlighted in blue and orange).
-
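The embedding step is just a row lookup, which is why repeated words get identical vectors. Here is a minimal NumPy sketch of that lookup; the vocabulary size, embedding dimension, and token ids are made up, not the assignment's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_dim = 10, 4
embedding = rng.standard_normal((vocab_size, emb_dim))  # the Embedding weight matrix

# Suppose the token id for "of" is 3 and it appears twice in the sentence.
sentence_ids = np.array([5, 3, 7, 3])
embedded = embedding[sentence_ids]  # shape: (seq_len, emb_dim)

# Identical ids select identical rows, so repeated words share one embedding.
assert np.array_equal(embedded[1], embedded[3])
```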
The example of LSTM input weights for first layer W_ih_l0:
-
The example of LSTM hidden state weights for the first layer W_hh_l0:
-
The example of LSTM biases (for both input and hidden state):
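One thing worth knowing when reading these weight screenshots: PyTorch stacks the four LSTM gates into a single matrix, in the order input (i), forget (f), cell (g), output (o). This sketch (with made-up sizes) shows how to slice `W_ih_l0` back into per-gate blocks:

```python
import numpy as np

rng = np.random.default_rng(1)
emb_dim, hidden = 4, 3
W_ih_l0 = rng.standard_normal((4 * hidden, emb_dim))  # like lstm.weight_ih_l0

# Split the stacked matrix into one (hidden, emb_dim) block per gate.
W_i, W_f, W_g, W_o = np.split(W_ih_l0, 4, axis=0)
assert W_i.shape == (hidden, emb_dim)
```

The same slicing applies to `W_hh_l0` (with shape `(4 * hidden, hidden)`) and to both bias vectors (with shape `(4 * hidden,)`).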
The example of calculations:
-
t = 0 (“Thousands”)
-
t = 1 (“of”)
-
t = 2 (“demonstrators”)
-
t = 17 (note the jump to the 18th step, the word “of” again)
Note:
You can compare the values for the word “of” at step t=1 and step t=17. Note that the inputs (the embeddings) are the same, but because of the different cell and hidden states c_16 and h_16, the output is different.
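The per-step calculations above can be sketched as a single manual LSTM step in NumPy, using PyTorch's gate order (i, f, g, o). The weights and states below are random placeholders, not the assignment's; the point is only that the same input with different carried-over states produces a different output:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W_ih, W_hh, b_ih, b_hh):
    """One LSTM time step, gates stacked in PyTorch's (i, f, g, o) order."""
    gates = W_ih @ x + b_ih + W_hh @ h + b_hh
    i, f, g, o = np.split(gates, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c_new = f * c + i * g          # new cell state
    h_new = o * np.tanh(c_new)     # new hidden state (the step's output)
    return h_new, c_new

rng = np.random.default_rng(2)
emb_dim, hidden = 4, 3
W_ih = rng.standard_normal((4 * hidden, emb_dim))
W_hh = rng.standard_normal((4 * hidden, hidden))
b_ih = rng.standard_normal(4 * hidden)
b_hh = rng.standard_normal(4 * hidden)

x_of = rng.standard_normal(emb_dim)  # the embedding of "of" (identical at t=1 and t=17)
h0, c0 = np.zeros(hidden), np.zeros(hidden)                          # states before t=1
h16, c16 = rng.standard_normal(hidden), rng.standard_normal(hidden)  # states before t=17

out_t1, _ = lstm_step(x_of, h0, c0, W_ih, W_hh, b_ih, b_hh)
out_t17, _ = lstm_step(x_of, h16, c16, W_ih, W_hh, b_ih, b_hh)

# Same input, different carried-over states -> different outputs.
assert not np.allclose(out_t1, out_t17)
```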
The example output of LSTM for the first sentence:
-
The example of Linear (Dense) layer weights (W and b):
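The Linear (Dense) layer then maps each LSTM output vector to one score per tag, i.e. logits_t = W @ h_t + b applied at every time step. A small sketch with placeholder sizes (not the assignment's actual hidden size or tag count):

```python
import numpy as np

rng = np.random.default_rng(3)
hidden, n_tags, seq_len = 3, 5, 4
W = rng.standard_normal((n_tags, hidden))  # like linear.weight
b = rng.standard_normal(n_tags)            # like linear.bias
lstm_out = rng.standard_normal((seq_len, hidden))  # one hidden state per step

# Apply the same affine map to every time step at once.
logits = lstm_out @ W.T + b  # shape: (seq_len, n_tags)
assert logits.shape == (seq_len, n_tags)
```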