Small typos in W4 assignment


I found the following typos in the “Programming Assignment: Transformers Architecture with TensorFlow”:

In the section describing sequence lengths, the final example suddenly has 4 truncated sequences instead of 3:

which might get vectorized as:
[[ 71, 121, 4, 56, 99, 2344, 345, 1284, 15],
[ 56, 1285, 15, 181, 545],
[ 87, 600]]
When passing sequences into a transformer model, it is important that they are of uniform length. You can achieve this by padding the sequence with zeros, and truncating sentences that exceed the maximum length of your model:
[[ 71, 121, 4, 56, 99],
[ 2344, 345, 1284, 15, 0],
[ 56, 1285, 15, 181, 545],
[ 87, 600, 0, 0, 0]]
Sequences longer than the maximum length of five will be truncated
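For reference, plain truncation and zero-padding as the quoted text describes it would give three rows, which is why the four-row example surprised me. A minimal pure-Python sketch (in TensorFlow, `tf.keras.utils.pad_sequences` with `padding='post'` and `truncating='post'` behaves the same way):

```python
def pad_or_truncate(sequences, maxlen=5):
    """Truncate each sequence to `maxlen` tokens, then zero-pad short ones."""
    return [seq[:maxlen] + [0] * (maxlen - len(seq[:maxlen]))
            for seq in sequences]

vectorized = [
    [71, 121, 4, 56, 99, 2344, 345, 1284, 15],
    [56, 1285, 15, 181, 545],
    [87, 600],
]
print(pad_or_truncate(vectorized))
# → [[71, 121, 4, 56, 99], [56, 1285, 15, 181, 545], [87, 600, 0, 0, 0]]
```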

There is an extra "in" in the sentence quoted below:

5 - Decoder
The Decoder layer takes the K and V matrices generated by the Encoder and in computes the second multi-head attention layer with the Q matrix from the output (Figure 3a).

Finally, in the code comment, there is an extra "is":

class Decoder(tf.keras.layers.Layer):
The entire Encoder is starts by passing the target input to an embedding layer
and using positional encoding to then pass the output through a stack of

Hi @Izak_van_Zyl_Marais,

Thanks for letting us know. We’ll have these fixed soon.


Hi @Izak_van_Zyl_Marais,

I have fixed two of the typos you pointed out. Thanks again!

As for the first,

The maximum sequence length is set to 5, so when the first sentence is processed, it stops after the first 5 words and the remainder becomes a new sequence. That's why the first example is broken into two sequences, giving four rows in total. This is also mentioned in the text you shared above.


Thanks for the feedback.