The first sentence has 10 words, but the vectorisation has only 9 numbers. Looking at the next rows, ‘15’ is the token for ‘Africa’, but doesn’t that mean a word is missing from this vector? Sorry if that’s a silly question.
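To make the question concrete, here’s a toy sketch of the behaviour I think I’m seeing (the vocabulary and ids are made up, just to illustrate): a vectoriser that silently skips words it doesn’t recognise would produce exactly this kind of length mismatch.

```python
# Toy illustration only: made-up vocabulary and ids, not the article's.
vocab = {"jane": 3, "visits": 7, "africa": 15, "in": 4, "september": 9}

def vectorise(sentence):
    """Map each known word to its id; unknown words are silently dropped."""
    return [vocab[w] for w in sentence.lower().split() if w in vocab]

print(vectorise("Jane visits Africa in September"))
# [3, 7, 15, 4, 9]  -> 5 words, 5 numbers

print(vectorise("Jane quietly visits Africa in September"))
# [3, 7, 15, 4, 9]  -> 6 words, but still 5 numbers ("quietly" vanished)
```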
Oh, OK, thank you, that was an amazingly speedy response! Not sure which timezone you’re in, but I hope it’s not intruding on your out-of-work hours!
Would words missing from the dictionary normally be dropped like that in a Transformer, or can there be a dummy token for words that aren’t known, as there was in some earlier models, IIRC?
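Roughly the two behaviours I have in mind, as a sketch (again with made-up ids and a plain word-level vocabulary):

```python
# Sketch of the two behaviours, with made-up ids and a word-level vocab.
vocab = {"jane": 3, "visits": 7, "africa": 15}
UNK_ID = 0  # dummy token id standing in for any unknown word

def drop_unknown(words):
    # Behaviour 1: out-of-vocabulary words simply vanish from the output.
    return [vocab[w] for w in words if w in vocab]

def map_to_unk(words):
    # Behaviour 2: out-of-vocabulary words all become a shared <unk> token.
    return [vocab.get(w, UNK_ID) for w in words]

words = ["jane", "quietly", "visits", "africa"]
print(drop_unknown(words))  # [3, 7, 15]     -- "quietly" is gone
print(map_to_unk(words))    # [3, 0, 7, 15]  -- "quietly" -> <unk>
```

From what I remember, older sequence models used the second approach with an explicit `<unk>` token, whereas modern Transformer tokenizers (BPE/WordPiece) mostly sidestep the problem by splitting an unseen word into smaller subword pieces that are in the vocabulary.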