/notebooks/C4W3_Assignment.ipynb
From what I understand, sentinels are used as targets in the T5 decoder, where they act as placeholders.
Let's assume the vocab size is 32,000.
Let's assume the token IDs are: I = 2, love = 3, learning = 4, and = 5.
Example sentence: “I love machine learning and deep learning”
After random selection, the words “machine” and “deep” are chosen for masking, and the input to the encoder will be as follows:
“2 3 31998 4 5 31997 4”
The input to the decoder will be:
“31998 machine 31997 deep”
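To make the example concrete, here is a minimal sketch of how I picture the masking step. It only reproduces the toy numbers above: the function name `mask_words`, the tiny `vocab` dictionary, and the sentinel IDs 31998 and 31997 are my own assumptions for illustration, not the assignment's code.

```python
def mask_words(words, vocab, masked_positions, first_sentinel_id):
    """Replace each masked word with a sentinel ID, counting down from first_sentinel_id.

    Hypothetical helper for illustration only; not taken from the notebook.
    """
    encoder_input = []
    decoder_target = []
    sentinel_id = first_sentinel_id
    for pos, word in enumerate(words):
        if pos in masked_positions:
            # The encoder sees only the sentinel in place of the masked word.
            encoder_input.append(sentinel_id)
            # The decoder side pairs the sentinel with the word it stands for.
            decoder_target.extend([sentinel_id, word])
            sentinel_id -= 1
        else:
            encoder_input.append(vocab[word.lower()])
    return encoder_input, decoder_target


# Toy vocabulary from the example (assumed IDs).
vocab = {"i": 2, "love": 3, "learning": 4, "and": 5}
words = "I love machine learning and deep learning".split()

# "machine" is at position 2 and "deep" is at position 5.
enc, dec = mask_words(words, vocab, masked_positions={2, 5}, first_sentinel_id=31998)
print(enc)  # [2, 3, 31998, 4, 5, 31997, 4]
print(dec)  # [31998, 'machine', 31997, 'deep']
```

The masked words are left as strings on the decoder side because the example does not assign them IDs.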
But the vocab already has valid entries at positions 31998 and 31999, as we have already seen in the get_sentinels() method, which prints out:
The sentinel is <Z> and the decoded token is: Internațional
The sentinel is <Y> and the decoded token is: erwachsene
The sentinel is <X> and the decoded token is: Cushion
…which means index 31999 in the vocab is associated with the word “Internațional”.
Why are we associating a valid vocab index with a sentinel? How does this work out?
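As a sketch of how I read that output: the sentinels seem to be assigned by counting down from the end of the vocabulary, so that <Z> lands on the last index, <Y> on the one before it, and so on. The helper `sentinel_ids` below is hypothetical and rests on that assumption plus the assumed vocab size of 32,000; it is not the notebook's get_sentinels() implementation.

```python
import string


def sentinel_ids(vocab_size, count=3):
    """Map sentinel labels to vocab indices, counting down from the last index.

    Hypothetical helper: assumes <Z> = vocab_size - 1, <Y> = vocab_size - 2, etc.
    """
    # Reversing ascii_letters gives 'Z', 'Y', 'X', ... first.
    letters = list(reversed(string.ascii_letters))[:count]
    return {f"<{ch}>": vocab_size - i for i, ch in enumerate(letters, start=1)}


ids = sentinel_ids(32000)
print(ids)  # {'<Z>': 31999, '<Y>': 31998, '<X>': 31997}
```

Under this assumption every sentinel shares its index with an existing vocab entry, which is exactly what confuses me.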