In the second video of the module “Transformer architecture”, it is described how the input is embedded, creating a first guess of the semantic meaning of the prompt, and also how the LLM is fed a position vector for the tokenised prompt. Since a token either occupies or does not occupy a given position in a sentence, I would expect it to be a binary vector. For example: “I ate pasta”. For “I”, I would expect [1, 0, 0] or something like that, since “I” is the first token in the sentence, not the 2nd nor the 3rd.
Instead, in the video it is shown to have decimals, and I do not understand the meaning of the decimals: either a token occupies a given position or it does not.
Perhaps your assumption is incorrect.
I can understand that, and my question is asking for an explanation, but yours is completely useless: why did you even bother to write it? Was it just to feel better? It does not add any value to this forum.
I was attempting to encourage you to explain why you believe the vector values should be 0’s or 1’s. That seemed to me to be the key to the question.
I tried to explain.
Positional Encoding in Transformers
The positional encoding is defined as:

For even dimensions:

PE(pos, 2i) = \sin\left(\frac{pos}{10000^{2i/d_{model}}}\right)

For odd dimensions:

PE(pos, 2i+1) = \cos\left(\frac{pos}{10000^{2i/d_{model}}}\right)
Explanation of Terms

- d_model: total number of dimensions in the model (e.g., 512).
- i: dimension index (e.g., the 5th dimension of the 512-dimensional vector).
- pos: position of the token in the input sequence (e.g., in "My name is Muzammil", the word "name" has position 1 if indexing starts from 0).
These encodings are added to the input embeddings to give the model a sense of token order without using recurrence.
The sine is used for the even indices and the cosine for the odd indices of a given token's positional vector, alternating across all of its dimensions.
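As a sanity check, here is a minimal NumPy sketch of the scheme described above (the function name and array shapes are my own choices, not from the video): each row is one position, even columns get the sine term and odd columns the cosine term.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: rows = positions, columns = dimensions."""
    pos = np.arange(seq_len)[:, np.newaxis]        # shape (seq_len, 1)
    two_i = np.arange(0, d_model, 2)[np.newaxis, :]  # even dimension indices 2i
    angle = pos / np.power(10000.0, two_i / d_model)  # angle for each (pos, 2i) pair
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)  # even dimensions use sine
    pe[:, 1::2] = np.cos(angle)  # odd dimensions use cosine
    return pe

pe = positional_encoding(4, 512)
print(pe[2, 2], pe[2, 3])  # values for position 2, dimensions 2 and 3
```

Note that every entry is a sine or cosine value in [-1, 1], which is exactly why the positional vectors contain decimals rather than 0's and 1's.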
Positional Encoding for Position = 2, Dimension = 2

Using the standard Transformer formulas:

For even dimensions:

PE(pos, 2i) = \sin\left(\frac{pos}{10000^{2i/d_{model}}}\right)

For odd dimensions:

PE(pos, 2i+1) = \cos\left(\frac{pos}{10000^{2i/d_{model}}}\right)

Computed Values

Given:
- ( pos = 2 )
- ( 2i = 2 ), i.e. ( i = 1 )
- ( d_model = 512 )

Angle rate:
\frac{2}{10000^{\frac{2}{512}}} = 1.9293232398223983

PE(pos=2, 2i=2):
\sin(1.9293) = 0.9364

PE(pos=2, 2i+1=3):
\cos(1.9293) = -0.3509
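These numbers can be checked in a couple of lines of Python using only the standard library:

```python
import math

# Angle for pos = 2, dimension 2i = 2, d_model = 512
angle = 2 / 10000 ** (2 / 512)

print(round(angle, 4))             # 1.9293
print(round(math.sin(angle), 4))   # 0.9364
print(round(math.cos(angle), 4))   # -0.3509
```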