Question for the vector representation

In Assignment 1, Q1, are we converting every token in the input to a 1024-dimensional vector?

Yes, you understand that correctly.

Thank you! So in the `tl.Embedding` layer, we tell the model the vocabulary size and the dimension, and it automatically generates the embeddings for us, right? Do you know what method is used to create the embeddings? Is it a CBOW embedding?

I explained a simple example of Embedding weights here.

If you look at the code, trax uses `RandomNormalInitializer(1.0)`, which simply draws from a random normal distribution (red curve).
(Side note, don't worry if you do not understand: I'm not sure you want more detail about weight initialization and weight sizes right now, but in short, the 1.0 here means that the random normal samples are multiplied by 1, so they are left unchanged. With big models you might want to initialize with smaller weights.)

If you are asking about the start (when the model is initialized for the first time, before training or "seeing" any example), the embedding table is just random numbers drawn from a normal distribution.
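As a minimal sketch of the point above (plain Python rather than trax, with toy sizes instead of the real vocabulary size and 1024 dimensions), an embedding table at initialization is just a matrix of normally distributed random numbers, and the layer's job is a row lookup by token id. The names `init_embedding_table` and `embed` are made up for illustration and are not trax functions.

```python
import random

def init_embedding_table(vocab_size, d_feature, stddev=1.0, seed=0):
    """Return a vocab_size x d_feature table of samples from N(0, stddev^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, stddev) for _ in range(d_feature)]
            for _ in range(vocab_size)]

def embed(table, token_ids):
    """Look up one row (vector) per token id."""
    return [table[t] for t in token_ids]

# Toy sizes (assumption for readability); the assignment uses d_feature = 1024.
table = init_embedding_table(vocab_size=10, d_feature=4)
vectors = embed(table, [3, 7, 3])

# One vector per input token, each of length d_feature.
print(len(vectors), len(vectors[0]))  # prints "3 4"
```

Note that the same token id (3 here) always maps to the same row, so before any training the vectors carry no meaning at all; they are only distinct random labels.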

Now, by the time we train the model, we have already chosen its architecture (in other words, made design choices): how are we going to make predictions?

If we decide that word (token) order does not matter, the approach is called Bag of Words. If we care about the words around a word (token), the approach is called Continuous Bag of Words (the order within the context still need not matter: for example, set(["I", "love", "learning"]) could be the "context" and the "target" could be ["NLP"]).
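To illustrate why order does not matter in this setup, here is a small sketch in plain Python (not trax). It assumes the context is combined by averaging the context word vectors, which is a common CBOW choice but an assumption here; the toy vocabulary and the `average_context` helper are hypothetical.

```python
import math
import random

vocab = ["I", "love", "learning", "NLP"]
word_to_id = {w: i for i, w in enumerate(vocab)}

rng = random.Random(0)
d_feature = 4  # toy size, not the assignment's 1024
table = [[rng.gauss(0.0, 1.0) for _ in range(d_feature)] for _ in vocab]

def average_context(table, context_words):
    """Average the embedding vectors of the context words."""
    rows = [table[word_to_id[w]] for w in context_words]
    return [sum(col) / len(rows) for col in zip(*rows)]

a = average_context(table, ["I", "love", "learning"])
b = average_context(table, ["love", "I", "learning"])

# Compare the two averaged vectors up to floating-point rounding.
same = all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, b))
print(same)
```

Because averaging ignores position, both orderings give the model exactly the same input, which is what "bag" refers to.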

So, if we decide this is the way to go, then yes, the Embedding table is updated according to how well the model predicts ["NLP"] when the inputs are set(["love", "I", "learning"]). But if we had chosen another path (how we provide inputs, RNN, Transformer, etc.), the Embedding table would be updated according to how well the model predicts those outcomes instead.
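Here is a rough sketch of what "the Embedding table is updated" could look like for the CBOW example. It is plain Python, not trax, and it assumes one particular setup that the assignment does not necessarily use: averaged context vectors, a single linear output layer with softmax, cross-entropy loss, and plain SGD. All names (`E`, `W`, `forward`, `train_step`) and the toy sizes are hypothetical.

```python
import math
import random

vocab = ["I", "love", "learning", "NLP"]
word_to_id = {w: i for i, w in enumerate(vocab)}
V, d = len(vocab), 4  # toy sizes

rng = random.Random(0)
E = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(V)]  # embedding table
W = [[rng.gauss(0.0, 1.0) for _ in range(V)] for _ in range(d)]  # output weights

def forward(context_ids):
    """Average the context embeddings, then apply softmax over the vocabulary."""
    h = [sum(E[c][j] for c in context_ids) / len(context_ids) for j in range(d)]
    logits = [sum(h[j] * W[j][k] for j in range(d)) for k in range(V)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return h, [e / total for e in exps]

def train_step(context_ids, target_id, lr=0.05):
    """One SGD step on the cross-entropy loss; returns the loss before the update."""
    h, probs = forward(context_ids)
    loss = -math.log(probs[target_id])
    # Gradient of the loss w.r.t. the logits: probs - one_hot(target).
    dlogits = [p - (1.0 if k == target_id else 0.0) for k, p in enumerate(probs)]
    # Gradient w.r.t. the averaged context vector h (uses W before its update).
    dh = [sum(W[j][k] * dlogits[k] for k in range(V)) for j in range(d)]
    # Update the output weights.
    for j in range(d):
        for k in range(V):
            W[j][k] -= lr * h[j] * dlogits[k]
    # Update only the embedding rows of the context words.
    for c in context_ids:
        for j in range(d):
            E[c][j] -= lr * dh[j] / len(context_ids)
    return loss

context = [word_to_id[w] for w in ["love", "I", "learning"]]
target = word_to_id["NLP"]

nlp_row_before = list(E[target])
love_row_before = list(E[word_to_id["love"]])
losses = [train_step(context, target) for _ in range(50)]

print(losses[0] > losses[-1])                      # did the loss on this example go down?
print(E[word_to_id["love"]] != love_row_before)    # did a context word's row change?
print(E[target] == nlp_row_before)                 # is the row of the never-used input word untouched?
```

The point of the sketch is only that the embedding rows start random and are moved by the gradient of whatever prediction task was chosen. With an RNN or Transformer the forward pass would differ, but the table would be updated by the same mechanism.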
