How are word embeddings calculated end to end?

From what I understand, in the assignment we input the whole sentence and the output is the sentiment. But in between there is a word embedding layer, so how are the embeddings being calculated for each word when I am inputting the whole sentence?

For example, my input is
Sentence: [Thank You 0 0 0]
Word Vector: [20 52 0 0 0]

Here, I am inputting the whole sentence (as a vector) into the model to identify the sentiment, but I am also getting an embedding for each word separately, which is unlike the CBOW model.

Am I missing something crucial here, and should I re-watch all the videos for this week again?


Hey @Sonu_Chhabra,
The idea is really simple here. The Embedding layer gets trained just like any other layer in this case, i.e., it doesn't get trained separately the way we trained it in CBOW. It is a layer in the model, and through back-propagation, it learns the embeddings of the various tokens.

That said, if you want, you may train it separately as we did in CBOW, and then use the pre-trained weights in the new model. You can also adopt this approach to use pre-trained embeddings trained by others. I hope this helps.
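To make this a bit more concrete, here is a minimal sketch of the kind of model this assignment builds (assuming Trax, as used in the course; vocab_size and embedding_dim below are just placeholder values):

```python
import trax.layers as tl

vocab_size = 10000    # placeholder: size of the vocabulary
embedding_dim = 400   # placeholder: size of each word embedding

# The classifier is one model; the Embedding layer is simply its first layer.
model = tl.Serial(
    tl.Embedding(vocab_size, embedding_dim),  # one embedding_dim-sized vector per token
    tl.Mean(axis=1),                          # average the per-token vectors over the sentence
    tl.Dense(2),                              # two sentiment classes
    tl.LogSoftmax(),
)

# The Embedding layer's weights form a (vocab_size, embedding_dim) matrix whose rows
# are the word embeddings; during training they receive gradients through
# back-propagation exactly like the Dense layer's weights do.
```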

Cheers,
Elemento

Hey @Elemento,
Yeah, I figured that, but then it should be an embedding for the whole sentence rather than for a word (and also, instead of CBOW, some different embedding model is used).
But at the end of assignment 1, we were visualizing the embeddings related to single words.


Hey @Sonu_Chhabra,
In CBOW as well, we learn the word embeddings for the different words, and not for sentences as a whole. I am a little perplexed as to what you have written in your last reply. Can you please elaborate on your query a bit more?

As I stated, the word embeddings in this assignment are learnt like any other layer in the model, and if we want, we can replace this trainable layer with a fixed layer holding pre-trained embeddings. These pre-trained embeddings can come from a CBOW model trained separately on a corpus, from a Word2Vec model trained separately on a corpus, or from any other technique that produces embeddings for individual words.
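Just to illustrate the idea, here is a framework-agnostic toy sketch (made-up numbers, not the assignment's actual code): the embedding layer is essentially a lookup table with one row per token id, and using pre-trained embeddings simply means filling that table with vectors learnt elsewhere and keeping it fixed.

```python
import numpy as np

vocab_size, embedding_dim = 5, 4   # tiny toy sizes, just for illustration
# Hypothetical pre-trained vectors, e.g. from a CBOW or Word2Vec model trained earlier.
pretrained_embeddings = np.random.rand(vocab_size, embedding_dim)

def embed(token_ids, table):
    """Forward pass of an embedding layer: a row lookup by token id."""
    return table[token_ids]                    # shape: (n_tokens, embedding_dim)

sentence_ids = np.array([3, 1, 0, 0, 0])       # a padded sentence, like [20 52 0 0 0]
per_word = embed(sentence_ids, pretrained_embeddings)   # one embedding per word
sentence_vector = per_word.mean(axis=0)        # what the Mean layer computes afterwards
```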

Cheers,
Elemento

Hey @Elemento,
In the end-to-end NN model, the initial input is the word vector, and we pass this vector to the word-embedding layer to get the embeddings, like
Sentence: [Thank You Very Much 0 0 0]
Word Vector: [20 52 31 33 0 0 0]

And we get one output from the embedding layer, say of size [400, 1], so how do I know which embeddings are related to the word "Thank"?
This is unlike the CBOW model, where I input the context words to get a particular word's embedding.

Hey @Sonu_Chhabra,
If you take a look at the implementation of the classifier function, you will find that the embedding layer, in your example, produces one embedding of size 400 for each of the 7 tokens, not a single vector. It is only after the Mean layer, which averages those per-token embeddings, that you get an output of dimensionality (400, 1). So, if you want to get the word embeddings word by word, you can simply remove the Mean layer.

That said, in this assignment the task at hand is not to obtain the word embeddings but to perform sentiment analysis, so I don't see why you would want to get the word embeddings word by word in the first place when following the assignment.
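Still, just to illustrate (a toy numpy sketch with a made-up embedding matrix and the token ids from your example, not the assignment's actual weights): the output before the Mean layer keeps the token order, and each row of the embedding matrix is indexed by its token id, so the embedding for "Thank" is simply the row for id 20.

```python
import numpy as np

vocab_size, embedding_dim = 100, 400
embedding_matrix = np.random.rand(vocab_size, embedding_dim)  # stands in for the trained weights

sentence_ids = np.array([20, 52, 31, 33, 0, 0, 0])   # "Thank You Very Much" + padding
per_word = embedding_matrix[sentence_ids]             # shape (7, 400); position 0 is "Thank"

thank_by_position = per_word[0]                       # via its position in the sentence
thank_by_id = embedding_matrix[20]                    # or directly via its token id
assert np.allclose(thank_by_position, thank_by_id)

sentence_vector = per_word.mean(axis=0)               # (400,): what the classifier actually averages
```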

Anyways, I hope this resolves your query.

Cheers,
Elemento

Thanks @Elemento for the help.