Hello everyone,

In Course 2 (Week 4) of the NLP Specialization, it is explained that the word embeddings can be taken from the columns of the matrices W1 or W2, or from the average of the two, in the neural network associated with CBOW.

Before I learned about CBOW, I was thinking that one would simply use the hidden-layer representation of each input word (i.e., feed the one-hot encoding of that word into the network and take the hidden-layer activation) as the feature representation of that word.

Do you think this approach could work? Why is that possibility set aside in the CBOW model?

Thank you in advance for your comments and answers.
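To make it concrete, here is roughly what I had in mind (just a toy NumPy sketch, not code from the course; the sizes and variable names are mine, and I ignore the bias and activation for simplicity):

```python
import numpy as np

V, N = 5, 3                       # toy vocabulary size and embedding size
rng = np.random.default_rng(0)
W1 = rng.normal(size=(N, V))      # first-layer weights (N x V, so one column per word)

word_id = 2                       # some word in the vocabulary
x = np.zeros(V)
x[word_id] = 1.0                  # one-hot encoding of that single word

h = W1 @ x                        # its "hidden layer representation"
print(h)                          # my idea: use this N-dimensional vector as the word's features
```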
Michel
Hi Michel,
I do not fully understand what you are proposing. The way I read your post, it seems very similar to an Embedding layer (docs and source code): a token/word id is mapped to a vector of a certain size (which, if I understand you correctly, is what you call "the hidden layer representation"). Is that the case, or are you thinking about something different?
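To illustrate what I mean (just a rough NumPy sketch, sizes and names made up by me): an Embedding layer is essentially a lookup table, and looking up a row by word id gives the same vector as multiplying a one-hot vector by that same matrix, which sounds like what you describe.

```python
import numpy as np

V, N = 5, 3                          # toy vocabulary size and embedding size
rng = np.random.default_rng(0)
E = rng.normal(size=(V, N))          # embedding matrix: one row per word in the vocabulary

word_id = 2
vector = E[word_id]                  # lookup by id -- what an Embedding layer does

# Feeding a one-hot vector through a bias-free linear layer gives the same result:
x = np.zeros(V)
x[word_id] = 1.0
print(np.allclose(x @ E, vector))    # True
```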