How do we get the embedding matrix from the word2vec model?

From my understanding, in word2vec we are learning a mapping between a word and its context. The network consists of a single hidden layer followed by a softmax, so we have two sets of weights: one for the first layer (W1 in the image below) and one for the softmax layer (W2 in the image below). I’m assuming that W1 is the weight matrix we want to keep as the final embedding matrix, and that we discard the W2 weights. Is this correct? And if we are instead predicting a word from its context, which weights would we keep?
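Here is roughly how I picture the network, in case it helps clarify what I mean by W1 and W2 (a minimal PyTorch sketch with made-up sizes, not the course's actual implementation):

```python
import torch
import torch.nn as nn

# Hypothetical sizes, just for illustration.
vocab_size = 10_000
embedding_dim = 300

class SkipGram(nn.Module):
    def __init__(self, vocab_size, embedding_dim):
        super().__init__()
        # W1: one row per vocabulary word, i.e. the embedding table.
        self.w1 = nn.Embedding(vocab_size, embedding_dim)
        # W2: projects the hidden vector back onto the vocabulary (the softmax layer).
        self.w2 = nn.Linear(embedding_dim, vocab_size, bias=False)

    def forward(self, center_word_ids):
        hidden = self.w1(center_word_ids)   # (batch, embedding_dim)
        logits = self.w2(hidden)            # (batch, vocab_size)
        return logits                       # softmax is applied inside the loss

model = SkipGram(vocab_size, embedding_dim)
loss_fn = nn.CrossEntropyLoss()             # softmax + negative log-likelihood over context words
logits = model(torch.tensor([42]))          # predict context for word with id 42
```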

If you have an input layer, a hidden layer, and an output layer, you have two weight matrices. Both are vitally important.

@TMosh Yes, we would keep both weight matrices if we wanted to keep predicting context given a word. But if we wanted to transfer the embeddings to another problem, or just visualize word similarity with t-SNE, wouldn’t we only need the first weight matrix?
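Something like this is what I have in mind (a rough sketch, assuming `model` is the network from my sketch above and `index_to_word` is a hypothetical id-to-word list):

```python
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# W1 on its own is the embedding table: one row per vocabulary word.
embeddings = model.w1.weight.detach().numpy()        # (vocab_size, embedding_dim)

# Project a subset to 2-D for plotting; W2 is never touched here.
coords = TSNE(n_components=2).fit_transform(embeddings[:500])

plt.scatter(coords[:, 0], coords[:, 1], s=5)
for i, word in enumerate(index_to_word[:500]):        # index_to_word: assumed id -> word list
    plt.annotate(word, coords[i])
plt.show()
```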

Hey @Max_Rivera,

I would define the aim of Word2Vec a bit differently. In my opinion, it should be defined as learning a mapping between words and their embeddings (say, 300-dimensional vectors). Learning to predict a word from its context is just the task we exploit in order to learn these embeddings. Once we have the embeddings, the trained model itself isn’t of much significance to us, at least as far as our aim is concerned. In fact, you can see this just by breaking down the name: Word2Vec = Word → Vectors (or embeddings).

And yes, if you want to obtain the embeddings (which, once again, is the primary aim), you only need the first weight matrix. I hope this helps.
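To make it concrete with the PyTorch-style sketch posted above (and a hypothetical `word_to_index` dict from your preprocessing), keeping the embeddings just means keeping W1 and throwing W2 away:

```python
import numpy as np

# Keep only W1: row i is the embedding of vocabulary word i. W2 is simply discarded.
embedding_matrix = model.w1.weight.detach().numpy()   # (vocab_size, embedding_dim)
np.save("embeddings.npy", embedding_matrix)

def vector(word):
    # word_to_index: assumed word -> id dict built during preprocessing
    return embedding_matrix[word_to_index[word]]

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Words used in similar contexts should end up with similar vectors.
print(cosine(vector("king"), vector("queen")))
```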

Cheers,
Elemento
