Making sense of W2 as embedding matrix

Other than having the correct dimensions, I'm confused about why you could use the W2 matrix from CBOW as your embedding matrix. Since a transposed matrix is not its own inverse, what relationship would W2 have to the transformation from one-hot vectors into embeddings, given that it was trained to go in the other direction?

Hi @davidpet

For the same reason that you could use the W1 matrix.

Word embeddings are not the "correct" solution to some equation with a single right answer. They are a useful tool for achieving your goals: primarily, by reducing the sparse one-hot representation to a smaller, manageable number of dimensions (the embedding space), you can train a model that predicts sentiment, translates languages, or does whatever your goal is.
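As a rough illustration (a minimal NumPy sketch, not the assignment code, assuming vocabulary size V and embedding size N): multiplying a one-hot vector by an N x V matrix just picks out one column, turning a sparse V-dimensional vector into a dense N-dimensional one.

```python
import numpy as np

# Toy sizes: vocabulary of V = 5 words, embedding size N = 3.
V, N = 5, 3
rng = np.random.default_rng(0)
W1 = rng.random((N, V))          # one column per word

word_idx = 2
x = np.zeros((V, 1))
x[word_idx] = 1                  # one-hot vector for word 2

embedding = W1 @ x               # equals W1[:, word_idx], shape (N, 1)
print(np.allclose(embedding[:, 0], W1[:, word_idx]))  # True
```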

Not only that, but there is also the ReLU application in between. As I said, the goal is not to get back to W1 or x.
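To see why the ReLU step alone already rules out any exact "inverse" interpretation, here is a tiny example (illustrative only): ReLU maps every negative value to zero, so different inputs can collapse to the same output and the original values cannot be recovered.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

a = np.array([-2.0, -0.5, 1.0])
b = np.array([-7.0, -0.1, 1.0])
print(relu(a))  # [0. 0. 1.]
print(relu(b))  # [0. 0. 1.]  same output for different inputs, so ReLU is not invertible
```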

The only relationship W1 and W2 have (and don't forget b1 and b2) is through the cost function and the data.

Changing the cost function would give you different weights (W1, W2, …). Likewise, different random initialization (like .rand(N,V) and .rand(V,N)), different random batches, different data altogether, and other factors would all result in different word embeddings.
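For example (a hedged sketch with toy shapes matching the .rand calls above, not necessarily what the assignment does): with W1 of shape (N, V) and W2 of shape (V, N), both matrices hold one N-dimensional vector per word, so you can take the columns of W1, the rows of W2, or some combination such as their average as your embeddings.

```python
import numpy as np

V, N = 5, 3
rng = np.random.default_rng(42)
W1 = rng.random((N, V))          # shape (N, V), as in .rand(N, V)
W2 = rng.random((V, N))          # shape (V, N), as in .rand(V, N)

emb_from_W1 = W1.T               # row i = vector for word i, shape (V, N)
emb_from_W2 = W2                 # row i = vector for word i, shape (V, N)
emb_averaged = (W1.T + W2) / 2   # one common compromise, shape (V, N)

print(emb_from_W1.shape, emb_from_W2.shape, emb_averaged.shape)  # all (5, 3)
```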

They would all be "wrong" and "correct" at the same time; the only thing that matters is your goal. Which of them helps you predict sentiment or translate languages best?

I’m not sure what you mean. Could you elaborate?

Makes sense - thanks!