A note on the featurization view of word embeddings

I didn’t quite understand the idea illustrated in this picture. Andrew said that ew1 represents the first dimension; isn’t that the gender dimension?
If anyone can offer some help, I would sincerely appreciate it.

My interpretation is that we are dealing with vectors in a 300-dimensional space here (assuming that’s the size of the embeddings), and you can’t guarantee that the learned values align exactly with the coordinate axes. The other point is that we don’t really plan or control what the model learns; we are just trying to interpret it after the fact.
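
To make that concrete, here is a minimal NumPy sketch (the 300-dimensional size, the random vectors, and the rotation matrix `Q` are illustrative assumptions, not anything taken from the course): rotating an embedding matrix by any orthogonal matrix leaves every cosine similarity unchanged, so nothing in training forces the learned coordinates to line up with human-readable axes like gender.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are learned 300-dimensional embeddings for 4 words
# (purely illustrative random values, not real learned vectors).
E = rng.normal(size=(4, 300))

# Any orthogonal matrix Q defines an equally valid coordinate system.
Q, _ = np.linalg.qr(rng.normal(size=(300, 300)))
E_rotated = E @ Q

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Pairwise similarities are identical before and after the rotation,
# so the training loss cannot distinguish the two bases: axis-aligned
# "gender", "royal", etc. coordinates are never guaranteed.
print(cosine_sim(E[0], E[1]))
print(cosine_sim(E_rotated[0], E_rotated[1]))
```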

Hello @dorbez_fradj,

Welcome to the community!

I will try to give one more simple explanation, but if you still feel confused, please also share your current understanding of that slide or of the concept of embedding dimensions, because it will help us see what is most confusing.

[Image: featurization table representing each word with four dimensions - gender, royal, age, and food]

The above table represents each word by four dimensions - gender, royal, age, and food. However, word embeddings are a different representation that can represent the same set of words with any number of dimensions (almost always fewer than the original number, which is four here). Also, you almost never find a dimension in the embedding that aligns perfectly with any of the original four dimensions in the table. For example, if we look at the gender dimension:

[Image: the gender row of the table - Man: -1, Woman: 1, King: -0.95, Queen: 0.97]

The four words (Man, Woman, King, Queen) are -1, 1, -0.95, and 0.97 in the gender dimension. However, in the embedding dimensions, you will almost never find a dimension that uses the same values (-1, 1, -0.95, 0.97) to represent the four words.
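
As a rough sketch of that last point (only the gender column values come from the slide; the royal, age, and food numbers below are made up for illustration): if we simulate a "learned" embedding by rotating the hand-designed features into a different basis, no single learned dimension reproduces (-1, 1, -0.95, 0.97), yet the gender information is still encoded across dimensions, which is why an analogy like King - Man + Woman ≈ Queen still works.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hand-designed featurization: rows = Man, Woman, King, Queen;
# columns = gender, royal, age, food. Only the gender column comes
# from the slide; the other columns are invented for illustration.
words = ["Man", "Woman", "King", "Queen"]
features = np.array([
    [-1.00, 0.01, 0.03, 0.09],   # Man
    [ 1.00, 0.02, 0.02, 0.01],   # Woman
    [-0.95, 0.93, 0.70, 0.02],   # King
    [ 0.97, 0.95, 0.69, 0.01],   # Queen
])

# Simulate a "learned" embedding: same information, different basis.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
learned = features @ Q

# Almost certainly no learned column matches (-1, 1, -0.95, 0.97) ...
print(np.round(learned, 2))

# ... but King - Man + Woman is still closest to Queen, because the
# analogy depends only on the geometry, not on the individual axes.
def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

analogy = learned[2] - learned[0] + learned[1]   # King - Man + Woman
print(cosine_sim(analogy, learned[3]))           # close to 1 -> Queen
```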

Cheers,
Raymond

I appreciate your help in clarifying the concept for me, thanks

Thank you for your help, I really appreciate it.