A note on the featurization view of word embeddings

I didn’t quite understand the idea illustrated in this picture. Andrew said that ew1 represents the first dimension; isn’t that the gender dimension?
If anyone can offer some help, I would sincerely appreciate it.

My interpretation is that we are dealing with vectors in a 300-dimensional space here (assuming that’s the size of the embeddings), and you can’t guarantee that the learned values align exactly with the coordinate axes. The other point is that we don’t really plan or control what the model learns; we are just trying to interpret it after the fact.
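
To make that concrete, here is a minimal NumPy sketch (the 300-dimensional size, the random vectors, and the rotation matrix `Q` are illustrative assumptions, not anything taken from the course): rotating an embedding matrix by any orthogonal matrix leaves every cosine similarity unchanged, so nothing in training forces the learned coordinates to line up with human-readable axes like gender.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are learned 300-dimensional embeddings for 4 words
# (purely illustrative random values, not real learned vectors).
E = rng.normal(size=(4, 300))

# Any orthogonal matrix Q defines an equally valid coordinate system.
Q, _ = np.linalg.qr(rng.normal(size=(300, 300)))
E_rotated = E @ Q

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Pairwise similarities are identical before and after the rotation,
# so the training loss cannot distinguish the two bases: axis-aligned
# "gender", "royal", etc. coordinates are never guaranteed.
print(cosine_sim(E[0], E[1]))
print(cosine_sim(E_rotated[0], E_rotated[1]))
```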

Hello @dorbez_fradj,

Welcome to the community!

I will try to give one more simple explanation, but if you still feel confused, please also share your current understanding of that slide or of the concept of embedding dimensions, because it will help us see what is most confusing.

[Image: featurization table representing each word with four dimensions - gender, royal, age, and food]

The above table represents each word by four dimensions - gender, royal, age, and food. However, word embeddings are a different representation that can represent the same set of words with any number of dimensions (almost always fewer than the original number, which is four here). Also, you almost never find a dimension in the embedding that aligns perfectly with any of the original four dimensions in the table. For example, if we look at the gender dimension:

[Image: the gender row of the table - Man: -1, Woman: 1, King: -0.95, Queen: 0.97]

The four words (Man, Woman, King, Queen) are -1, 1, -0.95, and 0.97 in the gender dimension. However, in the embedding dimensions, you will almost never find a dimension that uses the same values (-1, 1, -0.95, 0.97) to represent the four words.
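
As a rough sketch of that last point (only the gender column values come from the slide; the royal, age, and food numbers below are made up for illustration): if we simulate a "learned" embedding by rotating the hand-designed features into a different basis, no single learned dimension reproduces (-1, 1, -0.95, 0.97), yet the gender information is still encoded across dimensions, which is why an analogy like King - Man + Woman ≈ Queen still works.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hand-designed featurization: rows = Man, Woman, King, Queen;
# columns = gender, royal, age, food. Only the gender column comes
# from the slide; the other columns are invented for illustration.
words = ["Man", "Woman", "King", "Queen"]
features = np.array([
    [-1.00, 0.01, 0.03, 0.09],   # Man
    [ 1.00, 0.02, 0.02, 0.01],   # Woman
    [-0.95, 0.93, 0.70, 0.02],   # King
    [ 0.97, 0.95, 0.69, 0.01],   # Queen
])

# Simulate a "learned" embedding: same information, different basis.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
learned = features @ Q

# Almost certainly no learned column matches (-1, 1, -0.95, 0.97) ...
print(np.round(learned, 2))

# ... but King - Man + Woman is still closest to Queen, because the
# analogy depends only on the geometry, not on the individual axes.
def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

analogy = learned[2] - learned[0] + learned[1]   # King - Man + Woman
print(cosine_sim(analogy, learned[3]))           # close to 1 -> Queen
```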

Cheers,
Raymond

I appreciate your help in clarifying the concept for me, thanks

Thank you for your help, I really appreciate it.