Why Theta is transposed in Word2Vec Model

In the Word2Vec video of the NLP and Word Embeddings week of Course 5, I understand that theta holds the parameters (weights) of the dense layer followed by the softmax. However, given the conventions used in the previous courses, why is it transposed?

It’s an implementation detail that Andrew often includes when he’s using “theta” in the notation.

He assumes that all vectors are column vectors of size (n x 1).

So in order to compute their dot product, the first one needs to be transposed, which makes its size (1 x n).

Then the dimensions of the product are (1 x n) * (n x 1), which gives a scalar result.
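
For concreteness, here is a minimal NumPy sketch of the shapes involved (the names `theta_t` and `e_c`, and the embedding size n = 300, are just illustrative assumptions, not values from the course):

```python
import numpy as np

n = 300  # illustrative embedding size

# Both vectors are column vectors of shape (n, 1), as in the lectures
theta_t = np.random.randn(n, 1)  # parameters associated with target word t
e_c = np.random.randn(n, 1)      # embedding of the context word c

# (n, 1) times (n, 1) doesn't match for matrix multiplication, so the
# first vector is transposed: (1, n) times (n, 1) -> (1, 1), i.e. a scalar.
score = theta_t.T @ e_c

print(score.shape)   # (1, 1)
print(score.item())  # the scalar value that feeds into the softmax
```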