In the video on Word2Vec, at about time 4:30, the parameter theta is used in the softmax calculation. I don’t understand where theta comes from or the definition given.
In the slide it says: theta_t = parameter associated with output t.
Which parameter?
Since theta and e_c are multiplied together, they have to have the same dimension. So is theta just an embedding vector like e_c, but for the target word instead of the context word? I don’t expect that to be the case; if it were, I’d expect it to be labeled e_t, where the subscript t stands for the target.
All he’s doing there is writing out the algebra for the softmax function, using theta^T * x for the scores (logits) that softmax turns into probabilities. It isn’t specific to this particular example.
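In symbols, that’s just (my notation, not copied from the slide):

```latex
\operatorname{softmax}(\theta^{\top} x)_t
  = \frac{e^{\theta_t^{\top} x}}{\sum_{j} e^{\theta_j^{\top} x}}
```

So each theta_t is the weight vector that produces the score for output t, which matches the slide’s “parameter associated with output t”.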
‘theta’ there represents whatever weights were learned by the process ahead of softmax. In this case, I think those are the embedding matrix E, and theta^T * x is e_c (since he writes e_c = E * o_c at around time 3:51).
In this case, I think the right reading is that the softmax is applied to the e_c vector.
Thank you! I don’t remember him using the notation theta before, but pointing out that it’s just the weights helped. I was stuck in the mindset that there must be a separate weight matrix W before the softmax unit; now I realize the theta_t’s are just the rows of that W matrix.
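For anyone who finds this thread later, here is a minimal numpy sketch of that resolved picture. The sizes and values are toy ones, not from the video; W stands for the matrix whose rows are the theta_t vectors:

```python
import numpy as np

# Toy sizes, not from the video: a 5-word vocabulary, 3-dim embeddings.
vocab_size, emb_dim = 5, 3
rng = np.random.default_rng(0)

E = rng.normal(size=(emb_dim, vocab_size))  # embedding matrix
W = rng.normal(size=(vocab_size, emb_dim))  # softmax weights; row t is theta_t

o_c = np.zeros(vocab_size)  # one-hot vector for the context word
o_c[2] = 1.0                # say the context word has index 2

e_c = E @ o_c          # e_c = E * o_c, the context word's embedding
logits = W @ e_c       # entry t is theta_t^T * e_c
p = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary

print(p)  # p[t] is the model's P(target = t | context = c)
```

The key point is that theta_t and e_c have the same dimension (emb_dim here) precisely because each theta_t is a row of W, not a column of E.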