Some confusion on Word2Vec model

Hi @SurajKP79

There are some misconceptions in your post, and parts of it are hard for me to follow, so let me address this point first:

That is not true. Softmax is just an operation that maps a vector of values to the range 0 to 1.
In other words, the outputs (values) of the network are all over the place - negative, positive, big numbers, small numbers - and if you want to interpret them as probabilities (each between 0 and 1, and summing to 1), then you apply softmax.
So theta is actually not a parameter of the model, but the output of the model.
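To make this concrete, here is a minimal sketch of softmax in NumPy (the input values are made-up numbers standing in for raw network outputs):

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability; the result is mathematically unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5, 3.0])  # raw outputs: "all over the place"
probs = softmax(logits)
print(probs)        # each value is between 0 and 1
print(probs.sum())  # and they sum to 1
```

Note that softmax preserves the ordering: the largest raw output still gets the largest probability, so the predicted word does not change - only the interpretation of the numbers does.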

Yes, the embedding matrix values are randomly initialized at the start of training. These values are then constantly updated according to how well the model predicts the targets - values that contributed to lowering the probability of the correct word are reduced, and values that contributed to increasing the probability of the correct word are increased.

Cheers