The Skip-Gram Algorithm

dolu_solana · July 1, 2022, 12:53pm

In week 2 of the Sequence Models videos, Andrew describes the skip-gram algorithm as one that is used to predict the probability of a target word given context words. However, many articles online claim that skip-gram is used to predict the context given the target, and not the other way around. Can anyone help clarify this? Am I missing something?

balaji.ambresh · July 1, 2022, 1:23pm

The lecture is correct. Context refers to the word that is fed into the model. Target refers to the word predicted by the model. To repeat the example from the lecture, orange is the input (aka context) word and juice is the word predicted by the model. Please watch the lecture again and update this thread with specific details on what is unclear.
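To make the input/output roles concrete, here is a minimal sketch of how skip-gram training pairs can be built. The function name `skipgram_pairs`, the window size of 2, and the choice to enumerate every neighbour (rather than sampling one at random within the window) are assumptions for illustration, not something taken from the lecture.

```python
def skipgram_pairs(tokens, window=2):
    """Return (input_word, predicted_word) pairs for every position.

    In the lecture's vocabulary the first element is the "context" word
    and the second is the "target" word; many articles swap the two labels.
    """
    pairs = []
    for i, input_word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not paired with itself
                pairs.append((input_word, tokens[j]))
    return pairs


# Example sentence (assumed here for illustration).
sentence = "I want a glass of orange juice to go along with my cereal".split()
pairs = skipgram_pairs(sentence, window=2)

# Every pair whose input is "orange".
orange_pairs = [p for p in pairs if p[0] == "orange"]
print(orange_pairs)
```

Whichever name you give to each slot, the pair (orange, juice) is the same training example: orange goes in, juice is what the model is asked to predict.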

Elemento · July 1, 2022, 1:30pm

Hey @dolu_solana,
Welcome to the community. To add to @balaji.ambresh’s explanation, the claim made in those articles, as you have described it, seems to be logically incorrect. This is because of the way we have defined context and target, which Balaji has explicitly stated.

So, if you are feeding in a “target” word, then by definition it is a “context” word, and the predicted “context” word is, by definition, a “target” word. As for the online articles, I can see the same concept that Prof. Andrew has discussed. For instance, consider this article. It gives the same definition, the only difference being that the context word is “ants” instead of “orange”. Can you please provide a link to one of the articles that is confusing you?

Regards,
Elemento

SrivathsanM · December 4, 2024, 2:21am

Yes you are right!
I am also encountering the same issue, and I don’t think the justifications offered here make sense.

In the video, Andrew says that the input is the context word, which is orange, and that we’re trying to predict the target word, which is juice.
But everywhere else (e.g. On word embeddings - Part 1) says it the other way around: that the skip-gram model takes the target word as input and returns a probability distribution over the context words.
I don’t understand either of the explanations offered here.
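One way to compare the two descriptions is to look at what the model computes. The sketch below is a toy, pure-Python version of the softmax step, assuming the usual form p(output | input) = softmax over the vocabulary of theta_w · e_input. The five-word vocabulary, the dimension of 4, the random parameters, and the name `predict_distribution` are all hypothetical; a real model would learn `E` and `theta` from data.

```python
import math
import random

random.seed(0)

vocab = ["glass", "of", "orange", "juice", "to"]
dim = 4  # toy embedding size (assumption)

# Hypothetical, randomly initialised parameters; training would adjust these.
E = {w: [random.gauss(0, 1) for _ in range(dim)] for w in vocab}      # input embeddings
theta = {w: [random.gauss(0, 1) for _ in range(dim)] for w in vocab}  # output vectors


def predict_distribution(input_word):
    """Softmax over the vocabulary, conditioned on a single input word."""
    e = E[input_word]
    scores = {w: sum(a * b for a, b in zip(theta[w], e)) for w in vocab}
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {w: math.exp(s - m) for w, s in scores.items()}
    z = sum(exps.values())
    return {w: v / z for w, v in exps.items()}


probs = predict_distribution("orange")
for word, p in probs.items():
    print(f"P({word} | orange) = {p:.3f}")
```

The lecture calls the input word the "context" and the predicted word the "target"; many articles call the input (centre) word the "target" and the predicted neighbours the "context". In both descriptions the computation is the same: one word in, a probability distribution over nearby words out.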

balaji.ambresh · December 6, 2024, 3:03am

Please don’t create duplicate posts:
