In week 2 of the sequence models videos, Andrew describes the skip-gram algorithm as one that is used to predict the probability of a target word given a context word. However, many articles online claim that skip-gram is used to predict the context given the target, and not the other way around. Can anyone help clarify this? Am I missing something?
The lecture is correct. Context refers to the word that is fed into the model. Target refers to the word predicted by the model. To repeat the example from the lecture,
orange is the input, aka the context word, and
juice is the word predicted by the model. Please watch the lecture again and update this thread with specific details on what is unclear.
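To make the naming concrete, here is a minimal sketch of how (context, target) training pairs could be built under the definitions above. The function name `skipgram_pairs`, the `window` parameter, and the example sentence are assumptions for illustration; they are not taken from the lecture's code. The sketch also enumerates every neighbour inside the window rather than sampling one at random.

```python
def skipgram_pairs(tokens, window=2):
    """Build (context, target) pairs using the naming from this thread.

    context: the word fed into the model (the input)
    target:  a nearby word the model is trained to predict
    """
    pairs = []
    for i, context in enumerate(tokens):
        # Clamp the window to the sentence boundaries.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own target
                pairs.append((context, tokens[j]))
    return pairs


# Assumed example sentence containing the words "orange" and "juice".
sentence = "I want a glass of orange juice".split()
pairs = skipgram_pairs(sentence, window=1)

# Keep only the pairs where "orange" is the input (context) word.
orange_pairs = [p for p in pairs if p[0] == "orange"]
print(orange_pairs)
```

With these definitions, "orange" always sits in the first (input) slot and "juice" appears in the second (predicted) slot, which is the direction described in the lecture.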
The claim that skip-gram predicts the context given the target seems to be logically incorrect. This is because of the way we have put forth the definitions of context and
target, which Balaji has explicitly stated.
So, if you are feeding in a “target” word, then by definition it is a “context” word, and the predicted “context” word is, by definition, a “target” word. As for the online articles, I can see the same concept that Prof. Andrew has discussed. For instance, consider this article. It gives the same definition, the only difference being that the context word is “ants” instead of “orange”. Can you please provide a link to one of the articles that might be confusing you?