Sir, in the negative sampling lecture, in the time block from 9:18 to 10:48,
I can’t understand that part. Can you please elaborate on it?
Sir, at 10:21, in the formula for the heuristic value, why is the sum of frequencies taken over the vocabulary size? Shouldn't it be over the size of the text corpus?
Can you please send a link to that video so we can find that portion as well?
From what I understand, if you sample negatives by raw corpus frequency, then words like "the", "and", and "is" have very high frequency and would take precedence. The other option, sampling every word with probability 1/vocab size, is still not very representative of the distribution. So they found the formula that Prof. Andrew gives, which sits between the two and works better.
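
To make the point about the sum concrete, here is a rough sketch. I am assuming the formula in the lecture is the usual 3/4-power heuristic, P(w_i) = f(w_i)^(3/4) / sum_j f(w_j)^(3/4). The frequencies f(w) are still counted from the corpus; the sum in the denominator runs over the distinct words in the vocabulary only because it normalizes the per-word scores into a probability distribution. The function name, the `power` parameter, and the toy sentence are my own, not from the course.

```python
from collections import Counter


def negative_sampling_probs(tokens, power=0.75):
    """Return a {word: probability} dict for drawing negative samples.

    Hypothetical helper for illustration; assumes the 3/4-power heuristic.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    # f(w): relative frequency of each distinct word, counted over the corpus
    freqs = {w: c / total for w, c in counts.items()}
    # Raise each frequency to `power`; this shrinks the gap between
    # frequent and rare words
    scores = {w: f ** power for w, f in freqs.items()}
    # Normalize over the vocabulary (one term per distinct word)
    norm = sum(scores.values())
    return {w: s / norm for w, s in scores.items()}


# Toy corpus: 10 tokens, 7 distinct words, "the" appears 3 times
tokens = "the cat sat on the mat and the dog sat".split()
probs = negative_sampling_probs(tokens)
```

With power=1 this reduces to sampling by raw corpus frequency, and with power=0 it becomes uniform over the vocabulary (1/vocab size). The 3/4 power lands in between, so "the" is sampled less often than its raw frequency and rare words a bit more often.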