Week 2 negative sampling

Sir, in the negative sampling lecture, I can't understand the part from 9:18 to 10:48.
Can you please elaborate on it?

Sir, at 10:21, in the formula for the heuristic value, why is the sum of frequencies taken over the vocabulary size? Shouldn't it be over the size of the text corpus?

Can you please send a link to that video so we can find that portion as well?

From what I understand, if you sample according to raw corpus frequency, words like "the", "and", and "is" occur so often that they would take precedence among the negative samples. The other option, sampling uniformly with probability 1/(vocabulary size), is still not very representative of the word distribution. So they found the formula that Prof. Andrew gives, which works better.
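
On the question of why the sum is over the vocabulary: as far as I can tell, the denominator is only there to normalize, so that the probabilities of all distinct words add up to 1. The corpus size is already reflected in the frequencies f(w) themselves. Here is a minimal sketch, assuming the formula from the lecture is P(w) = f(w)^(3/4) / sum over vocabulary words of f(w_j)^(3/4). The function name, the toy corpus, and the `power` parameter are my own illustration, not from the course.

```python
from collections import Counter


def negative_sampling_distribution(tokens, power=0.75):
    """Return P(w) proportional to count(w) ** power for each distinct word."""
    counts = Counter(tokens)  # f(w): how often each word occurs in the corpus
    weights = {w: c ** power for w, c in counts.items()}
    # The sum runs over distinct words (the vocabulary), not over corpus
    # positions: it only rescales the weights so they add up to 1.
    total = sum(weights.values())
    return {w: wt / total for w, wt in weights.items()}


# Toy corpus, made up for illustration.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
probs = negative_sampling_distribution(corpus)

raw_the = corpus.count("the") / len(corpus)  # raw corpus frequency of "the"
uniform = 1 / len(set(corpus))               # 1 / vocabulary size
print(raw_the, probs["the"], uniform)
```

For a frequent word like "the", the smoothed probability lands between the raw corpus frequency and the uniform 1/(vocabulary size), which matches the "in between" behavior described above.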