To generate negative examples, we pick words at random from the dictionary. And to select negative samples, we use the authors' formula shown at the end of the video.
What is the difference between these two statements? I'm a bit confused here.
I will say that one aspect of this slide/explanation that I don’t believe is clear is how this method would avoid the problem Andrew mentioned of ending up with negative samples made up entirely of “the”, “of”, “and”, etc. I am okay with waving my hands and letting it go as far as developing an intuition goes, but I feel that it’s worth noting.
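
For what it's worth, here is a minimal sketch of how the weighting might play out. It assumes the formula at the end of the video is the 3/4-power heuristic from the word2vec paper, P(w) proportional to count(w)^(3/4); the thread does not spell the formula out, so treat that as an assumption. The word counts and the function names `negative_sampling_probs` and `sample_negatives` are made up for illustration.

```python
import random


def negative_sampling_probs(counts, power=0.75):
    """Return P(w) proportional to count(w) ** power (assumed 3/4 heuristic)."""
    weights = {w: c ** power for w, c in counts.items()}
    total = sum(weights.values())
    return {w: wt / total for w, wt in weights.items()}


def sample_negatives(probs, k, exclude=(), rng=random):
    """Draw k negative words (with replacement), skipping any word in `exclude`."""
    words = [w for w in probs if w not in exclude]
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=k)


# Hypothetical toy corpus counts, not taken from the course.
counts = {"the": 1000, "of": 600, "and": 500, "orange": 10, "juice": 8, "king": 5}

total = sum(counts.values())
unigram = {w: c / total for w, c in counts.items()}   # raw frequency
smoothed = negative_sampling_probs(counts)            # frequency ** 0.75, renormalized

for w in counts:
    print(f"{w:>6}  raw={unigram[w]:.4f}  smoothed={smoothed[w]:.4f}")

# Sample 4 negatives for the context word "orange".
print(sample_negatives(smoothed, k=4, exclude={"orange"}))
```

Under this assumption, frequent words like "the" are still the most likely negatives, but their share shrinks relative to raw frequency while rare words get a larger share. So the heuristic dampens the problem Andrew described rather than eliminating it.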