C1W4 problem in w4_unittest.test_approximate_knn

Hi,

I tried completing the W4 assignment but had a lot of trouble with the last part. test_approximate_knn returned:

Fast considering 77 vecs
Wrong chosen neighbor ids. 
	Expected: [51, 2478, 105].
	Got: [2478, 1876, 253].

I looked at the cosine similarity of those documents to document 0 and all of them had the same value of 0.9999999999999998, so I believe both answers should be valid.

I got a different set of neighbours because I had not read the hints for C8 and sorted the similarity list on my own instead of using np.argsort. In fact, 7 of the 77 documents examined have the same cosine similarity, so any subset of 3 of those should be a valid answer, shouldn’t it? Or maybe I’m wrong and I should be examining fewer than 77 documents, or a different set of 77 documents?
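
To illustrate the tie issue, here is a minimal sketch with made-up similarity values (not the assignment's data; all variable names are hypothetical). It assumes a stable sort in both approaches so the result is reproducible, and it shows how np.argsort and a hand-written sort can pick different but equally good neighbours when scores tie:

```python
import numpy as np

# Made-up cosine similarities for 7 candidates; indices 1, 3, 4 and 5 tie at the top.
sims = np.array([0.9, 1.0, 0.2, 1.0, 1.0, 1.0, 0.5])
k = 3

# Approach A: ascending argsort, then take the last k indices.
# kind="stable" is an assumption made here so that ties keep their input order.
top_argsort = np.argsort(sims, kind="stable")[-k:].tolist()

# Approach B: hand-written descending sort, then take the first k indices.
# Python's sorted() is stable, so tied items also keep their input order.
top_manual = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]

print(top_argsort)  # [3, 4, 5]
print(top_manual)   # [1, 3, 4]

# Both selections consist only of top-scoring candidates.
print(all(sims[i] == sims.max() for i in top_argsort + top_manual))  # True
```

Both selections contain only documents with the highest score, so which ids come out depends purely on how ties are ordered, not on the quality of the neighbours.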

Hello @lukatiks !
Thank you for your post; it is a pleasure to have you on the platform :slight_smile:
I found a few articles related to your question, and I hope they help give you a full understanding of cosine similarity (especially the first one):
Finding Word Similarity using TF-IDF and Cosine in Python
How does cosine similarity work?
Cosine Similarity – Understanding the math and how it works

I hope you find this useful. Have a nice day!