No matter what I do, my outputs don't match the expected results. My output is:
The hash table at key 0 has 1352 document vectors
The id table at key 0 has 1351
The first 5 document indices stored at key 0 of are [ 3. 8. 16. 18. 29.]
What may be going on? Thank you.
It seems like your hash table has an extra key. How did you initialize your hash table?
hash_table = { i : np.empty((1, vecs.shape[1])) for i in range(0, num_buckets)}
I previously tried [] as well but to no avail.
Hi, I think you should double-check your hash_table initialization.
In my implementation, the elements of the properly initialized hash_table are of type <class 'list'>, but when I use the code you posted above I get type <class 'numpy.ndarray'>, which may also be why you ended up with floating-point indices in your other question. +1 on @Vu_Hoang_Ngo's suggestion to reconsider the initialization of the hash_table object.
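To make the difference between the two initializations concrete, here is a minimal sketch. The values of num_buckets and dim are made up for illustration and are not the ones from the assignment. It only shows that np.empty((1, dim)) already contains one uninitialized row per bucket, while an empty list starts with no entries; whether that accounts for the extra vector in the output above is an assumption.

```python
import numpy as np

# Hypothetical small sizes, chosen only for illustration.
num_buckets = 4
dim = 3

# Initialization from the post above: every bucket starts as a (1, dim) array
# holding one row of uninitialized values.
array_table = {i: np.empty((1, dim)) for i in range(num_buckets)}

# Alternative: every bucket starts as an empty Python list.
list_table = {i: [] for i in range(num_buckets)}

# The ndarray bucket already reports length 1 before anything is added.
print(type(array_table[0]), len(array_table[0]))
# The list bucket reports length 0.
print(type(list_table[0]), len(list_table[0]))
```

With the array version, anything stacked onto a bucket sits on top of that initial row, so the bucket's length ends up one larger than the number of vectors actually added.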
Thanks for your input. These turned out to be red herrings. The real issue was in my hash_value_vector function, where I was calculating the sign incorrectly with 0 if <= 0 else 1; it should have been 0 if < 0 else 1. The test function still passed for the one case it checks, where the value should be 768 at index 0. Because of this error, I went on a wild goose chase, trying all sorts of hacks in my hash_value_vector function, and those hacks caused the errors above. The floating-point numbers came from using np.append() instead of index_table.append(). Life is good finally!! Thank you all.
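For anyone hitting the same thing, here is a small sketch of both points. The helper names bit_le and bit_lt are made up for this example and are not from the notebook; the sketch only illustrates the comparison and the append behaviour described in the post, not the full hash_value_vector function.

```python
import numpy as np

def bit_le(x):
    # Comparison described as the mistake: zero maps to 0.
    return 0 if x <= 0 else 1

def bit_lt(x):
    # Comparison described as the fix: zero maps to 1.
    return 0 if x < 0 else 1

# The two versions disagree only when the value is exactly zero.
for x in (-2.0, 0.0, 2.0):
    print(x, bit_le(x), bit_lt(x))

# np.append returns a new array. Starting from a float array, the appended
# integer ids are stored as floats (dtype float64).
ids_np = np.append(np.empty(0), 3)
ids_np = np.append(ids_np, 8)
print(ids_np.dtype)

# list.append modifies the list in place and keeps the ids as Python ints.
ids_list = []
ids_list.append(3)
ids_list.append(8)
print(ids_list)
```

Since the two comparisons agree everywhere except at exactly zero, a test that never produces a zero would pass with either version.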
I believe there is a problem with the notebook. I got this same error, and when I ran the next cell with the actual test cases, all of them passed. Who can review the notebook and confirm with authority whether what we are doing is correct or wrong?