Why do we need Multiple Planes?

it's a bit late on my side :sweat_smile: Is it better now? I edited the message

:sweat_smile: there are many things I don’t understand in your question.

  • First, we do not get elements (what do you mean by elements, tokens?) by ids. Or do you mean that when we search, we limit our search by hash id? (Then the idea would be correct.)
  • Second, these tokens (elements, if that's what you call them) do not *contain* hashes (as "contain" means "within"); the tokens are *assigned* a hash (meaning that they are grouped by some hash id).
  • Third, hashes are not transformed. The transformed space (after translation) is divided into regions, and each region can be represented by a hash id.

I’m not sure what you mean by that. But if you’re asking whether the closest points always end up in the same hash id (region), then the answer is no, they are not always there — which is exactly why it is called *approximate* nearest neighbors.

Thank you very much for your answers! I went through this topic a few times and it let me draw a few conclusions. Please edit or extend my explanation if I say something incorrect.

Assume we translate from English to French. We need locality-sensitive hashing so that we don't have to compute the cosine similarity of every English word vector with every French word vector. We can achieve this thanks to random planes, which let us create buckets out of the regions between these planes.

To determine the place of a vector in one of these regions, we use normal vectors (at 90° to the plane), which can originate from any point of the plane. It doesn't matter which point, because the sign of the projection of the vector onto the normal does not change depending on where the normal originates.

So, with the help of this formula we can define unique ids for our regions (hash buckets) and assign that id to each of our vectors.
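Since the screenshot of the formula is just an image here, a minimal NumPy sketch of that bucket-id computation (the combined hash is the sum of 2^i · h_i, where h_i is 1 if the vector lies on the positive side of plane i, else 0):

```python
import numpy as np

def hash_value(v, planes):
    """Bucket id of vector v, given a (n_planes, dim) matrix of plane normals."""
    # The sign of the projection onto each normal says which side of the plane v is on.
    sides = (planes @ v >= 0).astype(int)           # h_i in {0, 1}
    # Combine the n sign bits into a single bucket id: sum of 2^i * h_i
    return int(sum((2 ** i) * h for i, h in enumerate(sides)))

planes = np.array([[1.0, 0.0], [0.0, 1.0]])         # two toy planes in 2D
print(hash_value(np.array([1.0, 1.0]), planes))     # positive side of both -> 3
```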

We should choose the number of planes based on how many vectors we want to see in each bucket. It is something like the "precision" of our algorithm, and we can choose it manually. This is the example from the assignment, aiming for 16 vectors per bucket.

  • Each plane divides the space to 2 parts.

  • So 𝑛 planes divide the space into 2^𝑛 hash buckets.

  • We want to organize 10,000 document vectors into buckets so that every bucket has about 16 vectors.

  • For that we need 10000/16=625 buckets.

  • We’re interested in the number of planes 𝑛 such that 2^𝑛 = 625. Now, we can calculate 𝑛 = log₂ 625 ≈ 9.29, so 𝑛 = 10.
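The arithmetic in the steps above can be checked in a couple of lines (the 10,000 and 16 are the numbers from the assignment):

```python
import math

n_vectors = 10_000
vectors_per_bucket = 16

buckets_needed = n_vectors / vectors_per_bucket    # 10000 / 16 = 625 buckets
n_planes = math.ceil(math.log2(buckets_needed))    # log2(625) ≈ 9.29 -> 10 planes
print(buckets_needed, n_planes)                    # 625.0 10
```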

We will repeat this in several "universes", each with a new set of random planes and its own hash table, and combine the candidate neighbours from all universes to get a fuller picture of the potential neighbours.
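A sketch of those "universes", assuming NumPy and made-up sizes (5 universes, 10 planes, 300 dimensions are hypothetical): each universe gets its own random planes and its own hash table mapping bucket id to vector ids.

```python
import numpy as np

def hash_value(v, planes):
    # Bucket id from the sign of the projection onto each plane's normal.
    sides = (planes @ v >= 0).astype(int)
    return int(sum((2 ** i) * h for i, h in enumerate(sides)))

n_universes, n_planes, dim = 5, 10, 300             # hypothetical sizes
rng = np.random.default_rng(0)
planes_list = [rng.normal(size=(n_planes, dim)) for _ in range(n_universes)]

vectors = rng.normal(size=(100, dim))               # stand-in for word vectors
hash_tables = []
for planes in planes_list:
    table = {}                                      # bucket id -> list of vector ids
    for vid, v in enumerate(vectors):
        table.setdefault(hash_value(v, planes), []).append(vid)
    hash_tables.append(table)
```

At query time, the candidate neighbours are the union of the query's bucket across all the hash tables, which is what gives the "fuller picture".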

After all training vectors have been assigned to hash tables, we can take an English word vector, transform it with the matrix R, and find its hash bucket. Only after that do we search inside that bucket for its neighbour French word by cosine similarity.
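Putting that last step together as a sketch (R, the vectors, and the table here are hypothetical stand-ins; the point is hashing the transformed vector first, then computing cosine similarity only inside its bucket):

```python
import numpy as np

def hash_value(v, planes):
    sides = (planes @ v >= 0).astype(int)
    return int(sum((2 ** i) * h for i, h in enumerate(sides)))

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_in_bucket(english_vec, R, planes, table, french_vecs):
    v = R @ english_vec                             # translate into the French space
    candidates = table.get(hash_value(v, planes), [])
    # Cosine similarity only against the candidates in the same bucket
    return max(candidates,
               key=lambda i: cosine_similarity(v, french_vecs[i]),
               default=None)
```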
