Why do we need Multiple Planes?

I didn’t understand this part at all. We just finished simple hashing by the remainder of division. So what are these planes? As I understand it, the planes mark off a region of values that belong to the same category, and the signs of np.dot(P, v.T) tell us whether vector v is part of that region. What is P, by the way? As far as I can see it is a projection, but how do we get it?

Hi @someone555777

We use planes to divide up the space so that searching for nearest neighbors is more efficient: we limit the search to the points that are in the same region of the divided space (we do not compare every point with every other point, hence the efficiency).

I believe P here stands for “plane”, which is defined by its normal vector. This normal vector represents the plane (the normal vector n of the plane \pi is the vector to which all vectors in the plane \pi are perpendicular). From the sign of the dot product (n \cdot v) you know which side of the plane \pi the vector v points to (a vector-by-vector dot product results in a scalar). With some additional steps (explained in the lectures) you construct hashes, which identify the regions we limit our nearest-neighbor search to.
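To make the sign test concrete, here is a minimal sketch. The helper name `side_of_plane` and the example vectors are made up for illustration and are not taken from the lab.

```python
import numpy as np

def side_of_plane(normal, v):
    """Return +1, -1 or 0 depending on the sign of the dot product n . v."""
    return int(np.sign(np.dot(normal, v)))

# Normal vector that defines the plane (in 2D the "plane" is a line).
normal = np.array([1.0, 1.0])

print(side_of_plane(normal, np.array([2.0, 1.0])))    # 1: same side as the normal
print(side_of_plane(normal, np.array([-1.0, -3.0])))  # -1: opposite side
print(side_of_plane(normal, np.array([1.0, -1.0])))   # 0: lies in the plane itself
```

The result depends only on the sign of one scalar, so testing a point against a plane costs a single dot product.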

Cheers

P.S. how many spaces multiple planes can create.

Hmm, OK, thank you for the answer. But do I need to understand this at all? As I understand it, this is only one of the techniques for clustering words with similar meaning, and not one that is used very often?

You will need it later in the course. And the general answer depends on your goals in learning (from this Course and overall).
If, for example, you’re just poking around and plan on using prompt engineering, or you’re a business/marketing person, then you can get away without understanding it. But if you plan to continue learning ML, this is a tool in your toolbox.

Technically it is not to cluster but to limit the search space.

It depends… It’s not the most used technique, but it’s a useful one if you have limited resources (small business/researcher).

Cheers

One more question. If I understand correctly, we find which side of the plane the whole vector is on, not just one of its points. So how is this visualized? As I understand it, a vector can cross the plane in most cases. Should we go by the side on which most of the vector lies?

And one more question: do I understand correctly that hash_multi_plane_matrix(random_planes_matrix, v, num_planes) from the lab “Hash tables”, when it takes 3 planes, can output only one of the digits 1-7? Are those the possible regions in the plot that are formed by the crossing of these 3 lines?

[Image: studentui2]

  • Each dot is a vector (an instance of data, green-positive, red-negative)
  • Each colored line is a plane (random plane dividing data)

After this division you limit the search for the nearest points (vectors) to the region between the colored lines.

Here the dots are vectors, for example. Why were some of these dots left uncategorized, by the way?

And how do I get these separating planes, by the way? Is it something like gradient descent?

They are chosen at random; they are not learned (no gradient descent).
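As a rough sketch of what "random" can mean here: each plane is just a randomly drawn normal vector, and stacking the normals as rows gives a matrix like the P in np.dot(P, v.T). The Gaussian sampling, the seed, and the sizes below are assumptions for illustration, not necessarily what the lab uses.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed only so the sketch is repeatable
num_planes, dim = 3, 2               # illustrative sizes

# Each row is the normal vector of one random plane through the origin.
planes = rng.normal(size=(num_planes, dim))

v = np.array([1.0, 2.0])
sides = np.dot(planes, v) >= 0       # one boolean per plane: which side v is on

print(planes.shape)  # (3, 2)
print(sides.shape)   # (3,)
```

No training step is involved: the planes are drawn once and then reused for every vector that gets hashed.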

Hmm, how so? Why do we need them at all if they are random? Do they help much with clustering?

And what about my question about the screenshot from the course?

Did you see my previous answer?

Can you see how the search is reduced to the points/vectors between the colored lines? We might still have to compare 1000 points/vectors with 1000 others (1000 x 1000 = 1 000 000 comparisons), but that is still far fewer than comparing 6000 with 6000 (6000 x 6000 = 36 000 000). Note: these numbers are made up; we would need to divide into more than 6 regions.
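The same counting argument can be tried on toy data. This is only a sketch with made-up random vectors and 3 random planes; the exact bucket sizes depend on the random draw, so only the comparison between the two totals matters.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(seed=1)
vectors = rng.normal(size=(6000, 2))   # made-up data points
planes = rng.normal(size=(3, 2))       # 3 random planes (rows are normal vectors)

# One bit per plane and vector, then combine the bits into one integer code.
bits = (vectors @ planes.T >= 0).astype(int)   # shape (6000, 3)
codes = bits @ (2 ** np.arange(3))

# Group vector indices by their code.
buckets = defaultdict(list)
for idx, code in enumerate(codes):
    buckets[int(code)].append(idx)

all_pairs = len(vectors) ** 2
within_buckets = sum(len(members) ** 2 for members in buckets.values())
print(all_pairs, within_buckets)
```

Comparing only within buckets always gives a smaller total as soon as the points are spread over more than one bucket.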

The illustration from the course is not very representative, so I did not comment on it. The main point there is what is stated in words: the method is an approximation. It does not find the nearest neighbor every time, but it works in practice, especially when creating several different “universes” (later in the course: randomly dividing the space with some number of planes not once but a couple of times, which statistically raises the chance of finding the true nearest neighbors).
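A minimal sketch of the "universes" idea, under my own assumptions: `hash_codes` and `approximate_nearest` are hypothetical helpers, the data is random, and the codes are recomputed on every call for brevity (a real implementation would build the hash tables once).

```python
import numpy as np

def hash_codes(planes, vectors):
    """Integer code per row of `vectors`: one bit per plane."""
    bits = (vectors @ planes.T >= 0).astype(int)
    return bits @ (2 ** np.arange(planes.shape[0]))

def approximate_nearest(query, vectors, plane_sets):
    """Index of the most cosine-similar vector among the bucket candidates."""
    candidates = set()
    for planes in plane_sets:                      # one set of planes per "universe"
        query_code = hash_codes(planes, query[np.newaxis, :])[0]
        codes = hash_codes(planes, vectors)
        candidates.update(np.flatnonzero(codes == query_code).tolist())
    if not candidates:
        return None
    candidates = sorted(candidates)
    subset = vectors[candidates]
    sims = subset @ query / (np.linalg.norm(subset, axis=1) * np.linalg.norm(query))
    return int(candidates[int(np.argmax(sims))])

rng = np.random.default_rng(seed=2)
vectors = rng.normal(size=(500, 10))                        # made-up data
plane_sets = [rng.normal(size=(4, 10)) for _ in range(5)]   # 5 universes, 4 planes each
query = vectors[42] + 0.01 * rng.normal(size=10)            # a point close to vector 42
print(approximate_nearest(query, vectors, plane_sets))
```

A neighbor that is cut off by an unlucky plane in one universe can still share a bucket with the query in another, which is why the union of candidates works better than a single division.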

Yes, I see how the planes separate the dots (which are vectors), and I understand about 90% of my screenshot, but I still can’t understand why we need these multiple planes. As I understand it, this part of the lab should explain this to me, but I still don’t fully understand it.

OK, with a hash id I can get elements more quickly. But why? And relative to what? To computing one of the synonyms of an English word with the k-Nearest Neighbors algorithm, or what? How can a simple, crude random separation of the data by planes help in this process?

Ok, take a look at this picture:

[Image: studentui2]

Red, grey and blue lines are totally random, but they divide the space into smaller regions. This makes your search constrained only between these (random) lines - that saves you a lot of comparing (to points that are outside this region).

Each region here could be labeled b0, b1, b2, b3, b4 and b5 (in this picture there are 6 regions; the coordinate axes, drawn in thinner black, do not divide the space).
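A small sketch of how the region label can be computed, and why 3 lines give 6 regions here. `hash_value` is a hypothetical helper (not the lab's function), and the three lines are hand-picked rather than random so that the result is easy to check by hand.

```python
import numpy as np

def hash_value(planes, v):
    """Combine the side of v relative to each plane into one integer."""
    bits = (np.dot(planes, v) >= 0).astype(int)   # 1 = non-negative side
    return int(sum(bit * 2**i for i, bit in enumerate(bits)))

# Three hand-picked lines through the origin in 2D (rows are normal vectors).
planes = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Sample points all around a circle and collect the distinct codes they get.
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False) + 0.001
points = np.column_stack([np.cos(angles), np.sin(angles)])
codes = {hash_value(planes, p) for p in points}

print(sorted(codes))  # [0, 1, 2, 5, 6, 7]
```

Three planes give 3 bits, so the code is an integer from 0 to 7, but in 2D three lines through the origin only create 6 regions, so two of the eight codes never occur (here 3 and 4, because a point cannot be on the positive side of both axes and on the negative side of their sum, or the reverse).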

Does that make it clearer how a random division of space can reduce your search space?

Hmm, OK. So the hash value is encoded information about which zone a vector is in? And what is it used for? For example, say the vectors are meanings of English words. Are the French zones separated the same way as the English ones? And can we quickly compute the French word by the same principles if we only know the English word vector?

It helps you to not search everywhere.

It is not used in this way.

After the “translation” (transformation) of an English word, you get a lot of French words that could be a good translation. So, to limit your search, you divide the space randomly and search only in the region where the “translation” “landed”.
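Roughly, the lookup could be sketched like this. Everything here is a toy stand-in: the French vectors are random numbers, the labels `fr_0`, `fr_1`, ... are placeholders, and the transformation matrix `R` is just the identity instead of a learned matrix.

```python
import numpy as np
from collections import defaultdict

def hash_code(planes, v):
    """Integer code of v: one bit per plane."""
    bits = (planes @ v >= 0).astype(int)
    return int(bits @ (2 ** np.arange(len(bits))))

rng = np.random.default_rng(seed=3)
french_words = [f"fr_{i}" for i in range(200)]   # placeholder labels
french_vecs = rng.normal(size=(200, 5))          # made-up embeddings
planes = rng.normal(size=(3, 5))                 # random planes for the French space

# Build the hash table once: code -> indices of the French vectors in that region.
table = defaultdict(list)
for i, vec in enumerate(french_vecs):
    table[hash_code(planes, vec)].append(i)

R = np.eye(5)                          # stand-in for the learned transformation
english_vec = french_vecs[7].copy()    # toy input chosen so the answer is known
landed = english_vec @ R               # where the "translation" lands

# Search only the region where the transformed vector landed.
bucket = table[hash_code(planes, landed)]
sims = [french_vecs[i] @ landed / (np.linalg.norm(french_vecs[i]) * np.linalg.norm(landed))
        for i in bucket]
best = bucket[int(np.argmax(sims))]
print(french_words[best], "found among", len(bucket), "of", len(french_vecs), "candidates")
```

Only the vectors in one bucket are compared with the transformed vector, which is the whole saving.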

So, are the random lines the same for the English and French data?

It doesn’t matter, they are random anyway.
You can have one set of lines for English and one set of lines for French, or you can use the same set of lines for both of them. The main idea is that it limits what you are comparing in that space (English or French).

So, is the whole idea to get elements by their ids, which contain the hash? And can those hashes be mapped back to one of the regions that the planes form?

So maybe it is not even specifically connected with transforming English word matrices into French ones?

I’m sorry, your English is just too crude for me. No offence, but I don’t understand the questions.
Can you rephrase them?