For LSH in the Reformer model (C4_W4 Reformer Efficient Attention: Ungraded Lab), could anyone explain the mechanism for locality-sensitive hashing using pseudo-random rotations? Note: there are some angular LSH animations that attempt an explanation, but I am still confused.
Also, why does the author mention that it "can store in space O(d) and evaluate in time O(d log d) using the Fast Hadamard Transform"?
As for LSH attention, why does it still need to check the previous chunk, given that all the information has been rearranged into entirely different buckets that are not in ascending order? And it seems to me that this LSH attention still needs to be auto-regressive?
An answer to your first two questions requires an understanding of the Fast Hadamard Transform, which is somewhat challenging and may be the result of a mathematical trial-and-error process rather than a purely logical deduction.
The best explanation I have been able to find so far can be found here. The original values of a matrix are transformed by means of a generated Hadamard matrix, which is symmetric. Next, an FHT algorithm is applied as described in the post, which limits the number of calculations required to N log₂ N instead of N². This appears to be another transformation that may well have been arrived at through mathematical trial and error (even if it can be explained after the fact one way or another, I am not aware of such an explanation).
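To make the N log₂ N claim concrete, here is a minimal sketch of the recursive FHT butterfly (my own illustration, not code from the lab), checked against an explicitly built Hadamard matrix:

```python
import numpy as np

def fht(x):
    """Fast Hadamard Transform of a vector of length n = 2^k.

    Uses the recursion H_{2m} = [[H_m, H_m], [H_m, -H_m]], so each
    level costs n additions/subtractions and there are log2(n) levels,
    giving n*log2(n) operations instead of the n^2 of a full multiply.
    """
    n = len(x)
    if n == 1:
        return x.copy()
    a = fht(x[: n // 2])
    b = fht(x[n // 2:])
    # Butterfly step: combine the two half-transforms
    return np.concatenate([a + b, a - b])

def hadamard(n):
    """Explicit (Sylvester) Hadamard matrix, for comparison only."""
    if n == 1:
        return np.array([[1.0]])
    h = hadamard(n // 2)
    return np.block([[h, h], [h, -h]])

x = np.arange(1.0, 9.0)  # length 8 = 2^3
assert np.allclose(fht(x), hadamard(8) @ x)
```

Note that the matrix is never materialized: storing only the input vector gives the O(d) space, and the butterfly recursion gives the O(d log d) time mentioned in the lab.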
As to your third question, LSH attention needs to check other chunks only if they contain vectors with the same hash values. This is shown in figure 12, where other chunks are attended to only if they share the same color code, which represents the hash value. This is done to keep the chunks at a uniform size. LSH attention only needs to be auto-regressive when it is used for causal attention; the discussion in the lab is about self-attention.
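On the hashing mechanism itself, a minimal sketch of angular LSH via random projections may help (my own simplification of the scheme described in the Reformer paper, not the lab's code): project each vector onto a few random directions and take the argmax over the concatenation [xR, -xR].

```python
import numpy as np

def lsh_hash(vecs, n_buckets, rng):
    """Simplified angular LSH: hash by 'pseudo-random rotation'.

    Each vector is projected onto n_buckets/2 random directions; the
    bucket is the index of the largest entry of [xR, -xR]. Vectors
    pointing in nearby directions on the sphere get the same bucket
    with high probability.
    """
    d = vecs.shape[-1]
    R = rng.standard_normal((d, n_buckets // 2))  # random projection
    xR = vecs @ R
    return np.argmax(np.concatenate([xR, -xR], axis=-1), axis=-1)

rng = np.random.default_rng(0)
q = rng.standard_normal(4)
parallel = 2.0 * q  # same direction, different norm
hashes = lsh_hash(np.stack([q, parallel]), n_buckets=8, rng=rng)
assert hashes[0] == hashes[1]  # parallel vectors share a bucket
```

Because the hash depends only on direction, rescaling a vector never changes its bucket, which is why the lab normalizes queries and keys before hashing.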