A simple implementation of LSH Attention in Reformer

Description

Computes the softmax layer of attention in $O(L \log L)$ time, where $L$ is the sequence length, instead of $O(L^2)$, using cross-polytope Locality-Sensitive Hashing. For more details, see this paper.
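As a rough sketch of the underlying idea (not necessarily this repository's exact implementation), cross-polytope LSH assigns each vector to a bucket by applying a random projection and taking the argmax over the signed coordinates; the function name, the square projection matrix, and the single hashing round below are illustrative assumptions.

```python
import numpy as np

def cross_polytope_bucket(x, seed=0):
    """Illustrative cross-polytope LSH: map each row of x (shape (n, d)) to a bucket id."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    R = rng.standard_normal((d, d))                   # random projection (assumption: square)
    rotated = x @ R                                   # (n, d)
    signed = np.concatenate([rotated, -rotated], -1)  # (n, 2d) signed coordinates
    # Rows pointing in similar directions tend to share the same argmax, i.e. the same bucket.
    return np.argmax(signed, axis=-1)                 # bucket id in [0, 2d)
```

In LSH attention, each query only attends to keys that land in the same bucket, which is what brings the cost down from $O(L^2)$ toward $O(L \log L)$.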

Usage

The only dependency is numpy >= 1.18.

For example:

```python
import numpy as np

from functions import normal_softmax, lsh_softmax

# Random input matrix of shape (100, 10000)
R = np.random.randn(100, 10000)
normal_sm = normal_softmax(R)  # exact softmax
lsh_sm = lsh_softmax(R)        # LSH-based approximation
```
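Continuing the example above, one simple sanity check is to compare the two outputs directly (assuming both functions return arrays of the same shape):

```python
# Assumes normal_sm and lsh_sm from the example above have identical shapes.
max_err = np.abs(normal_sm - lsh_sm).max()
print(f"max absolute difference: {max_err:.3e}")
```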

Test

Small size

Note: for better visibility, the diagonal components are set to 0.

(figure: test)

Time complexity analysis

The execution times are plotted for sequence lengths $2^i$ ($i = 4, 5, \dots, 15$).

(figure: time_analysis_log)
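A timing loop along these lines could be used to generate comparable measurements; this is only a sketch, and the (100, L) input layout and the use of time.perf_counter are assumptions rather than the repository's actual benchmark code.

```python
import time

import numpy as np

from functions import normal_softmax, lsh_softmax

for i in range(4, 16):
    L = 2 ** i
    R = np.random.randn(100, L)  # same layout as the Usage example (assumption)
    t0 = time.perf_counter()
    normal_softmax(R)
    t1 = time.perf_counter()
    lsh_softmax(R)
    t2 = time.perf_counter()
    print(f"L = 2^{i:2d}: normal {t1 - t0:8.4f} s, lsh {t2 - t1:8.4f} s")
```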