Skip to content
#

attention-mechanism

Here are 1,519 public repositories matching this topic...

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

  • Updated Jul 4, 2024
  • Python
swarms

This is an unofficial Pytorch implementation of the Infini Attention mechanism introduced in the paper : "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention". Note that the official code for the paper has not been released yet. In case of issues, add a PR (add an explanation of the changes made and why so?)

  • Updated Jul 3, 2024
  • Python

Improve this page

Add a description, image, and links to the attention-mechanism topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the attention-mechanism topic, visit your repo's landing page and select "manage topics."

Learn more