Implementation of Block Recurrent Transformer - Pytorch
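To make the idea behind this repo concrete, below is a minimal, self-contained PyTorch sketch of block-recurrent attention as described in the Block-Recurrent Transformers paper (Hutchins et al., 2022): attention runs over fixed-width blocks while a small set of state vectors is carried recurrently across blocks. This is an illustrative assumption of the mechanism, not the repo's actual API; names like BlockRecurrentLayer, block_width, and num_state_vectors are made up here, and the causal mask and LSTM-style gating are omitted for brevity.

```python
# Illustrative sketch of the block-recurrent idea (not this repo's API):
# process a long sequence in fixed-width blocks, carrying a small set of
# state vectors across blocks so information flows beyond one window.

import torch
import torch.nn as nn

class BlockRecurrentLayer(nn.Module):
    def __init__(self, dim=512, heads=8, num_state_vectors=32, block_width=128):
        super().__init__()
        self.block_width = block_width
        # self-attention within each block (causal mask omitted for brevity)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # tokens read from the recurrent state via cross-attention
        self.read_state = nn.MultiheadAttention(dim, heads, batch_first=True)
        # the state is updated by attending over the block's tokens
        self.write_state = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.init_state = nn.Parameter(torch.randn(num_state_vectors, dim))
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        b, n, _ = x.shape
        state = self.init_state.unsqueeze(0).expand(b, -1, -1)
        outputs = []
        # march over the sequence one block at a time, threading the state
        for start in range(0, n, self.block_width):
            blk = x[:, start:start + self.block_width]
            blk = blk + self.self_attn(blk, blk, blk, need_weights=False)[0]
            blk = blk + self.read_state(blk, state, state, need_weights=False)[0]
            # simplified residual state update (the paper uses a learned gate)
            update = self.write_state(state, blk, blk, need_weights=False)[0]
            state = self.norm(state + update)
            outputs.append(blk)
        return torch.cat(outputs, dim=1), state

layer = BlockRecurrentLayer()
tokens = torch.randn(2, 512, 512)            # (batch, seq_len, dim)
out, final_state = layer(tokens)
print(out.shape, final_state.shape)          # (2, 512, 512), (2, 32, 512)
```

The key design point is that per-block attention keeps cost linear in sequence length, while the recurrent state gives each block a compressed summary of everything before it.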
Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
RAN: Recurrent Attention Networks for Long-text Modeling | Findings of ACL 2023
A repository for training transformers to access longer context in causal language models. Most of these methods are still in testing; try them out if you'd like, but please let me know your results so we don't duplicate work :)