[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) #4837
Merged
njhill merged 68 commits into vllm-project:main from neuralmagic:afeldman-nm/infra_enc_dec_block_manager on May 29, 2024
Commits
Commits on May 15, 2024
Commits on May 16, 2024
Commits on May 17, 2024
Commits on May 22, 2024
Commits on May 23, 2024
Commits on May 24, 2024
Commits on May 26, 2024
Commits on May 28, 2024
Commits on May 29, 2024