[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) #4837

Merged
njhill merged 68 commits into vllm-project:main from neuralmagic:afeldman-nm/infra_enc_dec_block_manager on May 29, 2024

Commits

Commits on May 16, 2024

Commits on May 22, 2024

Commits on May 28, 2024

Commits on May 29, 2024