Releases: lightonai/vllm
v0.5.3.post1
v0.5.1
v0.5.0.post1
v0.4-custom.2.dev.5
- Add max_num_seqs to the deployment script
v0.4-custom.2.dev.2
- Allow more punica kernel shapes, including a LoRA B matrix with an output size of 64
- Other minor improvements (ed41649, b8b6b1e, etc.)
v0.4-custom.2.dev.1
v0.4-custom.1
- Falcon/Alfred LoRA support
- Read LoRA parameters from environment variables
- Add SageMaker support for LoRAs
- Add the /loras endpoint to add LoRAs while the server is running
v0.4-canary.3
- Read LoRA parameters from environment variables
v0.4-canary.2
- Add SageMaker support for LoRAs
- Add the /loras endpoint to add LoRAs while the server is running