Releases: opendatahub-io/vllm-tgis-adapter
0.5.3
What's Changed
- 🐛 handle MistralTokenizer special case by @prashantgupta24 in #162
Full Changelog: 0.5.2...0.5.3
0.5.2
What's Changed
- gha: fix caching strategy by @dtrifiro in #154
- build(deps): bump prometheus-client from 0.20.0 to 0.21.0 by @dependabot in #151
- build(deps): bump types-protobuf from 5.27.0.20240907 to 5.28.0.20240924 by @dependabot in #150
- ✨ add pt_to_prompt cli by @prashantgupta24 in #155
- Exploit vLLM options to return deltas/final-output only by @njhill in #137
- build(deps): bump ruff from 0.6.7 to 0.6.9 by @dependabot in #157
- Fix bug in example. by @tdoublep in #159
- pre-commit autoupdate by @github-actions in #142
- fix linter complaints, cleanup gha caching by @dtrifiro in #160
- gha: add ccache by @dtrifiro in #147
- build(deps): bump grpcio from 1.62.2 to 1.66.2 by @dependabot in #153
- Fail startup with root-cause exception by @njhill in #156
- ✨ invoke caikit -> peft conversion at load time by @joerunde in #161
Full Changelog: 0.5.1...0.5.2
0.5.1
0.5.0
What's Changed
- pre-commit autoupdate by @github-actions in #104
- build(deps): bump ruff from 0.6.1 to 0.6.3 by @dependabot in #116
- gha: use vllm v0.6.1 for testing by @dtrifiro in #125
- Add OWNER file by @vaibhavjainwiz in #124
- build(deps): bump mypy from 1.11.1 to 1.11.2 by @dependabot in #107
- build(deps): bump accelerate from 0.33.0 to 0.34.2 by @dependabot in #123
- build(deps): bump types-protobuf from 5.26.0.20240422 to 5.27.0.20240907 by @dependabot in #120
- build(deps): update opentelemetry-api requirement from <1.27.0,>=1.26.0 to >=1.26.0,<1.28.0 by @dependabot in #122
- pre-commit autoupdate by @github-actions in #119
- do not run codecov upload step on dependabot PR by @NickLucche in #129
- Write to /dev/termination-log on main loop exception by @NickLucche in #118
- http_server: compatibility fixes for vllm>0.6.1.post1 by @dtrifiro in #136
- Propagate cancellation of gRPC requests by @njhill in #130
- pre-commit autoupdate by @github-actions in #127
- Fix calls to defunct AsyncEngineClient by @NickLucche in #138
- 🐛 fix nargs + or * by @joerunde in #140
- build(deps): bump pytest from 8.3.2 to 8.3.3 by @dependabot in #131
- build(deps): update opentelemetry-sdk requirement from <1.27.0,>=1.26.0 to >=1.26.0,<1.28.0 by @dependabot in #133
- build(deps): bump ruff from 0.6.3 to 0.6.7 by @dependabot in #139
- grpc_server: use x-correlation-id as request-id when possible by @dtrifiro in #128
- build(deps): bump types-requests from 2.32.0.20240712 to 2.32.0.20240914 by @dependabot in #132
- build(deps): update opentelemetry-exporter-otlp requirement from <1.27.0,>=1.26.0 to >=1.26.0,<1.28.0 by @dependabot in #135
- deps: bump vllm minimum version to 0.6.2 by @dtrifiro in #143
New Contributors
- @github-actions made their first contribution in #104
- @vaibhavjainwiz made their first contribution in #124
- @NickLucche made their first contribution in #129
Full Changelog: 0.4.1...0.4.2
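The entry "grpc_server: use x-correlation-id as request-id when possible" (#128) can be illustrated with a small sketch. This is not the adapter's code: the function name pick_request_id, the fallback to a random UUID, and the metadata shape (an iterable of key/value string pairs, as gRPC invocation metadata is commonly exposed in Python) are assumptions made for illustration only.

```python
import uuid
from typing import Iterable, Optional, Tuple

# Header name taken from the release note; everything else here is hypothetical.
CORRELATION_ID_HEADER = "x-correlation-id"


def pick_request_id(metadata: Optional[Iterable[Tuple[str, str]]]) -> str:
    """Return the caller-supplied correlation id if present, else a fresh random id."""
    for key, value in metadata or ():
        # Compare case-insensitively and skip empty values.
        if key.lower() == CORRELATION_ID_HEADER and value:
            return value
    # Assumed fallback: generate a random 32-character hex id.
    return uuid.uuid4().hex


if __name__ == "__main__":
    print(pick_request_id([("x-correlation-id", "abc-123")]))
    print(pick_request_id([("user-agent", "grpc-python")]))
```

Reusing the caller's id, when one is supplied, lets a single identifier follow a request through client logs and server logs.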
0.4.1
What's Changed
- build(deps): bump ruff from 0.5.5 to 0.6.1 by @dependabot in #100
- build(deps): bump accelerate from 0.32.1 to 0.33.0 by @dependabot in #93
- fix gha pre commit autoupdate by @dtrifiro in #103
- pyproject: Pin OpenTelemetry versions by @ronensc in #106
- fix: handle negative token rank when logprobs disabled by @tjohnson31415 in #105
- ✨ add x-correlation-id in gRPC metadata by @prashantgupta24 in #113
- ♻️ updates for vLLM==0.5.5 by @prashantgupta24 in #112
Full Changelog: 0.4.0...0.4.1
0.4.0
What's Changed
- fix: consistent env var parsing for all boolean args in vLLM by @tjohnson31415 in #98
- 🐛 fix input text issue by @prashantgupta24 in #97
Full Changelog: 0.3.0...0.4.0
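As a rough illustration of what "consistent env var parsing for all boolean args" (#98) can mean in practice, here is a minimal sketch. It is not the adapter's implementation: the helper name env_flag, the accepted spellings, and the choice to raise on unknown values are all assumptions.

```python
import os
from typing import Mapping, Optional

# Assumed spellings; the adapter may accept a different set.
_TRUE = {"1", "true", "t", "yes", "y", "on"}
_FALSE = {"0", "false", "f", "no", "n", "off"}


def env_flag(name: str, default: bool = False,
             environ: Optional[Mapping[str, str]] = None) -> bool:
    """Parse a boolean environment variable the same way for every flag."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default  # unset or empty: fall back to the default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    # Reject unknown spellings instead of silently treating them as true.
    raise ValueError(f"{name}={raw!r} is not a recognised boolean value")


if __name__ == "__main__":
    # EXAMPLE_FLAG is a hypothetical variable name.
    print(env_flag("EXAMPLE_FLAG", environ={"EXAMPLE_FLAG": "False"}))
```

Routing every boolean option through one helper is what keeps values such as "false" or "0" from being read as true by a plain truthiness check on a non-empty string.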
0.3.0
What's new
- Add model-util CLI by @rafvasq in #59: adds commands to the adapter, available through either entry point (model-util or text-generation-server):
  - model-util download-weights
  - model-util convert-to-safetensors
  - model-util convert-to-fast-tokenizer
- add --disable-prompt-logprobs argument by @tjohnson31415 in #95
What's Changed
- Revert "fix merge_async_iterators usage for vLLM>0.5.4" by @dtrifiro in #88
- fix parsing of env vars by @dtrifiro in #94
- gha: add pre-commit autoupdate workflow by @dtrifiro in #89
- fix: correct parsing of command line bool arg values by @tjohnson31415 in #96
New Contributors
- @tjohnson31415 made their first contribution in #95
- @rafvasq made their first contribution in #59
Full Changelog: 0.2.4...0.3.0
0.2.4
Highlights
- Compatibility with vLLM v0.5.4
- Bump minimum vLLM requirement to v0.5.4
What's Changed
- remove dead code (vllm<=0.5.0.post1) by @dtrifiro in #65
- build(deps): bump ruff from 0.5.4 to 0.5.5 by @dependabot in #72
- pyproject: fix broken URLs by @dtrifiro in #76
- gha: add missing build dependencies by @dtrifiro in #68
- noxfile: make overridden vllm version install verbose by @dtrifiro in #78
- updates for vLLM==0.5.4 by @dtrifiro in #82
- fix merge_async_iterators usage for vLLM>0.5.4 by @dtrifiro in #86
- build(deps): bump hf-transfer from 0.1.6 to 0.1.8 by @dependabot in #73
- build(deps): bump flash-attn from 2.6.1 to 2.6.3 by @dependabot in #71
- pre-commit: bump deps by @dtrifiro in #67
- Extract request's trace context in GenerateStream() by @ronensc in #64
- Make ADD_SPECIAL_TOKENS true by default by @maxdebayser in #66
- build(deps): bump pytest from 8.2.2 to 8.3.2 by @dependabot in #70
- Fix stop_reason for secondary eos tokens by @njhill in #75
- ♻️ refactor LoRARequest based on upstream by @prashantgupta24 in #74
- build(deps): bump mypy from 1.10.1 to 1.11.1 by @dependabot in #77
- Fix length penalty logit processor getting NaN by @wallashss in #85
- Re-enable seed for pipeline parallel by @njhill in #69
- 🥅 Kill servers on engine death by @joerunde in #63
New Contributors
- @ronensc made their first contribution in #64
- @maxdebayser made their first contribution in #66
- @wallashss made their first contribution in #85
Full Changelog: 0.2.3...0.2.4
0.2.3
What's Changed
- build(deps): bump accelerate from 0.31.0 to 0.32.1 by @dependabot in #42
- build(deps): bump flash-attn from 2.5.9.post1 to 2.6.1 by @dependabot in #45
- Fix earlier LoRA tokenizer changes by @njhill in #53
- build(deps): bump ruff from 0.5.1 to 0.5.2 by @dependabot in #43
- build(deps): bump types-requests from 2.32.0.20240602 to 2.32.0.20240712 by @dependabot in #44
- Fix tokenize endpoint by @njhill in #54
- build(deps): bump ruff from 0.5.2 to 0.5.4 by @dependabot in #58
- vLLM 5.3+ support by @joerunde in #60
Full Changelog: 0.2.2...0.2.3