v0.12.2

@XprobeBot XprobeBot released this 21 Jun 09:14
5cef7c3

What's new in 0.12.2 (2024-06-21)

These are the changes in inference v0.12.2.

New features

  • FEAT: Add Tools Support for Qwen Series MOE Models by @zhanghx0905 in #1642
  • FEAT: [UI] Modify the deletion function of a custom model by @yiboyasss in #1656
  • FEAT: [UI] Custom models display their JSON data and allow editing it by @yiboyasss in #1670
  • FEAT: Add input/output token usage for rerank models by @wxiwnd in #1657

Enhancements

  • ENH: Continuous batching supports all models with the transformers backend by @ChengjieLi28 in #1659

Bug fixes

  • BUG: Show an error when a user launches a quantized model without a supported device by @Minamiyama in #1645
  • BUG: Fix default rerank type by @codingl2k1 in #1649
  • BUG: chat_completion does not respond when errors occur more than 100 times by @liuzhenghua in #1663

Full Changelog: v0.12.1...v0.12.2