15 Nov 10:15

XprobeBot

4c96475

v1.0.0 Latest

What's new in 1.0.0 (2024-11-15)

These are the changes in inference v1.0.0.

New features

FEAT: Basic cancel support for image model by @codingl2k1 in #2528
FEAT: Add qwen2.5-coder 0.5B 1.5B 3B 14B 32B by @frostyplanet in #2543
FEAT: support kvcache in multi-round chat for MLX by @qinxuye in #2534

Enhancements

ENH: add normalize to rerank model by @hustyichi in #2509
ENH: Update fish audio by @codingl2k1 in #2555

Bug fixes

BUG: fix variant error for image model by @qinxuye in #2547

Documentation

DOC: Add paper citation by @luweizheng in #2533

Full Changelog: v0.16.3...v1.0.0

Contributors

frostyplanet, qinxuye, and 3 other contributors

08 Nov 05:47

XprobeBot

v0.16.3

85ab86b

What's new in 0.16.3 (2024-11-08)

These are the changes in inference v0.16.3.

New features

FEAT: Add support for Llama 3.2-Vision models by @vikrantrathore in #2376

Enhancements

ENH: Display model name in process by @frostyplanet in #1891
REF: Remove replica total count in internal replica_model_uid by @ChengjieLi28 in #2516

Bug fixes

BUG: Compat with ChatTTS 0.2.1 by @codingl2k1 in #2520
BUG: transformers logs missing by @ChengjieLi28 in #2530

Full Changelog: v0.16.2...v0.16.3

Contributors

frostyplanet, vikrantrathore, and 2 other contributors

01 Nov 10:09

XprobeBot

v0.16.2

67e97ab

What's new in 0.16.2 (2024-11-01)

These are the changes in inference v0.16.2.

New features

FEAT: add download from openmind_hub by @cookieyyds in #2504

Enhancements

BLD: Remove Python 3.8 & Support Python 3.12 by @ChengjieLi28 in #2503

Bug fixes

BUG: fix bge-reranker-v2-minicpm-layerwise rerank issue by @hustyichi in #2495

Documentation

DOC: modify NPU doc by @qinxuye in #2485
DOC: Add doc for ocr by @codingl2k1 in #2492

New Contributors

@hustyichi made their first contribution in #2495
@cookieyyds made their first contribution in #2504

Full Changelog: v0.16.1...v0.16.2

Contributors

qinxuye, hustyichi, and 3 other contributors

25 Oct 07:33

XprobeBot

v0.16.1

d4cd7b1

What's new in 0.16.1 (2024-10-25)

These are the changes in inference v0.16.1.

New features

FEAT: Add support for Qwen/Qwen2.5-Coder-7B-Instruct gptq format by @frostyplanet in #2408
FEAT: Support GOT-OCR2_0 by @codingl2k1 in #2458
FEAT: [UI] Image model with the lora_config. by @yiboyasss in #2482
FEAT: added MLX support for Flux.1 by @qinxuye in #2459

Enhancements

ENH: Support ChatTTS 0.2 by @codingl2k1 in #2449
ENH: Pending queue for concurrent requests by @codingl2k1 in #2473

Bug fixes

BUG: Remove duplicated call of model_install by @frostyplanet in #2457
BUG: fix embedding model gte-Qwen2 dimensions by @JinCheng666 in #2479

Documentation

DOC: update enterprise doc links by @qinxuye in #2461

New Contributors

@JinCheng666 made their first contribution in #2479

Full Changelog: v0.16.0...v0.16.1

Contributors

frostyplanet, qinxuye, and 3 other contributors

18 Oct 11:40

XprobeBot

v0.16.0

5f7dea4

What's new in 0.16.0 (2024-10-18)

These are the changes in inference v0.16.0.

New features

FEAT: Adding support for awq/gptq vLLM inference to VisionModel such as Qwen2-VL by @cyhasuka in #2445
FEAT: Dynamic batching for the state-of-the-art FLUX.1 text_to_image interface by @ChengjieLi28 in #2380
FEAT: added MLX for qwen2.5-instruct by @qinxuye in #2444

Enhancements

ENH: Speed up cli interaction by @frostyplanet in #2443
REF: Enable continuous batching for LLM with transformers engine by default by @ChengjieLi28 in #2437

Documentation

DOC: update readme & docs by @qinxuye in #2435

New Contributors

@cyhasuka made their first contribution in #2445

Full Changelog: v0.15.4...v0.16.0

Contributors

frostyplanet, qinxuye, and 2 other contributors

12 Oct 10:38

XprobeBot

v0.15.4

c0be115

What's new in 0.15.4 (2024-10-12)

These are the changes in inference v0.15.4.

New features

FEAT: Llama 3.1 Instruct support tool call by @codingl2k1 in #2388
FEAT: qwen2.5 instruct tool call by @codingl2k1 in #2393
FEAT: add whisper-large-v3-turbo audio model by @hwzhuhao in #2409
FEAT: Add environment variable setting to increase the retry attempts after model download failures by @hwzhuhao in #2411
FEAT: support getting progress for image model by @qinxuye in #2395
FEAT: support qwenvl2 vllm engine by @amumu96 in #2428

Enhancements

ENH: Launch the ChatTTS model with kwargs by @codingl2k1 in #2425
REF: refactor controlnet for image model by @qinxuye in #2346

Bug fixes

BUG: Pin ChatTTS<0.2 by @codingl2k1 in #2419
BUG: tool call streaming output has duplicated list by @ChengjieLi28 in #2416

Full Changelog: v0.15.3...v0.15.4

Contributors

qinxuye, hwzhuhao, and 3 other contributors

30 Sep 13:42

XprobeBot

v0.15.3

00a9ee1

What's new in 0.15.3 (2024-09-30)

These are the changes in inference v0.15.3.

New features

FEAT: Support jina-embedding-v3 by @amumu96 in #2379
FEAT: Support deepcache with sd models by @frostyplanet in #2313
FEAT: support minicpm-reranker model by @hwzhuhao in #2383
FEAT: add vllm restart check and support internvl multi-image chat by @amumu96 in #2384

Bug fixes

BUG: [UI] Fix 'Model Format' bug on model registration page. by @yiboyasss in #2353
BUG: Fix default value of max_model_len for vLLM backend. by @zjuyzj in #2385

New Contributors

@zjuyzj made their first contribution in #2385
@hwzhuhao made their first contribution in #2383

Full Changelog: v0.15.2...v0.15.3

Contributors

frostyplanet, zjuyzj, and 3 other contributors

20 Sep 09:05

XprobeBot

v0.15.2

5de46e9

What's new in 0.15.2 (2024-09-20)

These are the changes in inference v0.15.2.

New features

FEAT: Support Qwen 2.5 by @Jun-Howie in #2325
FEAT: support qwen2.5-coder-instruct and qwen2.5 sglang by @amumu96 in #2332

Bug fixes

BUG: [UI] Fix registration page bug. by @yiboyasss in #2315
BUG: Fix CosyVoice missing output by @codingl2k1 in #2320
BUG: support old register llm format by @amumu96 in #2335
BUG: fix stable diffusion from dify tool by @qinxuye in #2336

Documentation

DOC: update models for doc and readme by @qinxuye in #2330

Full Changelog: v0.15.1...v0.15.2

Contributors

qinxuye, Jun-Howie, and 3 other contributors

14 Sep 07:38

XprobeBot

v0.15.1

4c5e752

What's new in 0.15.1 (2024-09-14)

These are the changes in inference v0.15.1.

New features

FEAT: Support qwen2-vl-instruct GPTQ format and AWQ format by @Jun-Howie in #2251
FEAT: Support minicpm-4B by @Jun-Howie in #2263
FEAT: support sdapi/txt2img by @qinxuye in #2248
FEAT: [UI] Auto-fill chat_template parameter on registration page. by @yiboyasss in #2268
FEAT: support sdapi/sd-models and sdapi/samplers by @qinxuye in #2288
FEAT: support deepseek-v2 and 2.5 by @amumu96 in #2292
FEAT: Update Qwen2-VL-Model to support flash_attention_2 implementation by @LaureatePoet in #2289
FEAT: support sdapi/img2img by @qinxuye in #2293
FEAT: support flux.1 image2image and inpainting by @qinxuye in #2296
FEAT: Support yi-coder-chat by @Jun-Howie in #2302
FEAT: qwen2 audio by @codingl2k1 in #2271

Enhancements

ENH: Update CosyVoice Huggingface by @codingl2k1 in #2249
ENH: Supports multi functions in tool call for qwen2 by @ChengjieLi28 in #2265
ENH: add print-error option in benchmark by @Dawnfz-Lenfeng in #2283
ENH: Support fish speech 1.4 by @codingl2k1 in #2295

Bug fixes

BUG: tts stream mode not working by @leslie2046 in #2279
BUG: fix issue with model launch failing when .safetensors file is missing (#2094) by @Charmnut in #2290
BUG: fix sampler_name for img2img by @qinxuye in #2301
BUG: modify vllm image version by @amumu96 in #2311
BUG: modify vllm image version by @amumu96 in #2312

Documentation

DOC: update readme & builtin models by @qinxuye in #2285

New Contributors

@Jun-Howie made their first contribution in #2251
@leslie2046 made their first contribution in #2279
@Charmnut made their first contribution in #2290
@LaureatePoet made their first contribution in #2289

Full Changelog: v0.15.0...v0.15.1

Contributors

qinxuye, leslie2046, and 8 other contributors

06 Sep 08:45

XprobeBot

v0.15.0

e2618be

What's new in 0.15.0 (2024-09-06)

These are the changes in inference v0.15.0.

New features

FEAT: cosyvoice model support streaming reply by @wuminghui-coder in #2192
FEAT: support qwen2-vl-instruct by @Minamiyama in #2205

Enhancements

ENH: include openai-whisper into thirdparty by @qinxuye in #2232
ENH: MiniCPM-V-2.6 Supports continuous batching with transformers engine by @ChengjieLi28 in #2238
ENH: unpad for image2image/inpainting model by @wxiwnd in #2229
ENH: Refine request log and add optional request_id by @frostyplanet in #2173
REF: Use chat_template for LLM instead of prompt_style by @ChengjieLi28 in #2193

Bug fixes

BUG: Fix docker image startup issue due to entrypoint by @ChengjieLi28 in #2207
BUG: fix xinference init failure when custom path is faulty by @amumu96 in #2208
BUG: use default_uid to replace uid of actors which may override the xoscar actor's uid property by @qinxuye in #2214
BUG: fix rerank max length by @qinxuye in #2219
BUG: logger bug of function using generator decoration by @wxiwnd in #2215
BUG: fix rerank calculation of tokens number by @qinxuye in #2228
BUG: fix embedding token calculation & optimize memory by @qinxuye in #2221

Documentation

DOC: Modify the installation documentation to change single quotes to double quotes for Windows compatibility. by @nikelius in #2211

Others

Revert "EHN: clean cache for VL models (#2163)" by @qinxuye in #2230
CHORE: Docker image is only pushed to aliyun when releasing version by @ChengjieLi28 in #2216
CHORE: Compatible with openai >= 1.40 by @ChengjieLi28 in #2231

New Contributors

@nikelius made their first contribution in #2211
@wuminghui-coder made their first contribution in #2192

Full Changelog: v0.14.4...v0.15.0

Contributors

frostyplanet, qinxuye, and 6 other contributors

Releases: xorbitsai/inference
