Version 0.42.0
- new model support: CohereForAI/aya-vision family of models
- new model support: AIDC-AI/Ovis2 family of models
- new model support: Qwen/Qwen2.5-VL family of models (see the usage sketch after this list)
- new model support: Qwen/QVQ-72B-Preview
- new model support: HuggingFaceM4/Idefics3-8B-Llama3
- compatibility: improved backend auto-detection for more flexible support of models by type
- bump torch to 2.5
- restrict requests to one at a time (no batching yet)
- REGRESSION: memory usage can spike unpredictably with some models (Qwen2/Qwen2.5); this appears to be a new Qwen-specific bug
- REGRESSION: GPTQ-Int4/8 quantized models are probably broken again
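
The newly supported models are queried through the same OpenAI-compatible chat completions API as before. A minimal sketch, assuming the server is listening on localhost:5006 and was launched with Qwen/Qwen2.5-VL-7B-Instruct (both are assumptions; adjust the base URL and model id to your deployment). Note that requests are handled one at a time, so clients should send them sequentially.

```python
# Minimal sketch: send an image question to the OpenAI-compatible endpoint.
# base_url, port, and model id are assumptions -- match them to your server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5006/v1", api_key="skip")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",  # assumed: whichever model the server was started with
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sample.jpg"},
                },
            ],
        }
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```
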
⚠️ DEPRECATED MODELS (use the 0.41.0 docker image for continued support of these models): TIGER-Lab/Mantis, Ovis1.6-Gemma2-9B, Ovis1.6-Gemma2-27B, Ovis1.5-Gemma2-9B, allenai/Molmo, BAAI/Bunny, BAAI/Emu3-Chat, echo840/Monkey-Chat, failspy/Phi-3-vision-128k-instruct-abliterated-alpha, google/paligemma2, microsoft/Florence-2-large-ft, microsoft/Phi-3-vision, microsoft/Phi-3.5-vision, qnguyen3/nanoLLaVA, rhymes-ai/Aria