Releases: mudler/LocalAI

v2.26.0

15 Feb 17:22


🦙 LocalAI v2.26.0!

Hey everyone - very excited about this release!

It contains several cleanups, performance improvements, and a few breaking changes: old backends that are now superseded have been removed (for example, vall-e-x), while new backends have been added to expand the range of model architectures LocalAI can support. While most of the changes are tested, if you encounter issues with the new or migrated backends, please file a new issue.

We also now have support for Nvidia L4T devices (for example, Nvidia AGX Orin) with specific container images. See the documentation for more details.

⚠️ Breaking Changes ⚠️

  • Several backends have been dropped and replaced for improved performance and compatibility.
  • Vall-e-x and Openvoice were deprecated and dropped.
  • The stablediffusion-NCN backend was replaced with the stablediffusion-ggml implementation.
  • Deprecated llama-ggml backend has been dropped in favor of GGUF support.
See the full details below.

Backends that were dropped:

  • Vall-e-x and Openvoice: These projects went silent, and better alternatives now exist. They have been completely superseded by the CoquiTTS community fork, Kokoro, and OuteTTS.
  • Stablediffusion-NCN: This was the first variant shipped with LocalAI based on the ONNX runtime. It has now been superseded by the stablediffusion-ggml backend, which offers similar capabilities and wider support across more architectures.
  • Llama-ggml backend: This was the pre-GGUF backend, which is now deprecated. Moving forward, LocalAI will support only GGUF models.

Notable Backend Changes:

  • Mamba has moved to the transformers backend.
  • Transformers-Musicgen has moved to the transformers backend.
  • Sentencetransformers has moved to the transformers backend.

While LocalAI will try to automatically alias these backends to the transformers backend, there might be incompatibilities with your configuration files. Please open an issue if you face any problems!

New Backends:

  • Kokoro (TTS): A new backend for text-to-speech.
  • OuteTTS: A TTS backend with voice cloning capabilities.
  • Fast-Whisper: A backend designed for faster whisper model inference.
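As a sketch of how one of the new TTS backends might be invoked once installed (the model name below is a placeholder; use the name of the Kokoro or OuteTTS model you actually installed, and check the TTS documentation for your setup):

```shell
# Illustrative payload for LocalAI's TTS endpoint; "kokoro" is a placeholder
# model name, not necessarily the exact gallery identifier.
TTS_PAYLOAD='{"model": "kokoro", "input": "Hello from LocalAI"}'

# With a server running on the default port, this would write a wav file:
# curl http://localhost:8080/tts -H "Content-Type: application/json" \
#   -d "$TTS_PAYLOAD" -o out.wav
echo "$TTS_PAYLOAD"
```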

New Features 🎉

  • Lazy grammars (llama.cpp): Added grammar triggers for llama.cpp. Models trained with specific tokens can now enable grammar generation only when those tokens appear in the output; this allows precise JSON generation, but also consistent free-form output when the model does not need to answer with a tool. For example, triggers can be specified in the model's config file as follows:
  function:
    grammar:
      triggers:
        word: "<tool_call>"
        at_start: true
  • Function Argument Parsing Using Named Regex: A new feature that allows parsing function arguments with named regular expressions, simplifying function calls.
  • Support for New Backends: Added Kokoro, OuteTTS, and Fast-Whisper backends.
  • Diffusers Update: Added support for Sana pipelines and image generation option overrides.
  • Machine Tag and Inference Timing: Allows tracking machine performance during inference.
  • Tokenization: Introduced tokenization support for llama.cpp to improve text processing.
  • AVX512: There is now bundled support for CPUs that support the AVX512 instruction set.
  • Nvidia L4T: Support for Nvidia devices on arm64, for example the Nvidia AGX Orin and similar boards. See the documentation. TL;DR: you can start a ready-to-go container image with:
docker run -e DEBUG=true \
    -p 8080:8080 \
    -v $PWD/models:/build/models \
    -ti --restart=always --name local-ai \
    --runtime nvidia --gpus all quay.io/go-skynet/local-ai:master-nvidia-l4t-arm64-core
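To check whether your CPU supports the AVX512 instruction set mentioned above (Linux only; this reads /proc/cpuinfo, so it reports "not supported" on other systems):

```shell
# Look for the avx512f (AVX-512 Foundation) flag among the CPU feature flags.
if grep -qm1 avx512f /proc/cpuinfo 2>/dev/null; then
  RESULT="avx512 supported"
else
  RESULT="avx512 not supported"
fi
echo "$RESULT"
```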

Bug Fixes 🐛

  • Multiple fixes to improve stability, including enabling SYCL support for stablediffusion-ggml and consistent OpenAI stop reason returns.
  • Improved context shift handling for llama.cpp and fixed gallery store overrides.

🧠 Models:



I've fine-tuned a family of models on o1-cot and function-call datasets to work closely with all of LocalAI's function-calling features. The models are tailored to be conversational and to execute function calls.

Enjoy! All the models are available in the LocalAI gallery:

local-ai run LocalAI-functioncall-phi-4-v0.3
local-ai run LocalAI-functioncall-llama3.2-1b-v0.4
local-ai run LocalAI-functioncall-llama3.2-3b-v0.5
local-ai run localai-functioncall-qwen2.5-7b-v0.5
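A minimal sketch of exercising one of these function-call models through the OpenAI-compatible chat endpoint (the `get_weather` tool and its schema are made up for illustration):

```shell
# Illustrative chat-completions payload with a hypothetical tool definition.
PAYLOAD='{
  "model": "LocalAI-functioncall-phi-4-v0.3",
  "messages": [{"role": "user", "content": "What is the weather in Rome?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }]
}'
# With a server running, send it to the OpenAI-compatible endpoint:
# curl http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$PAYLOAD"
echo "$PAYLOAD" | head -n 2
```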

Other models

Numerous model updates and additions:

  • New models like nightwing3-10b, rombos-qwen2.5-writer, and negative_llama_70b.
  • Updated checksum for model galleries.
  • Added icons and improved prompt templates for various models.
  • Expanded model gallery with new additions like DeepSeek-R1, Mistral-small-24b, and more.

Full changelog 👇


Breaking Changes 🛠

  • chore(vall-e-x): Drop backend by @mudler in #4619
  • feat(transformers): merge musicgen functionalities to a single backend by @mudler in #4620
  • feat(transformers): merge sentencetransformers backend by @mudler in #4624
  • chore(stablediffusion-ncn): drop in favor of ggml implementation by @mudler in #4652
  • feat(transformers): add support to Mamba by @mudler in #4669
  • chore(openvoice): drop backend by @mudler in #4673
  • chore: drop embedded models by @mudler in #4715
  • chore(llama-ggml): drop deprecated backend by @mudler in #4775
  • fix(llama.cpp): disable mirostat as default by @mudler in #2911

Bug fixes 🐛

  • fix(stablediffusion-ggml): correctly enable sycl by @mudler in #4591
  • fix(stablediffusion-ggml): enable oneapi before build by @mudler in #4593
  • fix(docs): add missing -core suffix to sycl images by @M0Rf30 in #4630
  • fix(stores): Stores fixes and testing by @richiejp in #4663
  • fix(gallery): do not return overrides and additional config by @mudler in #4768
  • fix(openai): consistently return stop reason by @mudler in #4771
  • fix(llama.cpp): improve context shift handling by @mudler in #4820

Exciting New Features 🎉

🧠 Models

  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4580
  • chore(model gallery): add nightwing3-10b-v0.1 by @mudler in #4582
  • chore(model gallery): add qwq-32b-preview-ideawhiz-v1 by @mudler in #4583
  • chore(...

v2.25.0

10 Jan 22:02
07655c0

What's Changed

Bug fixes 🐛

Exciting New Features 🎉

  • feat(llama.cpp): expose cache_type_k and cache_type_v for quant of kv cache by @mudler in #4329
  • feat(template): read jinja templates from gguf files by @mudler in #4332
  • feat: stream tokens usage by @mintyleaf in #4415
  • feat(Dockerfile): allow to skip driver installation by @mudler in #4447
  • feat(ui): path prefix support via HTTP header by @mgoltzsche in #4497
  • feat(dowloader): resume partial downloads by @Saavrm26 in #4537

🧠 Models

  • chore(model gallery): add rp-naughty-v1.0c-8b by @mudler in #4322
  • chore(model gallery): add loki-v2.6-8b-1024k by @mudler in #4321
  • chore(model gallery): add math-iio-7b-instruct by @mudler in #4323
  • chore(model gallery): add llama-3.3-70b-instruct by @mudler in #4333
  • chore(model gallery): add mn-chunky-lotus-12b by @mudler in #4337
  • chore(model gallery): add virtuoso-small by @mudler in #4338
  • chore(model gallery): add bio-medical-llama-3-8b by @mudler in #4339
  • chore(model gallery): add qwen2.5-7b-homeranvita-nerdmix by @mudler in #4343
  • chore(model gallery): add impish_mind_8b by @mudler in #4344
  • chore(model gallery): add tulu-3.1-8b-supernova-smart by @mudler in #4347
  • chore(model gallery): add qwen2.5-math-14b-instruct by @mudler in #4355
  • chore(model gallery): add intellect-1-instruct by @mudler in #4356
  • chore(model gallery): add b-nimita-l3-8b-v0.02 by @mudler in #4357
  • chore(model gallery): add sailor2-1b-chat by @mudler in #4363
  • chore(model gallery): add sailor2-8b-chat by @mudler in #4364
  • chore(model gallery): add sailor2-20b-chat by @mudler in #4365
  • chore(model gallery): add 72b-qwen2.5-kunou-v1 by @mudler in #4369
  • chore(model gallery): add deepthought-8b-llama-v0.01-alpha by @mudler in #4370
  • chore(model gallery): add l3.3-70b-euryale-v2.3 by @mudler in #4371
  • chore(model gallery): add l3.3-ms-evayale-70b by @mudler in #4374
  • chore(model gallery): add evathene-v1.3 by @mudler in #4375
  • chore(model gallery): add hermes-3-llama-3.2-3b by @mudler in #4376
  • chore(model gallery): add fusechat-gemma-2-9b-instruct by @mudler in #4379
  • chore(model gallery): add fusechat-qwen-2.5-7b-instruct by @mudler in #4380
  • chore(model gallery): add chronos-gold-12b-1.0 by @mudler in #4381
  • fix: correct gallery/index.yaml by @godsey in #4384
  • chore(model gallery): add fusechat-llama-3.2-3b-instruct by @mudler in #4386
  • chore(model gallery): add fusechat-llama-3.1-8b-instruct by @mudler in #4387
  • chore(model gallery): add neumind-math-7b-instruct by @mudler in #4388
  • chore(model gallery): add naturallm-7b-instruct by @mudler in #4392
  • chore(model gallery): add marco-o1-uncensored by @mudler in #4393
  • chore(model gallery): add qwen2-7b-multilingual-rp by @mudler in #4394
  • chore(model gallery): add qwq-lcot-7b-instruct by @mudler in #4419
  • chore(model gallery): add llama-openreviewer-8b by @mudler in #4422
  • chore(model gallery): add falcon3-1b-instruct by @mudler in #4423
  • chore(model gallery): add falcon3-3b-instruct by @mudler in #4424
  • chore(model gallery): add qwen2-vl-72b-instruct by @mudler in #4425
  • chore(model gallery): add falcon3-10b-instruct by @mudler in #4426
  • chore(model gallery): add llama-song-stream-3b-instruct by @mudler in #4431
  • chore(model gallery): add llama-chat-summary-3.2-3b by @mudler in #4432
  • chore(model gallery): add tq2.5-14b-aletheia-v1 by @mudler in #4440
  • chore(model gallery): add tq2.5-14b-neon-v1 by @mudler in #4441
  • chore(model gallery): add orca_mini_v8_1_70b by @mudler in #4444
  • chore(model gallery): add anubis-70b-v1 by @mudler in #4446
  • chore(model gallery): add llama-3.3-70b-instruct-ablated by @mudler in #4448
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4487
  • chore(model gallery): add l3.3-ms-evalebis-70b by @mudler in #4488
  • chore(model gallery): add tqwendo-36b by @mudler in #4489
  • chore(model gallery): add rombos-llm-70b-llama-3.3 by @mudler in #4490
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4492
  • chore(model gallery): add fastllama-3.2-1b-instruct by @mudler in #4493
  • chore(model gallery): add dans-personalityengine-v1.1.0-12b by @mudler in #4494
  • chore(model gallery): add llama-3.1-8b-open-sft by @mudler in #4495
  • chore(model gallery): add qvq-72b-preview by @mudler in #4498
  • chore(model gallery): add teleut-7b-rp by @mudler in #4499
  • chore(model gallery): add falcon3-1b-instruct-abliterated by @mudler in #4501
  • chore(model gallery): add falcon3-3b-instruct-abliterated by @mudler in #4502
  • chore(model gallery): add falcon3-10b-instruct-abliterated by @mudler in #4503
  • chore(model gallery): add falcon3-7b-instruct-abliterated by @mudler in #4504
  • chore(model gallery): add control-nanuq-8b by @mudler in #4506
  • chore(model gallery): add miscii-14b-1028 by @mudler in #4507
  • chore(model gallery): add miscii-14b-1225 by @mudler in #4508
  • chore(model gallery): add qwen2.5-32b-rp-ink by @mudler in #4517
  • chore(model gallery): add huatuogpt-o1-8b by @mudler in #4518
  • chore(model gallery): add q2.5-veltha-14b-0.5 by @mudler in #4519
  • chore(model gallery): add smallthinker-3b-preview by @mudler in #4521
  • chore(model gallery): add mn-12b-mag-mell-r1-iq-arm-imatrix by @mudler in #4522
  • chore(model gallery): add captain-eris-diogenes_twilight-v0.420-12b by @mudler in #4523
  • chore(model gallery): add violet_twilight-v0.2 by @mudler in #4524
  • chore(model gallery): add qwenwify2.5-32b-v4.5 by @mudler in #4525
  • chore(model gallery): add sainemo-remix by @mudler in #4526
  • chore(model gallery): add l3.1-purosani-2-8b by @mudler in #4527
  • chore(model gallery): add nera_noctis-12b by @mudler in #4530
  • chore(model gallery): add drt-o1-7b by @mudler in #4533
  • chore(model gallery): add codepy-deepthink-3b by @mudler in #4534
  • chore(model gallery): add llama3.1-8b-prm-deepseek-data by @mudler in #4535
  • chore(model gallery): add experimental-lwd-mirau-rp-14b-iq-imatrix by @mudler in #4539
  • chore(model gallery): add llama-deepsync-3b by @mudler in #4540
  • chore(model gallery): add qwentile2.5-32b-instruct by @mudler in #4541
  • chore(model gallery): add 32b-qwen2.5-kunou-v1 by @mudler in #4545
  • chore(model gallery): add triangulum-10b by @mudler in #4546
  • chore(model gallery): add 14b-qwen2.5-kunou-v1 by @mudler in #4547
  • chore(model gallery): add dolphin3.0-llama3.1-8b by @mudler in https://github.com/mudl...

v2.24.2

10 Dec 14:52
59cf30a

What's Changed

👒 Dependencies

  • chore: ⬆️ Update ggerganov/llama.cpp to 26a8406ba9198eb6fdd8329fa717555b4f77f05f by @mudler in #4358

Full Changelog: v2.24.1...v2.24.2

v2.24.1

08 Dec 16:53

This is a patch release to fix #4334

Full Changelog: v2.24.0...v2.24.1

v2.24.0

04 Dec 20:56
87b7648

LocalAI release v2.24.0!


🚀 Highlights

  • Backend deprecation: We’ve removed rwkv.cpp and bert.cpp, replacing them with enhanced functionalities in llama.cpp for simpler installation and better performance.
  • New Backends Added: Introducing bark.cpp for text-to-audio and stablediffusion.cpp for image generation, both powered by the ggml framework.
  • Voice Activity Detection (VAD): Added support for silero-vad to detect speech in audio streams.
  • WebUI Improvements: Now supports API key authentication for enhanced security.
  • Real-Time Token Usage: Monitor token consumption during streamed outputs.
  • Expanded P2P Settings: Greater flexibility with new configuration options like listen_maddrs, dht_announce_maddrs, and bootstrap_peers.

📤 Backends Deprecation

As part of our cleanup efforts, the rwkv.cpp and bert.cpp backends have been deprecated. Their functionalities are now integrated into llama.cpp, offering a more streamlined and efficient experience.

🆕 New Backends Introduced

  • bark.cpp Backend: Transform text into realistic audio using Bark, a transformer-based text-to-audio model. Install it easily with:

    local-ai models install bark-cpp-small

    Or start it directly:

    local-ai run bark-cpp-small
  • stablediffusion.cpp Backend: Create high-quality images from textual descriptions using the Stable Diffusion backend, now leveraging the ggml framework.

  • Voice Activity Detection with silero-vad: Introducing support for accurate speech segment detection in audio streams. Install via:

    local-ai models install silero-vad

Or configure it through the WebUI.

🔒 WebUI Access with API Keys

The WebUI now supports API key authentication. If one or more API Keys are configured, the WebUI will automatically display a page to authenticate with.

🏆 Enhancements and Features

  • Real-Time Token Usage: Monitor token consumption dynamically during streamed outputs. This feature helps optimize performance and manage costs effectively.
  • P2P Configuration: New settings for advanced peer-to-peer mode:
    • listen_maddrs: Define specific multiaddresses for your node.
    • dht_announce_maddrs: Specify addresses to announce to the DHT network.
    • bootstrap_peers: Set custom bootstrap peers for initial connectivity.
      These options offer more control, especially in constrained networks or custom P2P environments.
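As a sketch, if your deployment passes these options through environment variables, a configuration could look like the following (the variable names and multiaddresses below are illustrative; consult the P2P documentation for the exact spelling in your version):

```shell
# Illustrative P2P settings; names mirror the option names above but are
# placeholders, and the addresses use documentation-reserved example IPs.
export LOCALAI_P2P_LISTEN_MADDRS="/ip4/0.0.0.0/tcp/40123"
export LOCALAI_P2P_DHT_ANNOUNCE_MADDRS="/ip4/203.0.113.7/tcp/40123"
export LOCALAI_P2P_BOOTSTRAP_PEERS="/ip4/198.51.100.1/tcp/40123/p2p/QmBootstrapPeerID"
echo "$LOCALAI_P2P_LISTEN_MADDRS"
```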

🖼️ New Models in the Gallery

We've significantly expanded our model gallery with a variety of new models to cater to diverse AI applications. Among these:

  • Calme-3 Qwen2.5 Series: Enhanced language models offering improved understanding and generation capabilities.
  • Mistral-Nemo-Prism-12b: A powerful model designed for complex language tasks.
  • Llama 3.1 and 3.2 Series: Upgraded versions of the Llama models with better performance and accuracy.
  • Qwen2.5-Coder Series: Specialized models optimized for code generation and programming language understanding.
  • Rombos-Coder Series: Advanced coder models for sophisticated code-related tasks.
  • Silero-VAD: High-quality voice activity detection model for audio processing applications.
  • Bark-Cpp-Small: Lightweight audio generation model suitable for quick and efficient audio synthesis.

Explore these models and more in our updated model gallery to find the perfect fit for your project needs.

🐞 Bug Fixes and Improvements

  • Performance Enhancements: Resolved issues with AVX flags and optimized binaries for accelerated performance, especially on macOS systems.
  • Dependency Updates: Upgraded various dependencies to ensure compatibility, security, and performance improvements across the board.
  • Parsing Corrections: Fixed parsing issues related to maddr and ExtraLLamaCPPArgs in P2P configurations.

📚 Documentation and Examples

  • Updated Guides: Refreshed documentation with new configuration examples, making it easier to get started and integrate the latest features.

📥 How to Upgrade

To upgrade to LocalAI v2.24.0:

  • Download the Latest Release: Get the binaries from our GitHub Releases page.
  • Update Docker Image: Pull the latest Docker image using:
docker pull localai/localai:latest

See also the Documentation at: https://localai.io/basics/container/#standard-container-images

Happy hacking!

What's Changed

Breaking Changes 🛠

Bug fixes 🐛

  • fix(hipblas): disable avx flags when accellerated bins are used by @mudler in #4167
  • chore(deps): bump sycl intel image by @mudler in #4201
  • fix(go.mod): add urfave/cli v2 by @mudler in #4206
  • chore(go.mod): add valyala/fasttemplate by @mudler in #4207
  • fix(p2p): parse maddr correctly by @mudler in #4219
  • fix(p2p): parse correctly ExtraLLamaCPPArgs by @mudler in #4220
  • fix(llama.cpp): embed metal file into result binary for darwin by @mudler in #4279

Exciting New Features 🎉

  • feat: add WebUI API token authorization by @mintyleaf in #4197
  • feat(p2p): add support for configuration of edgevpn listen_maddrs, dht_announce_maddrs and bootstrap_peers by @mintyleaf in #4200
  • feat(silero): add Silero-vad backend by @mudler in #4204
  • feat: include tokens usage for streamed output by @mintyleaf in #4282
  • feat(bark-cpp): add new bark.cpp backend by @mudler in #4287
  • feat(backend): add stablediffusion-ggml by @mudler in #4289

🧠 Models

  • models(gallery): add calme-3 qwen2.5 series by @mudler in #4107
  • models(gallery): add calme-3 qwenloi series by @mudler in #4108
  • models(gallery): add calme-3 llamaloi series by @mudler in #4109
  • models(gallery): add mn-tiramisu-12b by @mudler in #4110
  • models(gallery): add qwen2.5-coder-14b by @mudler in #4125
  • models(gallery): add qwen2.5-coder-3b-instruct by @mudler in #4126
  • models(gallery): add qwen2.5-coder-32b-instruct by @mudler in #4127
  • models(gallery): add qwen2.5-coder-14b-instruct by @mudler in #4128
  • models(gallery): add qwen2.5-coder-1.5b-instruct by @mudler in #4129
  • models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #4130
  • models(gallery): add qwen2.5-coder-7b-3x-instruct-ties-v1.2-i1 by @mudler in #4131
  • models(gallery): add qwen2.5-coder-7b-instruct-abliterated-i1 by @mudler in #4132
  • models(gallery): add rombos-coder-v2.5-qwen-7b by @mudler in #4133
  • models(gallery): add rombos-coder-v2.5-qwen-32b by @mudler in #4134
  • models(gallery): add rombos-coder-v2.5-qwen-14b by @mudler in #4135
  • models(gallery): add eva-qwen2.5-72b-v0.1-i1 by @mudler in #4136
  • models(gallery): add mistral-nemo-prism-12b by @mudler in #4141
  • models(gallery): add tess-3-llama-3.1-70b by @mudler in #4143
  • models(gallery): add celestial-harmony-14b-v1.0-experimental-1016-i1 by @mudler in #4145
  • models(gallery): add llama3.1-8b-enigma by @mudler in #4146
  • chore(model): add llama3.1-8b-cobalt to the gallery by @mudler in #4147
  • chore(model): add qwen2.5-32b-arliai-rpmax-v1.3 to the gallery by @mudler in #4148
  • chore(model): add llama3.2-3b-enigma to the gallery by @mudler in #4149
  • chore(model): add llama-3.1-8b-arliai-rpmax-v1.3 to the gallery by @mudler in #4150
  • chore(model): add magnum-12b-v2.5-kto-i1 to the gallery by @mudler in #4151
  • chore(model): add l3.1-8b-slush-i1 to the gallery by @mudler in #4152
  • models(gallery): add q2.5-ms-mistoria-72b-i1 by @mudler in #4158
  • chore(model): add l3.1-ms-astoria-70b-v2 to the gallery by @mudler in #4159
  • chore(model): add magnum-v2-4b-i1 to the gallery by @mudler in #4160
  • chore(model): add athene-v2-agent to the gallery by @mudler in #4161
  • chore(model): add athene-v2-chat to the gallery by @mudler in #4162
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4165
  • chore(model): add qwen2.5-7b-nerd-uncensor...

v2.23.0

10 Nov 17:07
9099d0c

What's Changed

Breaking Changes 🛠

  • feat(templates): use a single template for multimodals messages by @mudler in #3892

Bug fixes 🐛

  • fix(parler-tts): use latest audiotools by @mudler in #3954
  • fix(parler-tts): pin grpcio-tools by @mudler in #3960
  • fix(gallery): overrides for parler-tts in the gallery by @mudler in #3962
  • fix(grpc): pass by modelpath by @mudler in #4023
  • fix(tts): correctly pass backend config when generating model options by @mudler in #4091

Exciting New Features 🎉

  • feat(vllm): expose 'load_format' by @mudler in #3943
  • feat(tts): Implement naive response_format for tts endpoint by @n-Arno in #4035
  • feat(diffusers): allow multiple lora adapters by @mudler in #4081
  • feat(ui): move model detailed info to a modal by @mudler in #4086
  • feat: allow to disable '/metrics' endpoints for local stats by @mudler in #3945
  • feat: add flux single file support by @sozercan in #3959

🧠 Models

  • fix(phi3-vision): add multimodal template by @mudler in #3944
  • models(gallery): add l3.1-moe-2x8b-v0.2 by @mudler in #3969
  • models(gallery): add llama3.1-darkstorm-aspire-8b by @mudler in #3970
  • models(gallery): add llama-3.2-sun-2.5b-chat by @mudler in #3971
  • models(gallery): add darkest-muse-v1 by @mudler in #3973
  • models(gallery): add llama-3.2-3b-instruct-uncensored by @mudler in #3974
  • models(gallery): add thebeagle-v2beta-32b-mgs by @mudler in #3975
  • models(gallery): add l3.1-70blivion-v0.1-rc1-70b-i1 by @mudler in #3977
  • models(gallery): add llama-3.1-hawkish-8b by @mudler in #3978
  • models(gallery): add quill-v1 by @mudler in #3980
  • models(gallery): add delirium-v1 by @mudler in #3981
  • models(gallery): add magnum-v4-9b by @mudler in #3983
  • models(gallery): add llama-3-whiterabbitneo-8b-v2.0 by @mudler in #3984
  • models(gallery): add l3-nymeria-maid-8b by @mudler in #3985
  • models(gallery): add meraj-mini by @mudler in #3987
  • models(gallery): add granite-3.0-1b-a400m-instruct by @mudler in #3994
  • models(gallery): add moe-girl-800ma-3bt by @mudler in #3995
  • models(gallery): add spiral-da-hyah-qwen2.5-72b-i1 by @mudler in #4022
  • models(gallery): add llama3.1-bestmix-chem-einstein-8b by @mudler in #4028
  • models(gallery): add starcannon-unleashed-12b-v1.0 by @mudler in #4032
  • models(gallery): add smollm2-1.7b-instruct by @mudler in #4033
  • models(gallery): add control-8b-v1.1 by @mudler in #4039
  • models(gallery): add whiterabbitneo-2.5-qwen-2.5-coder-7b by @mudler in #4042
  • models(gallery): add llama-3.1-whiterabbitneo-2-8b by @mudler in #4043
  • models(gallery): add g2-9b-aletheia-v1 by @mudler in #4056
  • models(gallery): add cybertron-v4-qw7b-mgs by @mudler in #4063
  • models(gallery): add g2-9b-sugarquill-v0 by @mudler in #4073
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4080
  • models(gallery): add q25-1.5b-veolu by @mudler in #4088
  • models(gallery): add valor-7b-v0.1 by @mudler in #4089
  • models(gallery): add tess-r1-limerick-llama-3.1-70b by @mudler in #4095
  • models(gallery): add llenn-v0.75-qwen2.5-72b-i1 by @mudler in #4098
  • models(gallery): add eva-qwen2.5-14b-v0.2 by @mudler in #4099
  • models(gallery): add opencoder-8b instruct and base by @mudler in #4101
  • models(gallery): add opencoder-1.5b instruct and base by @mudler in #4102
  • models(gallery): add tissint-14b-128k-rp by @mudler in #4103
  • models(gallery): add tq2.5-14b-sugarquill-v1 by @mudler in #4104

📖 Documentation and examples

👒 Dependencies

  • chore(deps): Bump llama-index from 0.11.17 to 0.11.19 in /examples/chainlit by @dependabot in #3893
  • chore(deps): Bump weaviate-client from 4.8.1 to 4.9.0 in /examples/chainlit by @dependabot in #3894
  • chore(deps): Bump langchain from 0.3.3 to 0.3.4 in /examples/functions by @dependabot in #3900
  • chore(deps): Bump langchain-community from 0.3.2 to 0.3.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3923
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 007cc20 to 06e70da by @dependabot in #3932
  • chore(deps): Bump sqlalchemy from 2.0.35 to 2.0.36 in /examples/langchain/langchainpy-localai-example by @dependabot in #3920
  • chore(deps): Bump yarl from 1.15.2 to 1.15.5 in /examples/langchain/langchainpy-localai-example by @dependabot in #3921
  • chore(deps): Bump openai from 1.51.2 to 1.52.0 in /examples/langchain-chroma by @dependabot in #3908
  • chore(deps): Bump llama-index from 0.11.17 to 0.11.19 in /examples/langchain-chroma by @dependabot in #3907
  • chore(deps): Bump marshmallow from 3.22.0 to 3.23.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3917
  • chore(deps): Bump openai from 1.51.2 to 1.52.0 in /examples/functions by @dependabot in #3901
  • chore(deps): Bump yarl from 1.15.5 to 1.16.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3938
  • chore(deps): Bump openai from 1.51.2 to 1.52.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3993
  • chore(deps): Bump torchvision from 0.18.1+rocm6.0 to 0.20.0+cu118 in /backend/python/diffusers by @dependabot in #3997
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 06e70da to 28fce6b by @dependabot in #3986
  • chore(deps): Bump llama-index from 0.11.19 to 0.11.20 in /examples/langchain-chroma by @dependabot in #3988
  • chore(deps): Bump openai from 1.52.0 to 1.52.2 in /examples/langchain-chroma by @dependabot in #3989
  • chore(deps): Bump llama-index from 0.11.19 to 0.11.20 in /examples/chainlit by @dependabot in #3990
  • chore(deps): Bump tqdm from 4.66.5 to 4.66.6 in /examples/langchain/langchainpy-localai-example by @dependabot in #3991
  • chore(deps): Bump frozenlist from 1.4.1 to 1.5.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3992
  • chore(deps): Bump openai from 1.52.0 to 1.52.2 in /examples/functions by @dependabot in #4000
  • chore(deps): bump grpcio to 1.67.1 by @mudler in #4009
  • chore(deps): bump llama-cpp to 8f275a7c4593aa34147595a90282cf950a853690 by @mudler in #4016

Other Changes

  • docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3936
  • chore: ⬆️ Update ggerganov/llama.cpp to e01c67affe450638162a1a457e2e57859ef6ebf0 by @localai-bot in #3937
  • chore: update go-piper to latest by @dave-gray101 in #3939
  • chore: ⬆️ Update ggerganov/llama.cpp to c8c07d658a6cefc5a50cfdf6be7d726503612303 by @localai-bot in #3940
  • chore: ⬆️ Update ggerganov/whisper.cpp to 0fbaac9c891055796456df7b9122a70c220f9ca1 by @localai-bot in #3950
  • chore: ⬆️ Update ggerganov/llama.cpp to 0a1c750c80147687df267114c81956757cc14382 by @localai-bot in #3948
  • chore...

v2.22.1

21 Oct 12:50
015835d

What's Changed

Bug fixes 🐛

  • fix(vllm): images and videos are base64 by default by @mudler in #3867
  • fix(dependencies): pin pytorch version by @mudler in #3872
  • fix(dependencies): move deps that brings pytorch by @mudler in #3873
  • fix(vllm): do not set videos if we don't have any by @mudler in #3885

Exciting New Features 🎉

  • feat(templates): extract text from multimodal requests by @mudler in #3866
  • feat(templates): add sprig to multimodal templates by @mudler in #3868

🧠 Models

  • models(gallery): add llama-3_8b_unaligned_beta by @mudler in #3818
  • models(gallery): add llama3.1-flammades-70b by @mudler in #3819
  • models(gallery): add llama3.1-gutenberg-doppel-70b by @mudler in #3820
  • models(gallery): add llama-3.1-8b-arliai-formax-v1.0-iq-arm-imatrix by @mudler in #3821
  • models(gallery): add supernova-medius by @mudler in #3822
  • models(gallery): add hermes-3-llama-3.1-70b-lorablated by @mudler in #3823
  • models(gallery): add hermes-3-llama-3.1-8b-lorablated by @mudler in #3824
  • models(gallery): add eva-qwen2.5-14b-v0.1-i1 by @mudler in #3825
  • models(gallery): add cursorcore-qw2.5-7b-i1 by @mudler in #3826
  • models(gallery): add cursorcore-qw2.5-1.5b-lc-i1 by @mudler in #3827
  • models(gallery): add cursorcore-ds-6.7b-i1 by @mudler in #3828
  • models(gallery): add cursorcore-yi-9b by @mudler in #3829
  • models(gallery): add edgerunner-command-nested-i1 by @mudler in #3830
  • models(gallery): add llama-3.2-chibi-3b by @mudler in #3843
  • models(gallery): add llama-3.2-3b-reasoning-time by @mudler in #3844
  • models(gallery): add ml-ms-etheris-123b by @mudler in #3845
  • models(gallery): add doctoraifinetune-3.1-8b-i1 by @mudler in #3846
  • models(gallery): add astral-fusion-neural-happy-l3.1-8b by @mudler in #3848
  • models(gallery): add tsunami-0.5x-7b-instruct-i1 by @mudler in #3849
  • models(gallery): add mahou-1.5-llama3.1-70b-i1 by @mudler in #3850
  • models(gallery): add llama-3.1-nemotron-70b-instruct-hf by @mudler in #3854
  • models(gallery): add qevacot-7b-v2 by @mudler in #3855
  • models(gallery): add l3.1-etherealrainbow-v1.0-rc1-8b by @mudler in #3856
  • models(gallery): add phi-3.5-mini-titanfusion-0.2 by @mudler in #3857
  • models(gallery): add mn-lulanum-12b-fix-i1 by @mudler in #3859
  • models(gallery): add apollo2-9b by @mudler in #3860
  • models(gallery): add theia-llama-3.1-8b-v1 by @mudler in #3861
  • models(gallery): add tor-8b by @mudler in #3862
  • models(gallery): add darkens-8b by @mudler in #3863
  • models(gallery): add baldur-8b by @mudler in #3864
  • models(gallery): add meissa-qwen2.5-7b-instruct by @mudler in #3865
  • models(gallery): add phi-3 vision by @mudler in #3890

👒 Dependencies

  • chore(deps): Bump docs/themes/hugo-theme-relearn from d5a0ee0 to e1a1f01 by @dependabot in #3798
  • chore(deps): Bump mxschmitt/action-tmate from 3.18 to 3.19 by @dependabot in #3799
  • chore(deps): Bump sentence-transformers from 3.1.1 to 3.2.0 in /backend/python/sentencetransformers by @dependabot in #3801
  • chore(deps): Bump langchain from 0.3.2 to 0.3.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3803
  • chore(deps): Bump llama-index from 0.11.16 to 0.11.17 in /examples/langchain-chroma by @dependabot in #3804
  • chore(deps): Bump python from 3.12-bullseye to 3.13-bullseye in /examples/langchain by @dependabot in #3805
  • chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/functions by @dependabot in #3806
  • chore(deps): Bump llama-index from 0.11.16 to 0.11.17 in /examples/chainlit by @dependabot in #3807
  • chore(deps): Bump langchain from 0.3.1 to 0.3.3 in /examples/langchain-chroma by @dependabot in #3809
  • chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3808
  • chore(deps): Bump yarl from 1.13.1 to 1.15.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3816
  • chore(deps): Bump chromadb from 0.5.11 to 0.5.13 in /examples/langchain-chroma by @dependabot in #3811
  • chore(deps): Bump langchain from 0.3.2 to 0.3.3 in /examples/functions by @dependabot in #3802
  • chore(deps): Bump debugpy from 1.8.6 to 1.8.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3814
  • chore(deps): Bump aiohttp from 3.10.9 to 3.10.10 in /examples/langchain/langchainpy-localai-example by @dependabot in #3812
  • chore(deps): Bump openai from 1.51.1 to 1.51.2 in /examples/langchain-chroma by @dependabot in #3810
  • chore(deps): Bump charset-normalizer from 3.3.2 to 3.4.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3834
  • chore(deps): Bump langchain-community from 0.3.1 to 0.3.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3831
  • chore(deps): Bump yarl from 1.15.1 to 1.15.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3832
  • chore(deps): Bump numpy from 2.1.1 to 2.1.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3833
  • chore(deps): Bump docs/themes/hugo-theme-relearn from e1a1f01 to 007cc20 by @dependabot in #3835
  • chore(deps): Bump gradio from 3.48.0 to 5.0.0 in /backend/python/openvoice in the pip group by @dependabot in #3880
  • chore(deps): bump llama-cpp to cda0e4b648dde8fac162b3430b14a99597d3d74f by @mudler in #3884

Other Changes

  • docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3796
  • chore: dependabot ignore generated grpc go package by @dave-gray101 in #3795
  • chore: ⬆️ Update ggerganov/llama.cpp to edc265661cd707327297b6ec4d83423c43cb50a5 by @localai-bot in #3797
  • chore: ⬆️ Update ggerganov/llama.cpp to d4c19c0f5cdb1e512573e8c86c79e8d0238c73c4 by @localai-bot in #3817
  • chore: ⬆️ Update ggerganov/llama.cpp to a89f75e1b7b90cb2d4d4c52ca53ef9e9b466aa45 by @localai-bot in #3837
  • chore: ⬆️ Update ggerganov/whisper.cpp to 06a1da9daff94c1bf1b1d38950628264fe443f76 by @localai-bot in #3836
  • Update integrations.md with LLPhant by @f-lombardo in #3838
  • fix(llama.cpp): consider also native builds by @mudler in #3839
  • chore: ⬆️ Update ggerganov/whisper.cpp to b6049060dd2341b7816d2bce7dc7451c1665828e by @localai-bot in #3842
  • chore: ⬆️ Update ggerganov/llama.cpp to 755a9b2bf00fbae988e03a47e852b66eaddd113a by @localai-bot in #3841
  • chore(deps): bump grpcio to 1.67.0 by @mudler in #3851
  • chore: ⬆️ Update ggerganov/llama.cpp to 9e041024481f6b249ab8918e18b9477f873b5a5e by @localai-bot in #3853
  • chore: ⬆️ Update ggerganov/whisper.cpp to d3f7137cc9befa6d74dc4085de2b664b97b7c8bb by @localai-bot in #3852
  • fix(mamba): pin torch version by @mudler in #3871
  • chore: ⬆️ Update ggerganov/llama.cpp to 99bd4ac28c32cd17c0e337ff5601393b033dc5fc by @localai-bot in #3869
  • chore: ⬆️ Update ggerganov/whisper.cpp to a5abfe6a90495f7bf19fe70d016ecc255e97359c by @localai-bot in #3870
  • chore(deps): pin packaging by @mudler ...

v2.22.0

12 Oct 13:09
a1634b2

LocalAI v2.22.0 is out 🥳

💡 Highlights

  • Image-to-Text and Video-to-Text Support: The VLLM backend now supports both image-to-text and video-to-text processing.
  • Enhanced Multimodal Support: Template placeholders are now available, offering more flexibility in multimodal applications.
  • Model Management Made Easy: List all your loaded models directly via the /system endpoint for seamless management.
  • Various bugfixes and improvements: Fixed issues with dangling processes to ensure proper resource management and resolved channel closure issues in the base GRPC server.

🖼️ Multimodal vLLM

To use multimodal models with vLLM, simply specify the model in the YAML configuration file. Note, however, that models can differ in whether they accept a single image or multiple images per request, and in how they handle image placeholders internally.

Different models and libraries express image, video, or audio placeholders in different ways. For example, the llama.cpp backend expects images within an [img-ID] tag, while other backends/models (e.g. vLLM) use a different notation (<|image_|>).

To override the defaults, it is now possible to set the following in the model configuration:

template:
  video: "<|video_{{.ID}}|> {{.Text}}"
  image: "<|image_{{.ID}}|> {{.Text}}"
  audio: "<|audio_{{.ID}}|> {{.Text}}"
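For illustration, the substitution these templates perform can be emulated in Python. Note that LocalAI itself renders them with Go's text/template engine; the function below is a simplified stand-in, not LocalAI code:

```python
# Simplified emulation of how the image placeholder template above expands.
# LocalAI renders these templates with Go's text/template engine; this
# stand-in only mimics the substitution of {{.ID}} and {{.Text}}.
def render_placeholder(template: str, media_id: int, text: str) -> str:
    return template.replace("{{.ID}}", str(media_id)).replace("{{.Text}}", text)

image_template = "<|image_{{.ID}}|> {{.Text}}"
print(render_placeholder(image_template, 1, "What is in this picture?"))
# → <|image_1|> What is in this picture?
```

With templates like these, each attached image, video, or audio clip is numbered and interpolated into the prompt in the notation the target model expects.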

📹 Video and Audio understanding

Some libraries support both video and audio inputs. Currently only vLLM supports video understanding, which can be used through the API by "extending" the OpenAI API with audio and video content types alongside images:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this video?"
          },
          {
            "type": "video_url",
            "video_url": {
              "url": "https://video-image-url"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'
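The same request can also be assembled programmatically. Below is a minimal Python sketch that builds the payload from the curl example; the model name and video URL are placeholders, and the resulting JSON can be POSTed to the /v1/chat/completions endpoint with any HTTP client:

```python
import json

# Builds the same chat payload as the curl example above. The model name
# and video URL are placeholders for your own deployment; POST the JSON
# to your LocalAI instance's /v1/chat/completions endpoint.
def video_chat_payload(model: str, question: str, video_url: str) -> dict:
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "video_url", "video_url": {"url": video_url}},
                ],
            }
        ],
        "max_tokens": 300,
    }

payload = video_chat_payload("gpt-4o", "What's in this video?", "https://video-image-url")
print(json.dumps(payload, indent=2))
```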

🧑‍🏭 Work in progress

  • Realtime API support is a work in progress, tracked in #3714. Give it a thumbs up if you want to see it supported in LocalAI!

What's Changed

Bug fixes 🐛

  • chore: simplify model loading by @mudler in #3715
  • fix(initializer): correctly reap dangling processes by @mudler in #3717
  • fix(base-grpc): close channel in base grpc server by @mudler in #3734
  • fix(vllm): bump cmake - vllm requires it by @mudler in #3744
  • fix(llama-cpp): consistently select fallback by @mudler in #3789
  • fix(welcome): do not list model twice if we have a config by @mudler in #3790
  • fix: listmodelservice / welcome endpoint use LOOSE_ONLY by @dave-gray101 in #3791

Exciting New Features 🎉

  • feat(api): list loaded models in /system by @mudler in #3661
  • feat: Add Get Token Metrics to GRPC server by @siddimore in #3687
  • refactor: ListModels Filtering Upgrade by @dave-gray101 in #2773
  • feat: track internally started models by ID by @mudler in #3693
  • feat: tokenization endpoint by @shraddhazpy in #3710
  • feat(multimodal): allow to template placeholders by @mudler in #3728
  • feat(vllm): add support for image-to-text and video-to-text by @mudler in #3729
  • feat(shutdown): allow force shutdown of backends by @mudler in #3733
  • feat(transformers): Use downloaded model for Transformers backend if it already exists. by @joshbtn in #3777
  • fix: roll out bluemonday Sanitize more widely by @dave-gray101 in #3794

🧠 Models

  • models(gallery): add llama-3.2 3B and 1B by @mudler in #3671
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3675
  • models(gallery): add magnusintellectus-12b-v1-i1 by @mudler in #3678
  • models(gallery): add bigqwen2.5-52b-instruct by @mudler in #3679
  • feat(api): add correlationID to Track Chat requests by @siddimore in #3668
  • models(gallery): add replete-llm-v2.5-qwen-14b by @mudler in #3688
  • models(gallery): add replete-llm-v2.5-qwen-7b by @mudler in #3689
  • models(gallery): add calme-2.2-qwen2.5-72b-i1 by @mudler in #3691
  • models(gallery): add salamandra-7b-instruct by @mudler in #3726
  • models(gallery): add mn-backyardai-party-12b-v1-iq-arm-imatrix by @mudler in #3740
  • models(gallery): add t.e-8.1-iq-imatrix-request by @mudler in #3741
  • models(gallery): add violet_twilight-v0.2-iq-imatrix by @mudler in #3742
  • models(gallery): add gemma-2-9b-it-abliterated by @mudler in #3743
  • models(gallery): add moe-girl-1ba-7bt-i1 by @mudler in #3766
  • models(gallery): add archfunctions models by @mudler in #3767
  • models(gallery): add versatillama-llama-3.2-3b-instruct-abliterated by @mudler in #3771
  • models(gallery): add llama3.2-3b-enigma by @mudler in #3772
  • models(gallery): add llama3.2-3b-esper2 by @mudler in #3773
  • models(gallery): add llama-3.1-swallow-70b-v0.1-i1 by @mudler in #3774
  • models(gallery): add rombos-llm-v2.5.1-qwen-3b by @mudler in #3778
  • models(gallery): add qwen2.5-7b-ins-v3 by @mudler in #3779
  • models(gallery): add dans-personalityengine-v1.0.0-8b by @mudler in #3780
  • models(gallery): add llama-3.2-3b-agent007 by @mudler in #3781
  • models(gallery): add nihappy-l3.1-8b-v0.09 by @mudler in #3782
  • models(gallery): add llama-3.2-3b-agent007-coder by @mudler in #3783
  • models(gallery): add fireball-meta-llama-3.2-8b-instruct-agent-003-128k-code-dpo by @mudler in #3784
  • models(gallery): add gemma-2-ataraxy-v3i-9b by @mudler in #3785

📖 Documentation and examples

👒 Dependencies

  • chore: ⬆️ Update ggerganov/llama.cpp to ea9c32be71b91b42ecc538bd902e93cbb5fb36cb by @localai-bot in #3667
  • chore: ⬆️ Update ggerganov/whisper.cpp to 69339af2d104802f3f201fd419163defba52890e by @localai-bot in #3666
  • chore: ⬆️ Update ggerganov/llama.cpp to 95bc82fbc0df6d48cf66c857a4dda3d044f45ca2 by @localai-bot in #3674
  • chore: ⬆️ Update ggerganov/llama.cpp to b5de3b74a595cbfefab7eeb5a567425c6a9690cf by @localai-bot in #3681
  • chore: ⬆️ Update ggerganov/whisper.cpp to 8feb375fbdf0277ad36958c218c6bf48fa0ba75a by @localai-bot in #3680
  • chore: ⬆️ Update ggerganov/llama.cpp to c919d5db39c8a7fcb64737f008e4b105ee0acd20 by @localai-bot in #3686
  • chore(deps): bump grpcio to 1.66.2 by @mudler in #3690
  • chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/langchain-chroma by @dependabot in #3697
  • chore(deps): Bump chromadb from 0.5.7 to 0.5.11 in /examples/langchain-chroma by @dependabot in #3696
  • chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain-chroma by @dependabot in #3694
  • chore: ⬆️ Update ggerganov/llama.cpp to 6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3 by @localai-bot in #3708
  • chore(deps): Bump securego/gosec from 2.21.0 to 2.21.4 by @dependabot in #3698
  • chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/functions by @dependabot in #3699
  • chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3704
  • chore(deps): Bump greenlet from 3.1.0 to 3.1.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3703
  • chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/functions by @dependabot in #3700
  • chore(deps): Bump langchain-community from 0.2.16 to 0.3.1 in /examples/langchain/langchainpy-localai-example by ...

v2.21.1

25 Sep 12:30
33b2d38

What's Changed

Bug fixes 🐛

  • fix(health): do not require auth for /healthz and /readyz by @mudler in #3656

👒 Dependencies

  • chore(deps): Bump sentence-transformers from 3.1.0 to 3.1.1 in /backend/python/sentencetransformers by @dependabot in #3651
  • chore(deps): Bump pydantic from 2.8.2 to 2.9.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3648
  • chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/functions by @dependabot in #3645
  • chore: ⬆️ Update ggerganov/llama.cpp to 70392f1f81470607ba3afef04aa56c9f65587664 by @localai-bot in #3659
  • chore(deps): Bump llama-index from 0.11.7 to 0.11.12 in /examples/langchain-chroma by @dependabot in #3639
  • chore(deps): Bump openai from 1.45.1 to 1.47.1 in /examples/langchain-chroma by @dependabot in #3641
  • chore(deps): Bump llama-index from 0.11.9 to 0.11.12 in /examples/chainlit by @dependabot in #3642
  • chore: ⬆️ Update ggerganov/whisper.cpp to 0d2e2aed80109e8696791083bde3b58e190b7812 by @localai-bot in #3658
  • chore(deps): Bump chromadb from 0.5.5 to 0.5.7 in /examples/langchain-chroma by @dependabot in #3640

Other Changes

Full Changelog: v2.21.0...v2.21.1

v2.21.0

24 Sep 14:22
90cacb9

💡 Highlights!

LocalAI v2.21 release is out!

  • Deprecation of the exllama backend
  • AIO images now have gpt-4o instead of gpt-4-vision-preview for Vision API
  • vLLM backend now supports embeddings
  • New endpoint to list system information (/system)
  • trust_remote_code is now respected by sentencetransformers
  • Auto warm-up and load models on start
  • coqui backend switched to the community-maintained fork

What's Changed

Breaking Changes 🛠

  • chore(exllama): drop exllama backend by @mudler in #3536
  • chore(aio): rename gpt-4-vision-preview to gpt-4o by @mudler in #3597

Exciting New Features 🎉

  • feat: elevenlabs sound-generation api by @dave-gray101 in #3355
  • feat(vllm): add support for embeddings by @mudler in #3440
  • feat: add endpoint to list system informations by @mudler in #3449
  • feat: extract output with regexes from LLMs by @mudler in #3491
  • feat: allow setting trust_remote_code for sentencetransformers backend by @Nyralei in #3552
  • feat(api): allow to pass videos to backends by @mudler in #3601
  • feat(api): allow to pass audios to backends by @mudler in #3603
  • feat: auto load into memory on startup by @sozercan in #3627
  • feat(coqui): switch to maintained community fork by @mudler in #3625

Bug fixes 🐛

  • fix(p2p): correctly allow to pass extra args to llama.cpp by @mudler in #3368
  • fix(model-loading): keep track of open GRPC Clients by @mudler in #3377
  • fix(tts): check error before inspecting result by @mudler in #3415
  • fix(shutdown): do not shutdown immediately busy backends by @mudler in #3543
  • fix(parler-tts): fix install with sycl by @mudler in #3624
  • fix(ci): fixup checksum scanning pipeline by @mudler in #3631
  • fix(hipblas): do not push all variants to hipblas builds by @mudler in #3630

🧠 Models

  • chore(model-gallery): add more quants for popular models by @mudler in #3365
  • models(gallery): add phi-3.5 by @mudler in #3376
  • models(gallery): add calme-2.1-phi3.5-4b-i1 by @mudler in #3383
  • models(gallery): add magnum-v3-34b by @mudler in #3384
  • models(gallery): add phi-3.5-vision by @mudler in #3421
  • Revert "models(gallery): add phi-3.5-vision" by @mudler in #3422
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3425
  • feat: Added Piper voice it-paola-medium by @fakezeta in #3434
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3442
  • models(gallery): add hubble-4b-v1 by @mudler in #3444
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3446
  • models(gallery): add yi-coder (and variants) by @mudler in #3482
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3486
  • models(gallery): add reflection-llama-3.1-70b by @mudler in #3487
  • models(gallery): add athena-codegemma-2-2b-it by @mudler in #3490
  • models(gallery): add azure_dusk-v0.2-iq-imatrix by @mudler in #3538
  • models(gallery): add mn-12b-lyra-v4-iq-imatrix by @mudler in #3539
  • models(gallery): add datagemma models by @mudler in #3540
  • models(gallery): add l3.1-8b-niitama-v1.1-iq-imatrix by @mudler in #3550
  • models(gallery): add llama-3.1-8b-stheno-v3.4-iq-imatrix by @mudler in #3551
  • fix: gallery/index.yaml comment spacing by @dave-gray101 in #3585
  • models(gallery): add qwen2.5-14b-instruct by @mudler in #3607
  • models(gallery): add qwen2.5-math-7b-instruct by @mudler in #3609
  • models(gallery): add qwen2.5-14b_uncencored by @mudler in #3610
  • models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #3611
  • models(gallery): add qwen2.5-math-72b-instruct by @mudler in #3612
  • models(gallery): add qwen2.5-0.5b-instruct, qwen2.5-1.5b-instruct by @mudler in #3613
  • models(gallery): add qwen2.5 32B, 72B, 32B Instruct by @mudler in #3614
  • models(gallery): add llama-3.1-supernova-lite-reflection-v1.0-i1 by @mudler in #3615
  • models(gallery): add llama-3.1-supernova-lite by @mudler in #3616
  • models(gallery): add llama3.1-8b-shiningvaliant2 by @mudler in #3617
  • models(gallery): add buddy2 by @mudler in #3618
  • models(gallery): add llama-3.1-8b-arliai-rpmax-v1.1 by @mudler in #3619
  • Fix NeuralDaredevil URL by @nyx4ris in #3621
  • models(gallery): add nightygurps-14b-v1.1 by @mudler in #3633
  • models(gallery): add gemma-2-9b-arliai-rpmax-v1.1 by @mudler in #3634
  • models(gallery): add gemma-2-2b-arliai-rpmax-v1.1 by @mudler in #3635
  • models(gallery): add acolyte-22b-i1 by @mudler in #3636

📖 Documentation and examples

👒 Dependencies

  • chore: ⬆️ Update ggerganov/llama.cpp to 3ba780e2a8f0ffe13f571b27f0bbf2ca5a199efc by @localai-bot in #3361
  • chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/functions by @dependabot in #3390
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 82a5e98 to 3a0ae52 by @dependabot in #3391
  • chore(deps): Bump idna from 3.7 to 3.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #3399
  • chore(deps): Bump llama-index from 0.10.65 to 0.11.1 in /examples/chainlit by @dependabot in #3404
  • chore(deps): Bump llama-index from 0.10.67.post1 to 0.11.1 in /examples/langchain-chroma by @dependabot in #3406
  • chore(deps): Bump marshmallow from 3.21.3 to 3.22.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3400
  • chore(deps): Bump openai from 1.40.5 to 1.42.0 in /examples/langchain-chroma by @dependabot in #3405
  • chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3401
  • chore(deps): update edgevpn to v0.28 by @mudler in #3412
  • chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/functions by @dependabot in #3453
  • chore(deps): Bump certifi from 2024.7.4 to 2024.8.30 in /examples/langchain/langchainpy-localai-example by @dependabot in #3457
  • chore(deps): Bump yarl from 1.9.4 to 1.9.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3459
  • chore(deps): Bump langchain-community from 0.2.12 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3461
  • chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/chainlit by @dependabot in #3462
  • chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/langchain-chroma by @dependabot in #3467
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 3a0ae52 to 550a6ee by @dependabot in #3472
  • chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/functions by @dependabot in #3452
  • chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3460
  • chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/langchain-chroma by @dependabot in #3468
  • chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain-chroma by ...