v2.20.0
TL;DR
- π Explorer & Community: Explore global community pools at explorer.localai.io
- π Demo instance available: Test out LocalAI at demo.localai.io
- π€ Integration: Hugging Face Local apps now include LocalAI
- π Bug Fixes: Diffusers and hipblas issues resolved
- π¨ New Feature: FLUX-1 image generation support
- ποΈ Strict Mode: Stay compliant with OpenAIβs latest API changes
- πͺ Multiple P2P Clusters: Run multiple clusters within the same network
- π§ͺ Deprecation Notice:
gpt4all.cpp
andpetals
backends deprecated
π Explorer and Global Community Pools
Now you can share your LocalAI instance with the global community or explore available instances by visiting explorer.localai.io. This decentralized network powers our demo instance, creating a truly collaborative AI experience.
How It Works
Using the Explorer, you can easily share or connect to clusters. For detailed instructions on creating new clusters or connecting to existing ones, check out our documentation.
π Demo Instance Now Available
Curious about what LocalAI can do? Dive right in with our live demo at demo.localai.io! Thanks to our generous sponsors, this instance is publicly available and configured via peer-to-peer (P2P) networks. If you'd like to connect, follow the instructions here.
π€ Hugging Face Integration
I am excited to announce that LocalAI is now integrated within Hugging Faceβs local apps! This means you can select LocalAI directly within Hugging Face to build and deploy models with the power and flexibility of our platform. Experience seamless integration with a single click!
This integration was made possible through this PR.
π¨ FLUX-1 Image Generation Support
FLUX-1 lands in LocalAI! With this update, LocalAI can now generate stunning images using FLUX-1, even in federated mode. Whether you're experimenting with new designs or creating production-quality visuals, FLUX-1 has you covered.
Try it out at demo.localai.io and see what LocalAI + FLUX-1 can do!
π Diffusers and hipblas Fixes
Great news for AMD users! If youβve encountered issues with the Diffusers backend or hipblas, those bugs have been resolved. Weβve transitioned to uv
for managing Python dependencies, ensuring a smoother experience. For more details, check out Issue #1592.
ποΈ Strict Mode for API Compliance
To stay up to date with OpenAIβs latest changes, now LocalAI have support as well for Strict Mode ( https://openai.com/index/introducing-structured-outputs-in-the-api/ ). This new feature ensures compatibility with the most recent API updates, enforcing stricter JSON outputs using BNF grammar rules.
To activate, simply set strict: true
in your API calls, even if itβs disabled in your configuration.
Key Notes:
- Setting
strict: true
enables grammar enforcement, even if disabled in your config. - If
format_type
is set tojson_schema
, BNF grammars will be automatically generated from the schema.
π Disable Gallery
Need to streamline your setup? You can now disable the gallery endpoint using LOCALAI_DISABLE_GALLERY_ENDPOINT
. For more options, check out the full list of commands with --help
.
π P2P and Federation Enhancements
Several enhancements have been made to improve your experience with P2P and federated clusters:
- Load Balancing by Default: This feature is now enabled by default (disable it with
LOCALAI_RANDOM_WORKER
if needed). - Target Specific Workers: Directly target workers in federated mode using
LOCALAI_TARGET_WORKER
.
πͺ Run Multiple P2P Clusters in the Same Network
You can now run multiple clusters within the same network by specifying a network ID via CLI. This allows you to logically separate clusters while using the same shared token. Just set LOCALAI_P2P_NETWORK_ID
to a UUID that matches across instances.
Please note, while this offers segmentation, itβs not fully secureβanyone with the network token can view available services within the network.
π§ͺ Deprecation Notice: gpt4all.cpp
and petals
Backends
As we continue to evolve, we are officially deprecating the gpt4all.cpp
and petals
backends. The newer llama.cpp
offers a superior set of features and better performance, making it the preferred choice moving forward.
From this release onward, gpt4all
models in ggml
format are no longer compatible. Additionally, the petals
backend has been deprecated as well. LocalAIβs new P2P capabilities now offer a comprehensive replacement for these features.
What's Changed
Breaking Changes π
Bug fixes π
- fix(ui): do not show duplicate entries if not installed by gallery by @mudler in #3107
- fix: be consistent in downloading files, check for scanner errors by @mudler in #3108
- fix: ensure correct version of torch is always installed based on BUI⦠by @cryptk in #2890
- fix(python): move accelerate and GPU-specific libs to build-type by @mudler in #3194
- fix(apple): disable BUILD_TYPE metal on fallback by @mudler in #3199
- fix(vall-e-x): pin hipblas deps by @mudler in #3201
- fix(diffusers): use nightly rocm for hipblas builds by @mudler in #3202
- fix(explorer): reset counter when network is active by @mudler in #3213
- fix(p2p): allocate tunnels only when needed by @mudler in #3259
- fix(gallery): be consistent and disable UI routes as well by @mudler in #3262
- fix(parler-tts): bump and require after build type deps by @mudler in #3272
- fix: add llvm to extra images by @mudler in #3321
- fix(p2p): re-use p2p host when running federated mode by @mudler in #3341
- fix(ci): pin to llvmlite 0.43 by @mudler in #3342
- fix(p2p): avoid starting the node twice by @mudler in #3349
- fix(chat): re-generated uuid, created, and text on each request by @mudler in #3359
Exciting New Features π
- feat(guesser): add gemma2 by @sozercan in #3118
- feat(venv): shared env by @mudler in #3195
- feat(openai): add
json_schema
format type and strict mode by @mudler in #3193 - feat(p2p): allow to run multiple clusters in the same p2p network by @mudler in #3128
- feat(p2p): add network explorer and community pools by @mudler in #3125
- feat(explorer): relax token deletion with error threshold by @mudler in #3211
- feat(diffusers): support flux models by @mudler in #3129
- feat(explorer): make possible to run sync in a separate process by @mudler in #3224
- feat(federated): allow to pickup a specific worker, improve loadbalancing by @mudler in #3243
- feat: Initial Version of vscode DevContainer by @dave-gray101 in #3217
- feat(explorer): visual improvements by @mudler in #3247
- feat(gallery): lazy load images by @mudler in #3246
- chore(explorer): add join instructions by @mudler in #3255
- chore: allow to disable gallery endpoints, improve p2p connection handling by @mudler in #3256
- chore(ux): add animated header with anime.js in p2p sections by @mudler in #3271
- chore(p2p): make commands easier to copy-paste by @mudler in #3273
- chore(ux): allow to create and drag dots in the animation by @mudler in #3287
- feat(federation): do not allocate local services for load balancing by @mudler in #3337
- feat(p2p): allow to set intervals by @mudler in #3353
π§ Models
- models(gallery): add meta-llama-3.1-instruct-9.99b-brainstorm-10x-form-3 by @mudler in #3103
- models(gallery): add mn-12b-celeste-v1.9 by @mudler in #3104
- models(gallery): add shieldgemma by @mudler in #3105
- models(gallery): add llama-3.1-techne-rp-8b-v1 by @mudler in #3112
- models(gallery): add llama-spark by @mudler in #3116
- models(gallery): add glitz by @mudler in #3119
- models(gallery): add gemmasutra-mini by @mudler in #3120
- models(gallery): add kumiho-v1-rp-uwu-8b by @mudler in #3121
- models(gallery): add humanish-roleplay-llama-3.1-8b-i1 by @mudler in #3126
- chore(model-gallery): β¬οΈ update checksum by @localai-bot in #3167
- models(gallery): add calme-2.2-qwen2-72b by @mudler in #3185
- models(gallery): add calme-2.3-legalkit-8b by @mudler in #3200
- chore(model-gallery): β¬οΈ update checksum by @localai-bot in #3210
- models(gallery): add flux.1-dev and flux.1-schnell by @mudler in #3215
- models(gallery): add infinity-instruct-7m-gen-llama3_1-70b by @mudler in #3220
- models(gallery): add cathallama-70b by @mudler in #3221
- models(gallery): add edgerunner-tactical-7b by @mudler in #3249
- models(gallery): add hermes-3 by @mudler in #3252
- models(gallery): add SmolLM by @mudler in #3265
- models(gallery): add mahou-1.3-llama3.1-8b by @mudler in #3266
- models(gallery): add fireball-llama-3.11-8b-v1orpo by @mudler in #3267
- models(gallery): add rocinante-12b-v1.1 by @mudler in #3268
- models(gallery): add pantheon-rp-1.6-12b-nemo by @mudler in #3269
- models(gallery): add llama-3.1-storm-8b-q4_k_m by @mudler in #3270
π Documentation and examples
- docs: β¬οΈ update docs version mudler/LocalAI by @localai-bot in #3109
- fix(docs): Refer to the OpenAI documentation to update the openai-functions docu⦠by @jermeyhu in #3317
- chore(docs): update p2p env var documentation by @mudler in #3350
π Dependencies
- chore: β¬οΈ Update ggerganov/llama.cpp by @localai-bot in #3110
- chore: β¬οΈ Update ggerganov/llama.cpp by @localai-bot in #3115
- chore: β¬οΈ Update ggerganov/llama.cpp by @localai-bot in #3117
- chore: β¬οΈ Update ggerganov/llama.cpp by @localai-bot in #3123
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/autogptq by @dependabot in #3130
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/common/template by @dependabot in #3131
- chore(deps): Bump langchain from 0.2.10 to 0.2.12 in /examples/functions by @dependabot in #3132
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/openvoice by @dependabot in #3137
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/coqui by @dependabot in #3138
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/transformers-musicgen by @dependabot in #3140
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/diffusers by @dependabot in #3141
- chore(deps): Bump llama-index from 0.10.56 to 0.10.59 in /examples/chainlit by @dependabot in #3143
- chore(deps): Bump docs/themes/hugo-theme-relearn from
7aec99b
to8b14837
by @dependabot in #3142 - chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/exllama2 by @dependabot in #3146
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/bark by @dependabot in #3144
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/rerankers by @dependabot in #3147
- chore(deps): Bump langchain from 0.2.10 to 0.2.12 in /examples/langchain-chroma by @dependabot in #3148
- chore(deps): Bump streamlit from 1.37.0 to 1.37.1 in /examples/streamlit-bot by @dependabot in #3151
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/vllm by @dependabot in #3152
- chore(deps): Bump langchain from 0.2.11 to 0.2.12 in /examples/langchain/langchainpy-localai-example by @dependabot in #3155
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/transformers by @dependabot in #3161
- chore(deps): Bump grpcio from 1.65.1 to 1.65.4 in /backend/python/vall-e-x by @dependabot in #3156
- chore(deps): Bump sqlalchemy from 2.0.31 to 2.0.32 in /examples/langchain/langchainpy-localai-example by @dependabot in #3157
- chore: β¬οΈ Update ggerganov/whisper.cpp by @localai-bot in #3164
- chore(deps): Bump openai from 1.37.0 to 1.39.0 in /examples/functions by @dependabot in #3134
- chore(deps): Bump openai from 1.37.0 to 1.39.0 in /examples/langchain-chroma by @dependabot in #3149
- chore(deps): Bump openai from 1.37.1 to 1.39.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3158
- chore: β¬οΈ Update ggerganov/llama.cpp by @mudler in #3166
- chore(deps): Bump tqdm from 4.66.4 to 4.66.5 in /examples/langchain/langchainpy-localai-example by @dependabot in #3159
- chore(deps): Bump llama-index from 0.10.56 to 0.10.61 in /examples/langchain-chroma by @dependabot in #3168
- chore: β¬οΈ Update ggerganov/llama.cpp to
1e6f6554aa11fa10160a5fda689e736c3c34169f
by @mudler in #3189 - chore: β¬οΈ Update ggerganov/llama.cpp to
15fa07a5c564d3ed7e7eb64b73272cedb27e73ec
by @localai-bot in #3197 - chore: β¬οΈ Update ggerganov/whisper.cpp to
6eac06759b87b50132a01be019e9250a3ffc8969
by @localai-bot in #3203 - chore: β¬οΈ Update ggerganov/llama.cpp to
3a14e00366399040a139c67dd5951177a8cb5695
by @localai-bot in #3204 - chore(deps): Bump aiohttp from 3.9.5 to 3.10.2 in /examples/langchain/langchainpy-localai-example in the pip group by @dependabot in #3207
- chore: β¬οΈ Update ggerganov/llama.cpp to
b72942fac998672a79a1ae3c03b340f7e629980b
by @localai-bot in #3208 - chore: β¬οΈ Update ggerganov/whisper.cpp to
81c999fe0a25c4ebbfef10ed8a1a96df9cfc10fd
by @localai-bot in #3209 - chore: β¬οΈ Update ggerganov/llama.cpp to
6e02327e8b7837358e0406bf90a4632e18e27846
by @localai-bot in #3212 - chore(deps): update edgevpn by @mudler in #3214
- chore: β¬οΈ Update ggerganov/llama.cpp to
4134999e01f31256b15342b41c4de9e2477c4a6c
by @localai-bot in #3218 - chore(deps): Bump llama-index from 0.10.61 to 0.10.65 in /examples/langchain-chroma by @dependabot in #3225
- chore(deps): Bump langchain-community from 0.2.9 to 0.2.11 in /examples/langchain/langchainpy-localai-example by @dependabot in #3230
- chore(deps): Bump attrs from 23.2.0 to 24.2.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3232
- chore(deps): Bump pyyaml from 6.0.1 to 6.0.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3231
- chore(deps): Bump llama-index from 0.10.59 to 0.10.65 in /examples/chainlit by @dependabot in #3238
- chore: β¬οΈ Update ggerganov/llama.cpp to
fc4ca27b25464a11b3b86c9dbb5b6ed6065965c2
by @localai-bot in #3240 - chore(deps): Bump openai from 1.39.0 to 1.40.5 in /examples/langchain-chroma by @dependabot in #3241
- chore: β¬οΈ Update ggerganov/whisper.cpp to
22fcd5fd110ba1ff592b4e23013d870831756259
by @localai-bot in #3239 - chore(deps): Bump aiohttp from 3.10.2 to 3.10.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3234
- chore(deps): Bump openai from 1.39.0 to 1.40.6 in /examples/langchain/langchainpy-localai-example by @dependabot in #3244
- chore: β¬οΈ Update ggerganov/llama.cpp to
06943a69f678fb32829ff06d9c18367b17d4b361
by @localai-bot in #3245 - chore(deps): Bump openai from 1.39.0 to 1.40.4 in /examples/functions by @dependabot in #3235
- chore: β¬οΈ Update ggerganov/llama.cpp to
5fd89a70ead34d1a17015ddecad05aaa2490ca46
by @localai-bot in #3248 - chore(deps): bump llama.cpp, rename
llama_add_bos_token
by @mudler in #3253 - chore: β¬οΈ Update ggerganov/llama.cpp to
8b3befc0e2ed8fb18b903735831496b8b0c80949
by @localai-bot in #3257 - chore: β¬οΈ Update ggerganov/llama.cpp to
2fb9267887d24a431892ce4dccc75c7095b0d54d
by @localai-bot in #3260 - chore: β¬οΈ Update ggerganov/llama.cpp to
554b049068de24201d19dde2fa83e35389d4585d
by @localai-bot in #3263 - chore(deps): Bump langchain from 0.2.12 to 0.2.14 in /examples/langchain-chroma by @dependabot in #3275
- chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/openvoice by @dependabot in #3282
- chore(deps): Bump docs/themes/hugo-theme-relearn from
8b14837
to82a5e98
by @dependabot in #3274 - chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/bark by @dependabot in #3285
- chore(deps): Bump grpcio from 1.65.1 to 1.65.5 in /backend/python/parler-tts by @dependabot in #3283
- chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/common/template by @dependabot in #3291
- chore(deps): Bump grpcio from 1.65.1 to 1.65.5 in /backend/python/sentencetransformers by @dependabot in #3292
- chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/vall-e-x by @dependabot in #3294
- chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/transformers by @dependabot in #3296
- chore(deps): Bump grpcio from 1.65.0 to 1.65.5 in /backend/python/exllama by @dependabot in #3299
- chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/vllm by @dependabot in #3301
- chore(deps): Bump langchain from 0.2.12 to 0.2.14 in /examples/functions by @dependabot in #3304
- chore(deps): Bump numpy from 2.0.1 to 2.1.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3310
- chore(deps): Bump grpcio from 1.65.1 to 1.65.5 in /backend/python/mamba by @dependabot in #3313
- chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/coqui by @dependabot in #3306
- chore(deps): Bump grpcio from 1.65.4 to 1.65.5 in /backend/python/transformers-musicgen by @dependabot in #3308
- chore(deps): Bump langchain-community from 0.2.11 to 0.2.12 in /examples/langchain/langchainpy-localai-example by @dependabot in #3311
- chore: β¬οΈ Update ggerganov/llama.cpp to
cfac111e2b3953cdb6b0126e67a2487687646971
by @localai-bot in #3315 - chore(deps): Bump openai from 1.40.4 to 1.41.1 in /examples/functions by @dependabot in #3319
- chore(deps): Bump openai from 1.40.6 to 1.41.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3320
- chore(deps): Bump llama-index from 0.10.65 to 0.10.67.post1 in /examples/langchain-chroma by @dependabot in #3335
- chore(deps): update edgevpn by @mudler in #3340
- chore(deps): Bump langchain from 0.2.12 to 0.2.14 in /examples/langchain/langchainpy-localai-example by @dependabot in #3307
- chore(deps): update edgevpn by @mudler in #3346
- chore: β¬οΈ Update ggerganov/whisper.cpp to
d65786ea540a5aef21f67cacfa6f134097727780
by @localai-bot in #3344 - chore: β¬οΈ Update ggerganov/llama.cpp to
2f3c1466ff46a2413b0e363a5005c46538186ee6
by @localai-bot in #3345 - chore: β¬οΈ Update ggerganov/llama.cpp to
fc54ef0d1c138133a01933296d50a36a1ab64735
by @localai-bot in #3356 - chore: β¬οΈ Update ggerganov/whisper.cpp to
9e3c5345cd46ea718209db53464e426c3fe7a25e
by @localai-bot in #3357
Other Changes
- feat(swagger): update swagger by @localai-bot in #3196
- fix: devcontainer part 1 by @dave-gray101 in #3254
- fix: devcontainer pt 2 by @dave-gray101 in #3258
- feat: devcontainer part 3 by @dave-gray101 in #3318
- feat: devcontainer part 4 by @dave-gray101 in #3339
- feat(swagger): update swagger by @localai-bot in #3343
- chore(anime.js): drop unused by @mudler in #3351
- chore(p2p): single-node when sharing federated instance by @mudler in #3354
New Contributors
Full Changelog: v2.19.4...v2.20.0