From 0a65c54070ad613110b6bd1d4e559695a00e1dff Mon Sep 17 00:00:00 2001
From: Mateusz Charytoniuk
Date: Tue, 16 Jul 2024 21:08:44 +0200
Subject: [PATCH] readme: paddlin'

---
 README.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/README.md b/README.md
index c346ce3..aa42d54 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,7 @@
 # Paddler
 
+
+
 Paddler is an open-source load balancer and reverse proxy designed to optimize servers running [llama.cpp](https://github.com/ggerganov/llama.cpp).
 
 Typical strategies like round robin or least connections are not effective for [llama.cpp](https://github.com/ggerganov/llama.cpp) servers, which need slots for continuous batching and concurrent requests.
@@ -179,6 +181,16 @@ StatsD metrics need to be enabled with the following flags:
 
 ## Changelog
 
+### v0.4.0
+
+Thank you, [@ScottMcNaught](https://github.com/ScottMcNaught), for the help with debugging the issues! :)
+
+#### Fixes
+
+- OpenAI compatible endpoint is now properly balanced (`/v1/chat/completions`)
+- Balancer's reverse proxy `panic`ked in some scenarios when the underlying `llama.cpp` instance was abruptly closed during the generation of completion tokens
+- Added mutex in the targets collection for better internal slots data integrity
+
 ### v0.3.0
 
 #### Features
@@ -193,6 +205,12 @@ StatsD metrics need to be enabled with the following flags:
 
 * [Aggregated Health Status Responses](https://github.com/distantmagic/paddler/releases/tag/v0.1.0)
 
+## Why the Name
+
+I initially wanted to use [Raft](https://raft.github.io/) consensus algorithm (thus Paddler, because it paddles on a Raft), but eventually, I dropped that idea. The name stayed, though.
+
+Later, people started sending me a "that's a paddlin'" clip from The Simpsons, and I just embraced it.
+
 ## Community
 
 Discord: https://discord.gg/kysUzFqSCK