From d42b1eb0deeb2f97858ca647200731f57adce540 Mon Sep 17 00:00:00 2001
From: Zhanghao Wu
Date: Tue, 19 Mar 2024 00:24:36 +0000
Subject: [PATCH] fix lorax logo

---
 llm/lorax/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llm/lorax/README.md b/llm/lorax/README.md
index cd9ffb171f7..2fe548c92a8 100644
--- a/llm/lorax/README.md
+++ b/llm/lorax/README.md
@@ -4,7 +4,7 @@

- LoRAX
+ LoRAX

[LoRAX](https://github.com/predibase/lorax) (LoRA eXchange) is a framework that allows users to serve thousands of fine-tuned LLMs on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency. It works by dynamically loading multiple fine-tuned "adapters" (LoRAs, etc.) on top of a single base model at runtime. Concurrent requests for different adapters can be processed together in a single batch, allowing LoRAX to maintain near-linear throughput scaling as the number of adapters increases.