
Commit

Update README.md (#28)
geoffreyangus authored Nov 16, 2023
1 parent cf6fbd8 commit a9426bb
Showing 1 changed file with 14 additions and 6 deletions.
@@ -36,6 +36,12 @@ Lorax is a framework that allows users to serve over a hundred fine-tuned models
- **Production Readiness:** Reliably stable, Lorax supports Prometheus metrics and distributed tracing with OpenTelemetry
- 🤯 **Free Commercial Use:** Apache 2.0 License. Enough said 😎.


<p align="center">
<img src="https://github.com/predibase/lorax/assets/29719151/6f4f78fc-c1e9-4a01-8675-dbafa74a2534" />
</p>


## 🏠 Optimized architectures

- 🦙 [Llama V2](https://huggingface.co/meta-llama)
@@ -56,7 +62,7 @@ or
The easiest way to get started is with the official Docker container:

```shell
- model=mistralai/Mistral-7B-v0.1
+ model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/lorax-inference:0.9.4 --model-id $model
@@ -73,14 +79,14 @@ You can then query the model using either the `/generate` or `/generate_stream` routes
```shell
curl 127.0.0.1:8080/generate \
-X POST \
- -d '{"inputs":"What is Deep Learning?","parameters":{"adapter_id":"some/adapter"}}' \
+ -d '{"inputs": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?", "parameters": {"adapter_id": "vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k"}}' \
-H 'Content-Type: application/json'
```

```shell
curl 127.0.0.1:8080/generate_stream \
-X POST \
- -d '{"inputs":"What is Deep Learning?","parameters":{"adapter_id":"some/adapter"}}' \
+ -d '{"inputs": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?", "parameters": {"adapter_id": "vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k"}}' \
-H 'Content-Type: application/json'
```
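The same `/generate` request can also be made from Python with the `requests` library instead of curl. This is just a quick sketch, not part of the commit itself: it assumes the server is reachable at `127.0.0.1:8080` (per the Docker port mapping above) and that the response JSON carries a `generated_text` field, mirroring what the Python client below exposes.

```python
import requests

# Assumes the LoRAX server started with the Docker command above, listening on 127.0.0.1:8080.
url = "http://127.0.0.1:8080/generate"

payload = {
    "inputs": (
        "Natalia sold clips to 48 of her friends in April, and then she sold half as many "
        "clips in May. How many clips did Natalia sell altogether in April and May?"
    ),
    # Same adapter as in the curl examples above.
    "parameters": {"adapter_id": "vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k"},
}

response = requests.post(url, json=payload, timeout=120)
response.raise_for_status()

# Assumption: the response body contains a "generated_text" field,
# matching the .generated_text attribute used by the Python client below.
print(response.json()["generated_text"])
```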

@@ -94,10 +100,12 @@ pip install lorax-client
from lorax import Client

client = Client("http://127.0.0.1:8080")
- print(client.generate("What is Deep Learning?", adapter_id="some/adapter").generated_text)
+ prompt = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
+
+ print(client.generate(prompt, adapter_id="vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k").generated_text)

text = ""
- for response in client.generate_stream("What is Deep Learning?", adapter_id="some/adapter"):
+ for response in client.generate_stream(prompt, adapter_id="vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k"):
if not response.token.special:
text += response.token.text
print(text)
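Since a single Lorax deployment can serve many fine-tuned adapters (the framework description above mentions over a hundred), the same client can target a different adapter on every request. A minimal sketch of that pattern; the second adapter ID is a hypothetical placeholder, not a published adapter:

```python
from lorax import Client

client = Client("http://127.0.0.1:8080")
prompt = "What is 6 times 7?"

adapters = [
    # Adapter used in the examples above.
    "vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k",
    # Hypothetical placeholder; swap in any adapter you have access to.
    "your-org/your-mistral-7b-instruct-adapter",
]

# Each request names the adapter it wants; the server handles them all from one deployment.
for adapter_id in adapters:
    result = client.generate(prompt, adapter_id=adapter_id)
    print(f"{adapter_id}: {result.generated_text}")
```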
@@ -109,4 +117,4 @@ You can consult the OpenAPI documentation of the `lorax` REST API using the `/docs` route
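As a quick sanity check, you can confirm the `/docs` route is being served; this sketch assumes the same `127.0.0.1:8080` address used throughout this README:

```python
import requests

# /docs serves the OpenAPI documentation mentioned above.
resp = requests.get("http://127.0.0.1:8080/docs", timeout=10)
print(resp.status_code)  # expect 200 once the server is up
```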

### 🛠️ Local install

MAGDY AND WAEL TODO
