
Merge branch 'main' into issue-605
rlouf authored Mar 6, 2024
2 parents d284bdc + a62ff00 commit 1bee16d
Showing 70 changed files with 2,921 additions and 1,437 deletions.
36 changes: 36 additions & 0 deletions .github/workflows/release_docker.yml
@@ -0,0 +1,36 @@
name: Release Docker

on:
  release:
    types:
      - created
  workflow_dispatch:
    inputs:
      release_tag:
        description: 'Release Tag (for manual dispatch)'
        required: false
        default: 'latest'
jobs:
  release-job:
    name: Build and publish on Docker Hub
    runs-on: ubuntu-latest
    environment: release
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Log in to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: |
            outlinesdev/outlines:latest
            outlinesdev/outlines:${{ github.event.release.tag_name || github.event.inputs.release_tag }}
          build-args: |
            BUILDKIT_CONTEXT_KEEP_GIT_DIR=true
      - name: Clean docker cache
        run: docker system prune --all --force
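Besides release events, the `workflow_dispatch` trigger means the image can also be published manually. A possible invocation with the GitHub CLI (the tag value below is illustrative):

``` bash
# Manually dispatch the Docker release workflow with an explicit tag
gh workflow run release_docker.yml -f release_tag=v0.1.0
```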
@@ -1,16 +1,16 @@
name: Release
name: Release PyPi

on:
  release:
    types:
      - created

jobs:
  release-job:
    name: Build and publish on PyPi
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Checkout
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
2 changes: 2 additions & 0 deletions .gitignore
@@ -4,3 +4,5 @@ __pycache__
docs/build
.coverage
.idea/
*.gguf
.venv
17 changes: 17 additions & 0 deletions Dockerfile
@@ -0,0 +1,17 @@
FROM python:3.10

WORKDIR /outlines

RUN pip install --upgrade pip

# Copy necessary build components
COPY pyproject.toml .
COPY outlines ./outlines

# Install outlines and outlines[serve]
# .git required by setuptools-scm
RUN --mount=source=.git,target=.git,type=bind \
pip install --no-cache-dir .[serve]

# https://outlines-dev.github.io/outlines/reference/vllm/
ENTRYPOINT python3 -m outlines.serve.serve
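A note on building this image locally: the `RUN --mount=...,type=bind` line needs BuildKit, so older Docker installations may have to enable it explicitly. A minimal sketch (the image tag is arbitrary):

``` bash
# BuildKit is required for the bind-mount of .git (used by setuptools-scm)
DOCKER_BUILDKIT=1 docker build -t outlines-serve .

# Expose the serve endpoint on port 8000 (see the contributing guide further
# down for passing a --model flag to the server)
docker run -p 8000:8000 outlines-serve
```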
122 changes: 55 additions & 67 deletions README.md
@@ -9,12 +9,15 @@
[![Discord][discord-badge]][discord]
[![Twitter][twitter-badge]][twitter]

*Robust (guided) text generation.*
*Robust (structured) text generation.*

🙏 Help us by answering the [developer survey](https://h1xbpbfsf0w.typeform.com/to/EeDhccYI) 🙏

</div>

Made with ❤👷️ by the team at [.txt](https://dottxt.co).
We'd love to have your [feedback][discord]!

Looking for an API that returns valid JSON? [Give .json a try](https://h1xbpbfsf0w.typeform.com/to/ZgBCvJHF)

``` bash
pip install outlines
@@ -27,19 +30,20 @@ First time here? Go to our [setup guide](https://outlines-dev.github.io/outlines
- [x] 🤖 [Multiple model integrations](https://outlines-dev.github.io/outlines/installation): OpenAI, transformers, llama.cpp, exllama2, mamba
- [x] 🖍️ Simple and powerful prompting primitives based on the [Jinja templating engine](https://jinja.palletsprojects.com/)
- [x] 🚄 [Multiple choices](#multiple-choices), [type constraints](#type-constraint) and dynamic stopping
- [x] ⚡ Fast [regex-guided generation](#efficient-regex-guided-generation)
- [x] ⚡ Fast [regex-structured generation](#efficient-regex-structured-generation)
- [x] 🔥 Fast [JSON generation](#efficient-json-generation-following-a-pydantic-model) following a JSON schema or a Pydantic model
- [x] 📝 [Grammar-guided generation](#using-context-free-grammars-to-guide-generation)
- [x] 📝 [Grammar-structured generation](#using-context-free-grammars-to-guide-generation)
- [x] 🐍 Interleave completions with loops, conditionals, and custom Python functions
- [x] 💾 Caching of generations
- [x] 🗂️ Batch inference
- [x] 🚀 [Serve with vLLM](https://outlines-dev.github.io/outlines/reference/vllm)
- [x] 🎲 Sample with the greedy, multinomial and beam search algorithms (and more to come!)
- [x] 🚀 [Serve with vLLM](https://outlines-dev.github.io/outlines/reference/vllm), with official Docker image, [`outlinesdev/outlines`](https://hub.docker.com/r/outlinesdev/outlines)!


Outlines 〰 has new releases and features coming every week. Make sure to ⭐ star and 👀 watch this repository, and follow [@dottxtai][twitter] to stay up to date!


## Guided generation
## Structured generation

The first step towards reliability of systems that include large language models
is to ensure that there is a well-defined interface between their output and
@@ -53,7 +57,7 @@ You can reduce the completion to a choice between multiple possibilities:
``` python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?
@@ -73,53 +77,56 @@ You can instruct the model to only return integers or floats:
``` python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")

prompt = "1+1="
prompt = "<s>result of 9 + 9 = 18</s><s>result of 1 + 2 = "
answer = outlines.generate.format(model, int)(prompt)
print(answer)
# 3

prompt = "sqrt(2)="

generator = outlines.generate.format(model, float)
answer = generator(prompt)
answer = generator(prompt, max_tokens=10)
print(answer)
# 1.41421356
```

### Efficient regex-guided generation
### Efficient regex-structured generation

Outlines also comes with fast regex-guided generation. In fact, the `choice` and
`format` functions above all use regex-guided generation under the
Outlines also comes with fast regex-structured generation. In fact, the `choice` and
`format` functions above all use regex-structured generation under the
hood:

``` python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

prompt = "What is the IP address of the Google DNS servers? "

generator = outlines.generate.text(model)
unguided = generator(prompt, max_tokens=30)
unstructured = generator(prompt, max_tokens=30)

generator = outlines.generate.regex(
model,
r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
)
guided = generator(prompt, max_tokens=30)
structured = generator(prompt, max_tokens=30)

print(unguided)
print(unstructured)
# What is the IP address of the Google DNS servers?
#
# Passive DNS servers are at DNS servers that are private.
# In other words, both IP servers are private. The database
# does not contain Chelsea Manning

print(guided)
print(structured)
# What is the IP address of the Google DNS servers?
# 2.2.6.1
```

Unlike other libraries, regex-guided generation in Outlines is almost as fast
as non-guided generation.
Unlike other libraries, regex-structured generation in Outlines is almost as fast
as non-structured generation.

### Efficient JSON generation following a Pydantic model

@@ -156,34 +163,24 @@ class Character(BaseModel):
strength: int


model = outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Construct guided sequence generator
generator = outlines.generate.json(model, Character, max_tokens=100)
# Construct structured sequence generator
generator = outlines.generate.json(model, Character)

# Draw a sample
rng = torch.Generator(device="cuda")
rng.manual_seed(789001)

sequence = generator("Give me a character description", rng=rng)
print(sequence)
# {
# "name": "clerame",
# "age": 7,
# "armor": "plate",
# "weapon": "mace",
# "strength": 4171
# }

sequence = generator("Give me an interesting character description", rng=rng)
print(sequence)
# {
# "name": "piggyback",
# "age": 23,
# "armor": "chainmail",
# "weapon": "sword",
# "strength": 0
# }
character = generator("Give me a character description", rng=rng)

print(repr(character))
# Character(name='Anderson', age=28, armor=<Armor.chainmail: 'chainmail'>, weapon=<Weapon.sword: 'sword'>, strength=8)

character = generator("Give me an interesting character description", rng=rng)

print(repr(character))
# Character(name='Vivian Thr', age=44, armor=<Armor.plate: 'plate'>, weapon=<Weapon.crossbow: 'crossbow'>, strength=125)
```

The method works with union types, optional types, arrays, nested schemas, etc. Some field constraints are [not supported yet](https://github.com/outlines-dev/outlines/issues/215), but everything else should work.
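To make the "union types, optional types" claim concrete, here is a minimal sketch reusing the `generate.json` API shown above; the model name, field choices, and printed output are illustrative:

``` python
from typing import Optional, Union

from pydantic import BaseModel

import outlines


class Pet(BaseModel):
    name: str
    # Optional field: the model may emit null
    nickname: Optional[str]
    # Union field: either an integer age or a free-form description
    age: Union[int, str]


model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.json(model, Pet)

pet = generator("Describe a pet in JSON.")
print(repr(pet))
# e.g. Pet(name='Milo', nickname=None, age=3)
```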
@@ -232,9 +229,9 @@ schema = '''{
}
}'''

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.json(model, schema)
sequence = generator("Give me a character description")
character = generator("Give me a character description")
```

### Using context-free grammars to guide generation
@@ -245,37 +242,28 @@ Formal grammars rule the world, and Outlines makes them rule LLMs too. You can p
import outlines

arithmetic_grammar = """
?start: sum
?start: expression
?sum: product
| sum "+" product -> add
| sum "-" product -> sub
?expression: term (("+" | "-") term)*
?product: atom
| product "*" atom -> mul
| product "/" atom -> div
?term: factor (("*" | "/") factor)*
?atom: NUMBER -> number
| "-" atom -> neg
| "(" sum ")"
?factor: NUMBER
| "-" factor
| "(" expression ")"
%import common.NUMBER
%import common.WS_INLINE
%ignore WS_INLINE
"""

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda")
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = outlines.generate.cfg(model, arithmetic_grammar)
sequence = generator("Write a formula that returns 5 using only additions and subtractions.")

# It looks like Mistral is not very good at arithmetics :)
sequence = generator("Alice had 4 apples and Bob ate 2. Write an expression for Alice's apples:")

print(sequence)
# 1+3-2-4+5-7+8-6+9-6+4-2+3+5-1+1
# (8-2)
```

This was a very simple grammar, and you can use `outlines.generate.cfg` to generate syntactically valid Python, SQL, and much more than this. Any kind of structured text, really. All you have to do is search for "X EBNF grammar" on the web, and take a look at the [Outlines Grammars repository](https://github.com/outlines-dev/grammars).
This was a very simple grammar, and you can use `outlines.generate.cfg` to generate syntactically valid Python, SQL, and much more than this. Any kind of structured text, really. All you have to do is search for "X EBNF grammar" on the web, and take a look at the [Outlines `grammars` module](https://github.com/outlines-dev/outlines/tree/main/outlines/grammars).
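If your Outlines version ships these built-in grammars, reusing one is a one-liner; a sketch assuming the module exposes an `arithmetic` grammar and keeping the same model as the example above:

``` python
import outlines
from outlines import grammars  # assumed to expose ready-made Lark grammars

model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")

# Reuse the bundled arithmetic grammar instead of writing one by hand
generator = outlines.generate.cfg(model, grammars.arithmetic)
print(generator("Write an expression that equals 10:"))
```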

### Open functions

Expand All @@ -288,9 +276,9 @@ import outlines
def add(a: int, b: int):
return a + b

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = outlines.generate.json(model, add)
result = generator("Return two integers named a and b respectively. a is odd and b even.")
result = generator("Return json with two integers named a and b respectively. a is odd and b even.")

print(add(**result))
# 3
@@ -329,7 +317,7 @@ def labelling(to_label, examples):
{{ to_label }} //
"""

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
prompt = labelling("Just awesome", examples)
answer = outlines.generate.text(model)(prompt, max_tokens=100)
```
2 changes: 1 addition & 1 deletion docs/api/samplers.md
@@ -1 +1 @@
::: outlines.generate.samplers
::: outlines.samplers
Binary file added docs/assets/images/dottxt.png
16 changes: 8 additions & 8 deletions docs/blog/posts/roadmap-2024.md
@@ -16,12 +16,12 @@ Outlines is not even one year old and it's already gone a long way! As we just r

Before delving into [the detailed roadmap](#detailed-roadmap), let me share a few thoughts and explain the general direction of the library. These thoughts are informed by my many interactions with users, either on [Twitter](https://twitter.com/remilouf) or in our [Discord server](https://discord.gg/ZxBxyWmW5n).

*Outlines currently differentiates itself* from other libraries with its efficient JSON- and regex-constrained generation. A user-facing interface for grammar-guided generation (it had been hidden in the repository) was also recently added. But there is much more we can do along these lines. In 2024 we will keep pushing in the direction of more accurate, faster constrained generation.
*Outlines currently differentiates itself* from other libraries with its efficient JSON- and regex-constrained generation. A user-facing interface for grammar-structured generation (it had been hidden in the repository) was also recently added. But there is much more we can do along these lines. In 2024 we will keep pushing in the direction of more accurate, faster constrained generation.

Outlines also supports many model providers: `transformers`, `mamba`, `llama.cpp` and `exllama2`. Those *integrations represent a lot of maintenance*, and we will need to simplify them. For instance, `transformers` now supports quantized models, and we will soon deprecate support for `autoawq` and `autogptq`.
Thanks to a refactor of the library, it is now possible to use our constrained generation method with all other libraries, except `mamba`, via a logits processor. We will look for libraries that provide state-space models and allow passing a logits processor during inference. We will interface with `llama.cpp` and `exllama2` using logits processors.

*We would like expand our work to the whole sampling layer*, and add new sampling methods that should make guided generation more accurate. This means we will keep the `transformers` integration as it is today and will expand our text generation logic around this library.
*We would like to expand our work to the whole sampling layer*, and add new sampling methods that should make structured generation more accurate. This means we will keep the `transformers` integration as it is today and will expand our text generation logic around this library.

Making workflows re-usable and easy to share is difficult today. That is why *we are big believers in [outlines functions](https://github.com/outlines-dev/functions)*. We will keep improving the interface and adding examples.

@@ -49,12 +49,12 @@ We want to keep the current integrations but lower the maintenance cost so we ca
* Integrate with llama.cpp via a logits processor;
* Integrate with exllamav2 via a logits processor;

### Push guided generation further
### Push structured generation further

We're just getting started!

* Improve the performance of existing guided generation algorithms;
* Improve the correctness of guided generation algorithms;
* Improve the performance of existing structured generation algorithms;
* Improve the correctness of structured generation algorithms;
* Add ready-to-use grammars in the [grammars](https://github.com/outlines-dev/grammars) repository or in a submodule in Outlines.

### Keep developing Outlines functions
@@ -64,12 +64,12 @@ Functions are awesome, use them!
* Implement a CLI `outlines serve` that allows serving Outlines functions locally;
* Add more functions to the [functions](https://github.com/outlines-dev/functions) repository.

### Serve guided generation
### Serve structured generation

We want to make it easier to serve guided generation and outlines functions.
We want to make it easier to serve structured generation and outlines functions.

* Implement the outlines serve CLI `outlines serve`
- Serve local APIs that perform guided generation;
- Serve local APIs that perform structured generation;
- Serve Outlines functions.

### Improve the generation layer
9 changes: 9 additions & 0 deletions docs/community/contribute.md
@@ -42,6 +42,15 @@ pip install -e .[test]
pre-commit install
```

#### Developing Serve Endpoint Via Docker

```bash
docker build -t outlines-serve .
docker run -p 8000:8000 outlines-serve --model="mistralai/Mistral-7B-Instruct-v0.2"
```

This builds the `outlines-serve` image and serves it on `localhost:8000` with the `Mistral-7B-Instruct-v0.2` model.
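Once the container is running, you can exercise the server; a hedged sketch assuming the vLLM-style `/generate` route described in the Outlines vLLM reference, with an illustrative prompt and JSON schema:

```bash
# Send a prompt plus a JSON schema constraining the completion
curl http://localhost:8000/generate \
    -d '{
        "prompt": "What is the capital of France?",
        "schema": {"type": "string"}
    }'
```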

### Before pushing your code

Run the tests:
