Commit

Fix details in the documentation

rlouf committed Aug 13, 2024
1 parent c2d92b7 commit 90e588c
Showing 13 changed files with 51 additions and 83 deletions.
4 changes: 2 additions & 2 deletions docs/cookbook/chain_of_thought.md
@@ -7,13 +7,13 @@ In this guide, we use [outlines](https://outlines-dev.github.io/outlines/) to ap

We use [llama.cpp](https://github.com/ggerganov/llama.cpp) via the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:

-```shell
+```bash
pip install llama-cpp-python
```

We pull a quantized GGUF model; in this guide we use [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF) by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):

-```shell
+```bash
wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
```

4 changes: 2 additions & 2 deletions docs/cookbook/dating_profiles.md
@@ -170,7 +170,7 @@ parsed_profile = DatingProfile.model_validate_json(profile)

Here are a couple of results:

-```
+```json
{
"bio": """I'm an ambitious lawyer with a casual and fashionable style. I love
games and sports, but my true passion is preparing refreshing cocktails at
@@ -199,7 +199,7 @@ Here are a couple of results:
}
```

-```
+```json
{
"bio": """I’m a sexy lawyer with time on my hands. I love to game and
play ping pong, but the real reason you should swipe to the right
2 changes: 2 additions & 0 deletions docs/cookbook/index.md
@@ -1,5 +1,7 @@
# Examples

+This part of the documentation provides a few cookbooks that you can browse to get acquainted with the library and get some inspiration about what you could do with structured generation. Remember that you can easily change the model that is being used!

- [Classification](classification.md): Classify customer requests.
- [Named Entity Extraction](extraction.md): Extract information from pizza orders.
- [Dating Profile](dating_profiles.md): Build dating profiles from descriptions using prompt templating and JSON-structured generation.
4 changes: 2 additions & 2 deletions docs/cookbook/knowledge_graph_extraction.md
@@ -4,13 +4,13 @@ In this guide, we use [outlines](https://outlines-dev.github.io/outlines/) to ex

We will use [llama.cpp](https://github.com/ggerganov/llama.cpp) via the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:

-```shell
+```bash
pip install llama-cpp-python
```

We pull a quantized GGUF model; in this guide we use [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF) by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):

-```shell
+```bash
wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
```

4 changes: 2 additions & 2 deletions docs/cookbook/qa-with-citations.md
@@ -4,13 +4,13 @@ This tutorial is adapted from the [instructor-ollama notebook](https://github.co

We will use [llama.cpp](https://github.com/ggerganov/llama.cpp) via the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:

-```shell
+```bash
pip install llama-cpp-python
```

We pull a quantized GGUF model, [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF), by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):

-```shell
+```bash
wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
```

7 changes: 3 additions & 4 deletions docs/cookbook/react_agent.md
@@ -8,13 +8,13 @@ Additionally, we give the LLM the possibility of using a scratchpad described in

We use [llama.cpp](https://github.com/ggerganov/llama.cpp) via the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:

-```shell
+```bash
pip install llama-cpp-python
```

We pull a quantized GGUF model; in this guide we use [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF) by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):

-```shell
+```bash
wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
```

@@ -55,9 +55,8 @@ def wikipedia(q):
"srsearch": q,
"format": "json"
}).json()["query"]["search"][0]["snippet"]
-```

-```python

def calculate(numexp):
return eval(numexp)
```
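The merged block above ends with the agent's `calculate` tool. As a standalone sanity check (independent of the model and the rest of the guide), the helper simply evaluates an arithmetic expression string:

```python
def calculate(numexp):
    # Evaluate an arithmetic expression given as a string.
    # Note: eval is fine for a toy agent tool, but unsafe on untrusted input.
    return eval(numexp)

print(calculate("(7 - 2) * 8"))  # 40
```

In the full agent, this tool would be invoked in response to the model's structured output rather than called directly.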
6 changes: 6 additions & 0 deletions docs/reference/generation/generation.md
@@ -208,3 +208,9 @@ result = generator("What is 2+2?")
print(result)
# 4
```


+[jsonschema]: https://json-schema.org/learn/getting-started-step-by-step
+[pydantic]: https://docs.pydantic.dev/latest
+[cfg]: https://en.wikipedia.org/wiki/Context-free_grammar
+[ebnf]: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
@@ -1,8 +1,4 @@
----
-title: Structured Generation Explanation
----

-# Structured Generation Explanation
+# How does Outlines work?


Language models generate text token by token, taking the previous token sequence as input and producing logits over the vocabulary from which the next token is sampled. This document explains the structured generation process, where only legal tokens are considered for the next step, based on a predefined automaton, e.g. a regex-defined [finite-state machine](https://en.wikipedia.org/wiki/Finite-state_machine) (FSM) or a [Lark](https://lark-parser.readthedocs.io/en/stable/) grammar.
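To make the token-masking idea concrete, here is a minimal, self-contained sketch. The toy character-level DFA and the greedy token choice are illustrative assumptions, not Outlines' actual implementation:

```python
# A hand-built DFA accepting "yes" or "no": structured generation constrains
# sampling so that only tokens with a legal transition from the current
# automaton state can be generated.
TRANSITIONS = {
    (0, "y"): 1, (1, "e"): 2, (2, "s"): 3,  # path spelling "yes"
    (0, "n"): 4, (4, "o"): 5,               # path spelling "no"
}
ACCEPTING = {3, 5}

def legal_next(state, vocab):
    """Return the subset of (single-character) tokens allowed from `state`."""
    return [t for t in vocab if (state, t) in TRANSITIONS]

vocab = list("abcdefghijklmnopqrstuvwxyz")
state, out = 0, ""
while state not in ACCEPTING:
    # A real sampler would pick among the legal tokens by model probability;
    # here we simply take the first legal token in vocabulary order.
    tok = legal_next(state, vocab)[0]
    out += tok
    state = TRANSITIONS[(state, tok)]
print(out)  # "no", since "n" precedes "y" in this vocabulary
```

In practice the automaton is compiled from a regex or grammar, the tokens are multi-character model tokens, and the mask is applied to the logits before sampling.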
52 changes: 6 additions & 46 deletions docs/reference/models/models.md
@@ -4,60 +4,20 @@ title: Models

# Models

-Outlines supports generation using a number of inference engines (`outlines.models`)
-
-Loading a model using outlines follows a similar interface between inference engines.
+Outlines supports generation using a number of inference engines (`outlines.models`). Loading a model using outlines follows a similar interface between inference engines:

```python
import outlines
```

## [Transformers](./transformers.md)

```python
model = outlines.models.transformers("microsoft/Phi-3-mini-128k-instruct", model_kwargs={})
```

For additional arguments and use of other Hugging Face Transformers model types, see [Outlines' Transformers documentation](./transformers.md).


## [Transformers Vision](./transformers_vision.md)

```python
-model = outlines.models.transformers("microsoft/Phi-3-mini-128k-instruct")
+model = outlines.models.transformers_vision("llava-hf/llava-v1.6-mistral-7b-hf")
```

For examples of generation and other details, see [Outlines' Transformers Vision documentation](./transformers_vision.md).

## [vLLM](./vllm.md)

```python
model = outlines.models.vllm("microsoft/Phi-3-mini-128k-instruct")
```

## [llama.cpp](./llamacpp.md)

```python
model = outlines.models.llamacpp("microsoft/Phi-3-mini-4k-instruct-gguf", "Phi-3-mini-4k-instruct-q4.gguf")
```

Additional llama.cpp parameters can be found in the [Outlines' llama.cpp documentation](./llamacpp.md).

## [ExLlamaV2](./exllamav2.md)

```python
-model = outlines.models.llamacpp(
-    "microsoft/Phi-3-mini-4k-instruct-gguf", "Phi-3-mini-4k-instruct-q4.gguf"
-)
+model = outlines.models.exllamav2("bartowski/Phi-3-mini-128k-instruct-exl2")
```

## [MLXLM](./mlxlm.md)

```python
model = outlines.models.mlxlm("mlx-community/Phi-3-mini-4k-instruct-4bit")
```

## [OpenAI](./openai.md)

```python
model = outlines.models.openai(
"gpt-4o-mini",
api_key=os.environ["OPENAI_API_KEY"]
@@ -66,7 +26,7 @@ model = outlines.models.openai(


# Feature Matrix
-| | Transformers | Transformers Vision | vLLM | llama.cpp | ExLlamaV2 | MLXLM | OpenAI* |
+| | [Transformers](transformers.md) | [Transformers Vision](transformers_vision.md) | [vLLM](vllm.md) | [llama.cpp](llamacpp.md) | [ExLlamaV2](exllamav2.md) | [MLXLM](mlxlm.md) | [OpenAI](openai.md)* |
|-------------------|--------------|---------------------|------|-----------|-----------|-------|---------|
| **Device** | | | | | | | |
| Cuda ||||||| N/A |
23 changes: 10 additions & 13 deletions docs/reference/models/transformers.md
@@ -33,14 +33,15 @@ model = models.Transformers(llm, tokenizer)
# Using Logits Processors

There are two ways to use Outlines Structured Generation with HuggingFace Transformers:
-- 1) Use Outlines generation wrapper, `outlines.models.transformers`
-- 2) Use `OutlinesLogitsProcessor` with `transformers.AutoModelForCausalLM`
+1. Use Outlines generation wrapper, `outlines.models.transformers`
+2. Use `OutlinesLogitsProcessor` with `transformers.AutoModelForCausalLM`

Outlines supports a myriad of logits processors for structured generation. In this example, we will use the `RegexLogitsProcessor`, which guarantees that generated text matches the specified pattern.

-## Example: `outlines.models.transformers`
+## Using `outlines.models.transformers`

-```
+```python
import outlines

time_regex_pattern = r"(0?[1-9]|1[0-2]):[0-5]\d\s?(am|pm)?"
@@ -53,9 +54,9 @@ print(output)
# 2:30 pm
```
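As a quick sanity check of the time pattern used in the example above, the same regex can be exercised with Python's standard `re` module, no outlines required:

```python
import re

time_regex_pattern = r"(0?[1-9]|1[0-2]):[0-5]\d\s?(am|pm)?"

# fullmatch mirrors what the logits processor enforces: the entire
# generated string must conform to the pattern.
print(bool(re.fullmatch(time_regex_pattern, "2:30 pm")))   # True
print(bool(re.fullmatch(time_regex_pattern, "13:30 pm")))  # False
```

This is a handy way to debug a pattern before handing it to structured generation.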

-## Example: Direct `transformers` library use
+## Using models initialized via the `transformers` library

-```
+```python
import outlines
import transformers

@@ -117,8 +118,9 @@ model = outlines.models.transformers(
)
```

-Further Reading:
-- https://huggingface.co/docs/transformers/en/model_doc/mamba
+Read [`transformers`'s documentation](https://huggingface.co/docs/transformers/en/model_doc/mamba) for more information.

### Encoder-Decoder Models

@@ -144,8 +146,3 @@ model_bart = models.transformers(
model_class=AutoModelForSeq2SeqLM,
)
```


-### Multi-Modal Models
-
-/Coming soon/
8 changes: 8 additions & 0 deletions docs/stylesheets/extra.css
@@ -96,6 +96,14 @@
background: #FFFFFF ! important
}

+.language-text {
+    background: #FFFFFF ! important
+}
+
+.language-json {
+    background: #FFFFFF ! important
+}

h1.title {
color: #FFFFFF;
margin: 0px 0px 5px;
6 changes: 3 additions & 3 deletions docs/welcome.md
@@ -6,7 +6,7 @@ Outlines〰 is a Python library that allows you to use Large Language Model in a

## What models do you support?

-We support [Openai](reference/models/openai.md), but the true power of Outlines〰 is unleashed with Open Source models available via the [transformers](reference/models/transformers.md), [llama.cpp](reference/models/llamacpp.md), [exllama2](reference/models/exllamav2.md) and [mamba_ssm](reference/models/mamba.md) libraries. If you want to build and maintain an integration with another library, [get in touch][discord].
+We support [Openai](reference/models/openai.md), but the true power of Outlines〰 is unleashed with Open Source models available via the [transformers](reference/models/transformers.md), [llama.cpp](reference/models/llamacpp.md), [exllama2](reference/models/exllamav2.md), [mlx-lm](reference/models/mlxlm.md) and [vllm](reference/models/vllm.md) models. If you want to build and maintain an integration with another library, [get in touch][discord].

## What are the main features?

@@ -17,7 +17,7 @@ We support [Openai](reference/models/openai.md), but the true power of Outlines

No more invalid JSON outputs, 100% guaranteed

-[:octicons-arrow-right-24: Generate JSON](reference/json.md)
+[:octicons-arrow-right-24: Generate JSON](reference/generation/json.md)

- :material-keyboard-outline:{ .lg .middle } __JSON mode for vLLM__

@@ -34,7 +34,7 @@ We support [Openai](reference/models/openai.md), but the true power of Outlines

Generate text that parses correctly 100% of the time

-[:octicons-arrow-right-24: Guide LLMs](reference/regex.md)
+[:octicons-arrow-right-24: Guide LLMs](reference/generation/regex.md)

- :material-chat-processing-outline:{ .lg .middle } __Powerful Prompt Templating__

8 changes: 4 additions & 4 deletions mkdocs.yml
@@ -121,24 +121,24 @@ nav:
- Docs:
- reference/index.md
- Generation:
-- Generation Overview: reference/generation/generation.md
+- Overview: reference/generation/generation.md
- Text: reference/text.md
- Samplers: reference/samplers.md
- Structured generation:
+- How does it work?: reference/generation/structured_generation_explanation.md
- Classification: reference/generation/choices.md
- Regex: reference/generation/regex.md
- Type constraints: reference/generation/format.md
- JSON (function calling): reference/generation/json.md
- Grammar: reference/generation/cfg.md
- Custom FSM operations: reference/generation/custom_fsm_ops.md
-- Structured Generation Technical Explanation: reference/generation/structured_generation_explanation.md
- Utilities:
- Serve with vLLM: reference/serve/vllm.md
-- Custom types: reference/types.md
+- Custom types: reference/generation/types.md
- Prompt templating: reference/prompting.md
- Outlines functions: reference/functions.md
- Models:
-- Models Overview: reference/models/models.md
+- Overview: reference/models/models.md
- Open source:
- Transformers: reference/models/transformers.md
- Transformers Vision: reference/models/transformers_vision.md
