From 32fac21afa30c8e19e3d4f938ae930d1184910bb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?R=C3=A9mi=20Louf?= <remilouf@gmail.com>
Date: Mon, 10 Jun 2024 16:36:23 +0200
Subject: [PATCH] Update the doc for OpenAI models

---
 docs/reference/models/openai.md | 147 +++++++++++++++++++++++++-------
 mkdocs.yml                      |  14 +--
 2 files changed, 126 insertions(+), 35 deletions(-)

diff --git a/docs/reference/models/openai.md b/docs/reference/models/openai.md
index 07357a360..3e2f717e5 100644
--- a/docs/reference/models/openai.md
+++ b/docs/reference/models/openai.md
@@ -1,69 +1,137 @@
-# Generate text with the OpenAI and compatible APIs
+# OpenAI and compatible APIs
 
 !!! Installation
 
     You need to install the `openai` and `tiktoken` libraries to be able to use the OpenAI API in Outlines.
 
-Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. The following models can be used with Outlines:
+## OpenAI models
+
+Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. You can initialize the model by passing the model name to `outlines.models.openai`:
 
 ```python
 from outlines import models
 
+
 model = models.openai("gpt-3.5-turbo")
-model = models.openai("gpt-4")
+model = models.openai("gpt-4-turbo")
+model = models.openai("gpt-4o")
+```
+
+Check the [OpenAI documentation](https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4) for an up-to-date list of available models. You can pass any parameter you would pass to `openai.AsyncOpenAI` as keyword arguments:
+
+```python
+import os
+from outlines import models
+
 
-print(type(model))
-# OpenAI
+model = models.openai(
+    "gpt-3.5-turbo",
+    api_key=os.environ["OPENAI_API_KEY"]
+)
 ```
 
-Outlines also supports Azure OpenAI models:
+The following table enumerates the possible parameters. Refer to the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/_client.py) for an up-to-date list.
+
+**Parameters:**
+
+| **Parameters** | **Type** | **Description** | **Default** |
+|----------------|:---------|:----------------|:------------|
+| `api_key` | `str` | OpenAI API key. Inferred from `OPENAI_API_KEY` if not specified. | `None` |
+| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified. | `None` |
+| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified. | `None` |
+| `base_url` | `str | httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified. | `None` |
+| `timeout` | `float` | Request timeout.| `NOT_GIVEN` |
+| `max_retries` | `int` | Maximum number of retries for failing requests | `2` |
+| `default_headers` | `Mapping[str, str]` | Default HTTP headers | `None` |
+| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries | `None` |
+| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client | `None` |
 
+## Azure OpenAI models
+
+Outlines also supports Azure OpenAI models:
 
 ```python
 from outlines import models
 
+
 model = models.azure_openai(
+    "azure-deployment-name",
+    "gpt-3.5-turbo",
     api_version="2023-07-01-preview",
     azure_endpoint="https://example-endpoint.openai.azure.com",
 )
 ```
 
-More generally, you can use any API client compatible with the OpenAI interface by passing an instance of the client, a configuration, and optionally the corresponding tokenizer (if you want to be able to use `outlines.generate.choice`):
+!!! Question "Why do I need to specify model and deployment name?"
 
-```python
-from openai import AsyncOpenAI
-import tiktoken
+    The model name is needed to load the correct tokenizer for the model. The tokenizer is necessary for structured generation.
 
-from outlines.models.openai import OpenAI, OpenAIConfig
 
-config = OpenAIConfig(model="gpt-4")
-client = AsyncOpenAI()
-tokenizer = tiktoken.encoding_for_model("gpt-4")
+You can pass any parameter you would pass to `openai.AsyncAzureOpenAI`. You can consult the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/lib/azure.py) for an up-to-date list.
 
-model = OpenAI(client, config, tokenizer)
-```
+**Parameters:**
 
 
-## Monitoring API use
+| **Parameters** | **Type** | **Description** | **Default** |
+|----------------|:---------|:----------------|:------------|
+| `azure_endpoint` | `str` | Azure endpoint, including the resource. Inferred from `AZURE_OPENAI_ENDPOINT` if not specified. | `None` |
+| `api_version` | `str` | API version. Inferred from `OPENAI_API_VERSION` if not specified. | `None` |
+| `api_key` | `str` | Azure OpenAI API key. Inferred from `AZURE_OPENAI_API_KEY` if not specified. | `None` |
+| `azure_ad_token` | `str` | Azure Active Directory token. Inferred from `AZURE_OPENAI_AD_TOKEN` if not specified. | `None` |
+| `azure_ad_token_provider` | `AzureADTokenProvider` | A function that returns an Azure Active Directory token | `None` |
+| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified. | `None` |
+| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified. | `None` |
+| `base_url` | `str | httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified. | `None` |
+| `timeout` | `float` | Request timeout.| `NOT_GIVEN` |
+| `max_retries` | `int` | Maximum number of retries for failing requests | `2` |
+| `default_headers` | `Mapping[str, str]` | Default HTTP headers | `None` |
+| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries | `None` |
+| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client | `None` |
 
-It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:
+## Models that follow the OpenAI standard
 
-```python
-import outlines.models
+Outlines supports models that follow the OpenAI standard. Initialize a properly configured OpenAI client and pass it to `outlines.models.openai`:
 
-model = models.openai("gpt-4")
+```python
+import os
+from openai import AsyncOpenAI
+from outlines import models
+from outlines.models.openai import OpenAIConfig
 
-print(model.prompt_tokens)
-# 0
 
-print(model.completion_tokens)
-# 0
+client = AsyncOpenAI(
+    api_key=os.environ.get("PROVIDER_KEY"),
+    base_url="http://other.provider.server.com"
+)
+config = OpenAIConfig("model_name")
+model = models.openai(client, config)
 ```
 
-These numbers are updated every time you call the model.
+!!! Warning
+
+    You need to pass the async client to be able to do batch inference.
+
+## Advanced configuration
+
+For more advanced configuration options, such as proxy support, please consult the [OpenAI SDK's documentation](https://github.com/openai/openai-python):
+
+```python
+import httpx
+from openai import AsyncOpenAI, DefaultAsyncHttpxClient
+from outlines import models
+from outlines.models.openai import OpenAIConfig
 
 
-## Advanced usage
+client = AsyncOpenAI(
+    base_url="http://my.test.server.example.com:8083",
+    http_client=DefaultAsyncHttpxClient(
+        proxies="http://my.test.proxy.example.com",
+        transport=httpx.AsyncHTTPTransport(local_address="0.0.0.0"),
+    ),
+)
+config = OpenAIConfig("model_name")
+model = models.openai(client, config)
+```
 
 It is possible to specify the values for `seed`, `presence_penalty`, `frequence_penalty`, `top_p` by passing an instance of `OpenAIConfig` when initializing the model:
 
@@ -71,11 +139,32 @@ It is possible to specify the values for `seed`, `presence_penalty`, `frequence_
 from outlines.models.openai import OpenAIConfig
 from outlines import models
 
+
 config = OpenAIConfig(
     presence_penalty=1.,
-    frequence_penalty=1.,
+    frequency_penalty=1.,
     top_p=.95,
     seed=0,
 )
-model = models.openai("gpt-4", config=config)
+model = models.openai("gpt-3.5-turbo", config)
+```
+
+## Monitoring API use
+
+It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:
+
+```python
+from openai import AsyncOpenAI
+from outlines import models
+
+
+model = models.openai("gpt-4")
+
+print(model.prompt_tokens)
+# 0
+
+print(model.completion_tokens)
+# 0
 ```
+
+These numbers are updated every time you call the model.
diff --git a/mkdocs.yml b/mkdocs.yml
index 01e8506ab..ea90b5512 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -121,12 +121,14 @@ nav:
         - Prompt templating: reference/prompting.md
         - Outlines functions: reference/functions.md
     - Models:
-        - vLLM: reference/models/vllm.md
-        - Llama.cpp: reference/models/llamacpp.md
-        - Transformers: reference/models/transformers.md
-        - ExllamaV2: reference/models/exllamav2.md
-        - Mamba: reference/models/mamba.md
-        - OpenAI: reference/models/openai.md
+        - Open source:
+          - Transformers: reference/models/transformers.md
+          - Llama.cpp: reference/models/llamacpp.md
+          - vLLM: reference/models/vllm.md
+          - ExllamaV2: reference/models/exllamav2.md
+          - Mamba: reference/models/mamba.md
+        - API:
+            - OpenAI: reference/models/openai.md
 
   - API Reference:
     - api/index.md