Update the documentation for OpenAI models #951

Merged 2 commits on Jun 14, 2024

Changes from all commits
4 changes: 1 addition & 3 deletions docs/reference/models/mlxlm.md
@@ -16,7 +16,7 @@ model = models.mlxlm("mlx-community/mlx-community/Meta-Llama-3-8B-Instruct-8bit"

With the loaded model, you can generate text or perform structured generation, e.g.

```python
from outlines import models, generate

model = models.mlxlm("mlx-community/Meta-Llama-3-8B-Instruct-8bit")
@@ -28,5 +28,3 @@ model_output = generator("What's Jennys Number?\n")
print(model_output)
# '8675309'
```

147 changes: 118 additions & 29 deletions docs/reference/models/openai.md
@@ -1,81 +1,170 @@
# OpenAI and compatible APIs

!!! Installation

    You need to install the `openai` and `tiktoken` libraries to be able to use the OpenAI API in Outlines.
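
    For instance:

    ```shell
    pip install openai tiktoken
    ```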

## OpenAI models

Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. You can initialize the model by passing the model name to `outlines.models.openai`:

```python
from outlines import models


model = models.openai("gpt-3.5-turbo")
model = models.openai("gpt-4")
model = models.openai("gpt-4-turbo")
model = models.openai("gpt-4o")
```
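
Once initialized, these models support both plain text generation and structured generation. For example, a minimal sketch using `outlines.generate.choice` (the prompt and labels are illustrative):

```python
from outlines import models, generate


model = models.openai("gpt-4o")

# Constrain the model's answer to one of two labels (illustrative).
generator = generate.choice(model, ["Positive", "Negative"])
sentiment = generator("Is this review positive or negative?\n\n'The movie was great!'")
print(sentiment)
# 'Positive'
```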

Check the [OpenAI documentation](https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4) for an up-to-date list of available models. You can pass any parameter you would pass to `openai.AsyncOpenAI` as keyword arguments:

```python
import os
from outlines import models


model = models.openai(
    "gpt-3.5-turbo",
    api_key=os.environ["OPENAI_API_KEY"]
)
```

The following table enumerates the possible parameters. Refer to the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/_client.py) for an up-to-date list.

**Parameters:**

| **Parameter** | **Type** | **Description** | **Default** |
|---------------|:---------|:----------------|:------------|
| `api_key` | `str` | OpenAI API key. Inferred from `OPENAI_API_KEY` if not specified | `None` |
| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified | `None` |
| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified | `None` |
| `base_url` | `str` or `httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified | `None` |
| `timeout` | `float` | Request timeout | `NOT_GIVEN` |
| `max_retries` | `int` | Maximum number of retries for failing requests | `2` |
| `default_headers` | `Mapping[str, str]` | Default HTTP headers | `None` |
| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries | `None` |
| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client | `None` |
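
For example, a minimal sketch that sets a tighter timeout and retry budget (the values are illustrative; the keyword arguments are forwarded to `openai.AsyncOpenAI`):

```python
import os
from outlines import models


# Illustrative values: 30-second request timeout, up to 5 retries.
model = models.openai(
    "gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
    timeout=30.0,
    max_retries=5,
)
```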

## Azure OpenAI models

Outlines also supports Azure OpenAI models:

```python
from outlines import models


model = models.azure_openai(
    "azure-deployment-name",
    "gpt-3.5-turbo",
    api_version="2023-07-01-preview",
    azure_endpoint="https://example-endpoint.openai.azure.com",
)
```

!!! Question "Why do I need to specify model and deployment name?"

    The model name is needed to load the correct tokenizer for the model. The tokenizer is necessary for structured generation.


You can pass any parameter you would pass to `openai.AsyncAzureOpenAI`. You can consult the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/lib/azure.py) for an up-to-date list.
**Parameters:**


| **Parameter** | **Type** | **Description** | **Default** |
|---------------|:---------|:----------------|:------------|
| `azure_endpoint` | `str` | Azure endpoint, including the resource. Inferred from `AZURE_OPENAI_ENDPOINT` if not specified | `None` |
| `api_version` | `str` | API version. Inferred from `OPENAI_API_VERSION` if not specified | `None` |
| `api_key` | `str` | OpenAI API key. Inferred from `OPENAI_API_KEY` if not specified | `None` |
| `azure_ad_token` | `str` | Azure Active Directory token. Inferred from `AZURE_OPENAI_AD_TOKEN` if not specified | `None` |
| `azure_ad_token_provider` | `AzureADTokenProvider` | A function that returns an Azure Active Directory token | `None` |
| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified | `None` |
| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified | `None` |
| `base_url` | `str` or `httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified | `None` |
| `timeout` | `float` | Request timeout | `NOT_GIVEN` |
| `max_retries` | `int` | Maximum number of retries for failing requests | `2` |
| `default_headers` | `Mapping[str, str]` | Default HTTP headers | `None` |
| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries | `None` |
| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client | `None` |
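
For instance, a sketch that passes the endpoint, version, and key explicitly instead of relying on the environment variables above (all values are placeholders):

```python
import os
from outlines import models


# Placeholders: substitute your own deployment, endpoint, and key.
model = models.azure_openai(
    "azure-deployment-name",
    "gpt-4",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version="2023-07-01-preview",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
```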

## Models that follow the OpenAI standard

Outlines supports models that follow the OpenAI standard. You will need to initialize the OpenAI client with the proper configuration and pass it to `outlines.models.openai`:

```python
import os
from openai import AsyncOpenAI
from outlines import models
from outlines.models.openai import OpenAIConfig


client = AsyncOpenAI(
    api_key=os.environ.get("PROVIDER_KEY"),
    base_url="http://other.provider.server.com"
)
config = OpenAIConfig("model_name")
model = models.openai(client, config)
```

!!! Warning

    You need to pass the async client to be able to do batch inference.

## Advanced configuration

For more advanced configuration options, such as proxy support, please consult the [OpenAI SDK's documentation](https://github.com/openai/openai-python):


```python
import httpx
from openai import AsyncOpenAI, DefaultAsyncHttpxClient
from outlines import models
from outlines.models.openai import OpenAIConfig


# Use the async HTTP client and transport variants, since Outlines
# requires the async OpenAI client.
client = AsyncOpenAI(
    base_url="http://my.test.server.example.com:8083",
    http_client=DefaultAsyncHttpxClient(
        proxies="http://my.test.proxy.example.com",
        transport=httpx.AsyncHTTPTransport(local_address="0.0.0.0"),
    ),
)
config = OpenAIConfig("model_name")
model = models.openai(client, config)
```

It is possible to specify the values for `seed`, `presence_penalty`, `frequency_penalty`, `top_p` by passing an instance of `OpenAIConfig` when initializing the model:

```python
from outlines import models
from outlines.models.openai import OpenAIConfig


config = OpenAIConfig(
    presence_penalty=1.,
    frequency_penalty=1.,
    top_p=.95,
    seed=0,
)
model = models.openai("gpt-3.5-turbo", config)
```

## Monitoring API use

It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:

```python
from outlines import models


model = models.openai("gpt-4")

print(model.prompt_tokens)
# 0

print(model.completion_tokens)
# 0
```

These numbers are updated every time you call the model.
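
For example, a quick sketch (the actual counts depend on the prompt and the completion):

```python
from outlines import models, generate


model = models.openai("gpt-4")
generator = generate.text(model)
generator("Hello, how are you?")

# The counters now reflect the tokens consumed by the call above.
print(model.prompt_tokens, model.completion_tokens)
```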
19 changes: 10 additions & 9 deletions mkdocs.yml
@@ -122,15 +122,16 @@ nav:
      - Prompt templating: reference/prompting.md
      - Outlines functions: reference/functions.md
      - Models:
          - Open source:
              - Transformers: reference/models/transformers.md
              - Llama.cpp: reference/models/llamacpp.md
              - vLLM: reference/models/vllm.md
              - TGI: reference/models/tgi.md
              - ExllamaV2: reference/models/exllamav2.md
              - MLX: reference/models/mlxlm.md
              - Mamba: reference/models/mamba.md
          - API:
              - OpenAI: reference/models/openai.md
      - API Reference:
          - api/index.md
          - api/models.md
20 changes: 11 additions & 9 deletions outlines/models/openai.py
@@ -414,10 +414,16 @@ def call(*args, **kwargs):
    return call


@functools.singledispatch
def openai(model_or_client, *args, **kwargs):
    return OpenAI(model_or_client, *args, **kwargs)


@openai.register(str)
def openai_model(
    model_name: str,
    config: Optional[OpenAIConfig] = None,
    **openai_client_params,
):
    try:
        import tiktoken

@@ -432,7 +438,7 @@ def openai

    else:
        config = OpenAIConfig(model=model_name)

    client = AsyncOpenAI(**openai_client_params)
    tokenizer = tiktoken.encoding_for_model(model_name)

    return OpenAI(client, config, tokenizer)
@@ -441,10 +447,8 @@
def azure_openai(
    deployment_name: str,
    model_name: Optional[str] = None,
    config: Optional[OpenAIConfig] = None,
    **azure_openai_client_params,
):
    try:
        import tiktoken
@@ -459,9 +463,7 @@
    if config is None:
        config = OpenAIConfig(model=deployment_name)

    client = AsyncAzureOpenAI(**azure_openai_client_params)
    tokenizer = tiktoken.encoding_for_model(model_name or deployment_name)

    return OpenAI(client, config, tokenizer)
Loading