(stable nov 21st release) #6863

Open

wants to merge 16 commits into base: main
181 changes: 90 additions & 91 deletions docs/my-website/docs/providers/vertex.md
@@ -572,6 +572,96 @@ Here's how to use Vertex AI with the LiteLLM Proxy Server

</Tabs>


## Authentication - vertex_project, vertex_location, etc.

Set your Vertex AI credentials via:
- dynamic params, OR
- environment variables


### **Dynamic Params**

You can set:
- `vertex_credentials` (str) - a JSON string or the filepath of your Vertex AI service_account.json
- `vertex_location` (str) - the region where your Vertex model is deployed (us-central1, asia-southeast1, etc.)
- `vertex_project` (Optional[str]) - use if the Vertex project is different from the one in `vertex_credentials`

as dynamic params for a `litellm.completion` call.

<Tabs>
<TabItem value="sdk" label="SDK">

```python
from litellm import completion
import json

## GET CREDENTIALS
file_path = 'path/to/vertex_ai_service_account.json'

# Load the JSON file
with open(file_path, 'r') as file:
    vertex_credentials = json.load(file)

# Convert to JSON string
vertex_credentials_json = json.dumps(vertex_credentials)


response = completion(
    model="vertex_ai/gemini-pro",
    messages=[{"content": "You are a good bot.", "role": "system"}, {"content": "Hello, how are you?", "role": "user"}],
    vertex_credentials=vertex_credentials_json,
    vertex_project="my-special-project",
    vertex_location="my-special-location"
)
```

</TabItem>
<TabItem value="proxy" label="PROXY">

```yaml
model_list:
  - model_name: gemini-1.5-pro
    litellm_params:
      model: gemini-1.5-pro
      vertex_credentials: os.environ/VERTEX_FILE_PATH_ENV_VAR # os.environ["VERTEX_FILE_PATH_ENV_VAR"] = "/path/to/service_account.json"
      vertex_project: "my-special-project"
      vertex_location: "my-special-location"
```

</TabItem>
</Tabs>
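
Once the proxy is running with this config, the model can be called through any OpenAI-compatible client. A minimal sketch, assuming the proxy is listening on `http://0.0.0.0:4000` and `sk-1234` is a valid virtual key (both are placeholders):

```python
import openai

# Point the OpenAI client at the LiteLLM proxy (URL and key are placeholders)
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gemini-1.5-pro",  # must match a model_name from the proxy config above
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```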




### **Environment Variables**

You can set:
- `GOOGLE_APPLICATION_CREDENTIALS` - the filepath of your service_account.json (used by the Vertex SDK directly).
- `VERTEXAI_LOCATION` - the region where your Vertex model is deployed (us-central1, asia-southeast1, etc.)
- `VERTEXAI_PROJECT` - (Optional[str]) use if the Vertex project is different from the one in `vertex_credentials`

1. GOOGLE_APPLICATION_CREDENTIALS

```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service_account.json"
```

2. VERTEXAI_LOCATION

```bash
export VERTEXAI_LOCATION="us-central1" # can be any vertex location
```

3. VERTEXAI_PROJECT

```bash
export VERTEXAI_PROJECT="my-test-project" # ONLY use if model project is different from service account project
```
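
With these variables exported, a `completion` call needs no explicit vertex params. A minimal sketch, assuming the environment variables above are set:

```python
from litellm import completion

# Credentials, location, and project are read from the environment
# (GOOGLE_APPLICATION_CREDENTIALS, VERTEXAI_LOCATION, VERTEXAI_PROJECT)
response = completion(
    model="vertex_ai/gemini-pro",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response)
```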


## Specifying Safety Settings
In certain use-cases you may need to make calls to the models and pass [safety settings](https://ai.google.dev/docs/safety_setting_gemini) different from the defaults. To do so, simply pass the `safety_settings` argument to `completion` or `acompletion`. For example:
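
A minimal sketch, assuming the Gemini-style category/threshold names shown below:

```python
from litellm import completion

# category/threshold strings follow Gemini's safety-setting names (assumed here)
response = completion(
    model="vertex_ai/gemini-pro",
    messages=[{"role": "user", "content": "Write a short poem about the sea."}],
    safety_settings=[
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_ONLY_HIGH"},
    ],
)
```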

@@ -2303,97 +2393,6 @@ print("response from proxy", response)
</TabItem>
</Tabs>

## Extra

### Using `GOOGLE_APPLICATION_CREDENTIALS`
2 changes: 1 addition & 1 deletion litellm/llms/anthropic/chat/transformation.py
@@ -374,7 +374,7 @@ def _create_json_tool_call_for_response_format(
_input_schema["additionalProperties"] = True
_input_schema["properties"] = {}
else:
_input_schema["properties"] = json_schema
_input_schema["properties"] = {"values": json_schema}

_tool = AnthropicMessagesTool(name="json_tool_call", input_schema=_input_schema)
return _tool
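
For context on this one-line change, an illustrative sketch (not the library's exact construction) of how the `properties` assignment differs, assuming a simple user-supplied JSON schema:

```python
# Hypothetical schema passed via response_format
json_schema = {"type": "object", "properties": {"name": {"type": "string"}}}

# Before: the schema itself was used as the properties mapping, so its own keys
# ("type", "properties") were treated as property names.
properties_before = json_schema

# After: the schema is nested under a single "values" property, which keeps
# input_schema["properties"] a valid name -> schema mapping.
properties_after = {"values": json_schema}
```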
3 changes: 3 additions & 0 deletions litellm/llms/databricks/chat.py
@@ -470,6 +470,9 @@ def completion(
optional_params[k] = v

stream: bool = optional_params.get("stream", None) or False
optional_params.pop(
"max_retries", None
) # [TODO] add max retry support at llm api call level
optional_params["stream"] = stream

data = {
2 changes: 2 additions & 0 deletions litellm/main.py
@@ -4729,6 +4729,7 @@ def transcription(
response_format: Optional[
Literal["json", "text", "srt", "verbose_json", "vtt"]
] = None,
timestamp_granularities: Optional[List[Literal["word", "segment"]]] = None,
temperature: Optional[int] = None, # openai defaults this to 0
## LITELLM PARAMS ##
user: Optional[str] = None,
@@ -4778,6 +4779,7 @@
language=language,
prompt=prompt,
response_format=response_format,
timestamp_granularities=timestamp_granularities,
temperature=temperature,
custom_llm_provider=custom_llm_provider,
drop_params=drop_params,
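A minimal usage sketch of the new `timestamp_granularities` parameter, assuming an OpenAI Whisper model and a local audio file (on OpenAI's API, word/segment timestamps require `response_format="verbose_json"`):

```python
import litellm

# "sample.wav" is a placeholder path; timestamp_granularities is the parameter added in this PR
with open("sample.wav", "rb") as audio_file:
    response = litellm.transcription(
        model="whisper-1",
        file=audio_file,
        response_format="verbose_json",
        timestamp_granularities=["word"],
    )
print(response)
```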
34 changes: 12 additions & 22 deletions litellm/model_prices_and_context_window_backup.json
@@ -1884,7 +1884,8 @@
"supports_vision": true,
"tool_use_system_prompt_tokens": 264,
"supports_assistant_prefill": true,
"supports_prompt_caching": true
"supports_prompt_caching": true,
"supports_response_schema": true
},
"claude-3-5-haiku-20241022": {
"max_tokens": 8192,
@@ -1900,7 +1901,8 @@
"tool_use_system_prompt_tokens": 264,
"supports_assistant_prefill": true,
"supports_prompt_caching": true,
"supports_pdf_input": true
"supports_pdf_input": true,
"supports_response_schema": true
},
"claude-3-opus-20240229": {
"max_tokens": 4096,
@@ -1916,7 +1918,8 @@
"supports_vision": true,
"tool_use_system_prompt_tokens": 395,
"supports_assistant_prefill": true,
"supports_prompt_caching": true
"supports_prompt_caching": true,
"supports_response_schema": true
},
"claude-3-sonnet-20240229": {
"max_tokens": 4096,
@@ -1930,7 +1933,8 @@
"supports_vision": true,
"tool_use_system_prompt_tokens": 159,
"supports_assistant_prefill": true,
"supports_prompt_caching": true
"supports_prompt_caching": true,
"supports_response_schema": true
},
"claude-3-5-sonnet-20240620": {
"max_tokens": 8192,
@@ -1946,7 +1950,8 @@
"supports_vision": true,
"tool_use_system_prompt_tokens": 159,
"supports_assistant_prefill": true,
"supports_prompt_caching": true
"supports_prompt_caching": true,
"supports_response_schema": true
},
"claude-3-5-sonnet-20241022": {
"max_tokens": 8192,
@@ -1962,7 +1967,8 @@
"supports_vision": true,
"tool_use_system_prompt_tokens": 159,
"supports_assistant_prefill": true,
"supports_prompt_caching": true
"supports_prompt_caching": true,
"supports_response_schema": true
},
"text-bison": {
"max_tokens": 2048,
@@ -3852,22 +3858,6 @@
"supports_function_calling": true,
"tool_use_system_prompt_tokens": 264
},
"anthropic/claude-3-5-sonnet-20241022": {
"max_tokens": 8192,
"max_input_tokens": 200000,
"max_output_tokens": 8192,
"input_cost_per_token": 0.000003,
"output_cost_per_token": 0.000015,
"cache_creation_input_token_cost": 0.00000375,
"cache_read_input_token_cost": 0.0000003,
"litellm_provider": "anthropic",
"mode": "chat",
"supports_function_calling": true,
"supports_vision": true,
"tool_use_system_prompt_tokens": 159,
"supports_assistant_prefill": true,
"supports_prompt_caching": true
},
"openrouter/anthropic/claude-3.5-sonnet": {
"max_tokens": 8192,
"max_input_tokens": 200000,
1 change: 1 addition & 0 deletions litellm/utils.py
@@ -2125,6 +2125,7 @@ def get_optional_params_transcription(
prompt: Optional[str] = None,
response_format: Optional[str] = None,
temperature: Optional[int] = None,
timestamp_granularities: Optional[List[Literal["word", "segment"]]] = None,
custom_llm_provider: Optional[str] = None,
drop_params: Optional[bool] = None,
**kwargs,