# Model Configuration
ROSA supports both the OpenAI API and Azure OpenAI for its language model. Users can configure and pass either a `ChatOpenAI` or `AzureChatOpenAI` instance to the ROSA class.
> [!NOTE]
> We're also experimenting with the use of local models like `ChatOllama` (see below).
Here's an overview of how to set up and use these LLMs:
To use the standard OpenAI API with the `ChatOpenAI` model:

1. Ensure you have your OpenAI API key.

2. Set up your environment variable. Add the following to your `.env` file or set it in your system environment:

   ```
   OPENAI_API_KEY=your_openai_api_key
   ```

3. Create a `ChatOpenAI` instance:

   ```python
   import os

   from dotenv import load_dotenv
   from langchain_openai import ChatOpenAI

   load_dotenv()  # This loads the variables from the .env file

   openai_llm = ChatOpenAI(
       model_name="gpt-4",  # or your preferred model
       temperature=0,
       max_tokens=None,
       timeout=None,
       max_retries=2,
       openai_api_key=os.getenv("OPENAI_API_KEY"),  # Using environment variable
   )

   # Pass the LLM to ROSA
   rosa_instance = ROSA(ros_version=2, llm=openai_llm, ...)
   ```
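Once the instance is created, a quick smoke test (a minimal sketch, independent of ROSA) can confirm that your key and model are working. `invoke` is the standard LangChain chat-model call and returns a message whose text lives in `.content`:

```python
# Minimal smoke test (illustrative): send one prompt and print the reply.
reply = openai_llm.invoke("Reply with the single word: ready")
print(reply.content)
```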
> [!IMPORTANT]
> Ensure that you have the necessary environment variables set in your `.env` file or system environment. Always handle your API keys and secrets securely.
To use Azure OpenAI, you'll need to create an `AzureChatOpenAI` instance with the appropriate configuration. There are two ways to set this up:
1. **Using Azure API Management (APIM) with Tenant ID, Client ID, and Client Secret**

   Required environment variables:

   - `APIM_SUBSCRIPTION_KEY` (if required by your APIM setup)
   - `AZURE_TENANT_ID`
   - `AZURE_CLIENT_ID`
   - `AZURE_CLIENT_SECRET`
   - `DEPLOYMENT_ID`
   - `API_VERSION`
   - `API_ENDPOINT`

   Add these to your `.env` file or set them in your system environment (a sample `.env` sketch follows this list).

   ```python
   import os

   from dotenv import load_dotenv
   from langchain_openai import AzureChatOpenAI
   from azure.identity import ClientSecretCredential, get_bearer_token_provider

   load_dotenv()

   # Set up Azure authentication
   credential = ClientSecretCredential(
       tenant_id=os.getenv("AZURE_TENANT_ID"),
       client_id=os.getenv("AZURE_CLIENT_ID"),
       client_secret=os.getenv("AZURE_CLIENT_SECRET"),
       authority="https://login.microsoftonline.com",
   )
   token_provider = get_bearer_token_provider(
       credential, "https://cognitiveservices.azure.com/.default"
   )

   # Create AzureChatOpenAI instance
   azure_llm = AzureChatOpenAI(
       azure_deployment=os.getenv("DEPLOYMENT_ID"),
       azure_ad_token_provider=token_provider,
       openai_api_type="azure_ad",
       api_version=os.getenv("API_VERSION"),
       azure_endpoint=os.getenv("API_ENDPOINT"),
       default_headers=(
           {"Ocp-Apim-Subscription-Key": os.getenv("APIM_SUBSCRIPTION_KEY")}
           if os.getenv("APIM_SUBSCRIPTION_KEY")
           else {}
       ),
   )

   # Pass the LLM to ROSA
   rosa = ROSA(ros_version=2, llm=azure_llm, ...)
   ```
2. **Using an API Key**

   Required environment variables:

   - `AZURE_OPENAI_API_KEY`
   - `DEPLOYMENT_ID`
   - `API_ENDPOINT`

   Add these to your `.env` file or set them in your system environment.

   ```python
   import os

   from dotenv import load_dotenv
   from langchain_openai import AzureChatOpenAI

   load_dotenv()

   # Create AzureChatOpenAI instance
   azure_llm = AzureChatOpenAI(
       azure_deployment=os.getenv("DEPLOYMENT_ID"),
       openai_api_key=os.getenv("AZURE_OPENAI_API_KEY"),
       azure_endpoint=os.getenv("API_ENDPOINT"),
       # Note: depending on your langchain-openai version, you may also need
       # to pass api_version (or set the OPENAI_API_VERSION environment variable).
   )

   # Pass the LLM to ROSA
   rosa = ROSA(ros_version=2, llm=azure_llm, ...)
   ```
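For reference, here is a sketch of a `.env` file covering both setups; every value is a placeholder that you must replace with your own:

```
# Option 1: APIM with Tenant ID, Client ID, and Client Secret
APIM_SUBSCRIPTION_KEY=your_apim_subscription_key   # only if your APIM setup requires it
AZURE_TENANT_ID=your_tenant_id
AZURE_CLIENT_ID=your_client_id
AZURE_CLIENT_SECRET=your_client_secret
API_VERSION=2024-02-01   # example only; use a version your deployment supports

# Option 2: API key
AZURE_OPENAI_API_KEY=your_azure_openai_api_key

# Common to both options
DEPLOYMENT_ID=your_deployment_name
API_ENDPOINT=https://your-resource.openai.azure.com/
```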
> [!WARNING]
> `ChatOllama` usage is experimental.
>
> The latest Llama models support tool calling, which means they can theoretically be used for building agents. That said, ROSA was built to be used with state-of-the-art models like `gpt-4o`, which have large context windows, high throughput, and relatively low latency. Achieving similar performance with local models would require a lot of expensive hardware and additional setup and configuration.
>
> We have only experimented with `Llama3.1 8b`, and it is severely limited. We do not recommend using `Llama3.1 8b` with ROSA. Additionally, the core ROSA agent requires a context window of at least 4k tokens, so we recommend a minimum context length of 8192 tokens or more, regardless of which model you use.
To use Ollama with the `ChatOllama` model:

1. Ensure you have Ollama installed and running on your system.

2. Create a `ChatOllama` instance:

   ```python
   from langchain_ollama import ChatOllama

   ollama_llm = ChatOllama(
       model="llama3.1:70b",  # or your preferred model
       temperature=0,
       num_ctx=8192,  # adjust based on your model's context window
   )

   # Pass the LLM to ROSA
   rosa_instance = ROSA(ros_version=2, llm=ollama_llm, ...)
   ```
You can customize the `ChatOllama` instance with additional parameters:

- `base_url`: The base URL of your Ollama instance (default is `http://localhost:11434`)
- `callback_manager`: A callback manager for handling events
- `verbose`: Whether to print verbose output
If you're running Ollama on a different machine or port, specify the `base_url`:

```python
ollama_llm = ChatOllama(
    model="llama3.1",
    base_url="http://your-ollama-server:11434",
    # ... other parameters ...
)
```
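As an illustration of the callback-related parameters above, the following sketch streams tokens to stdout as they are generated. It assumes the `CallbackManager` and `StreamingStdOutCallbackHandler` exports from `langchain_core.callbacks`:

```python
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler
from langchain_ollama import ChatOllama

# Illustrative sketch: print tokens to stdout as the model generates them.
ollama_llm = ChatOllama(
    model="llama3.1",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    verbose=True,
)
```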
> [!NOTE]
> Ollama runs locally on your machine, so you don't need to set up API keys or environment variables. However, ensure that you have sufficient system resources to run your chosen model.
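If you want to confirm the server is reachable before wiring the model into ROSA, you can query Ollama's REST API. The sketch below assumes the default address and uses the `/api/tags` endpoint, which lists locally available models:

```python
import json
import urllib.request

# List the models available on the local Ollama server (default address assumed).
with urllib.request.urlopen("http://localhost:11434/api/tags") as response:
    tags = json.load(response)

print([model["name"] for model in tags.get("models", [])])
```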
For more detailed information on using these models, including advanced features like tool calling, streaming, and fine-tuning, refer to the official LangChain documentation for `ChatOpenAI`, `AzureChatOpenAI`, and `ChatOllama`.
> [!IMPORTANT]
> Remember to handle your API keys and secrets securely, preferably using environment variables or a secure secret management system.
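One simple habit in that direction, sketched below under the assumption that you load secrets with `python-dotenv` as in the examples above, is to fail fast when a required variable is missing instead of letting the client raise a less obvious error at request time:

```python
import os

from dotenv import load_dotenv

load_dotenv()

# Fail fast (illustrative): check for required secrets up front.
required = ["OPENAI_API_KEY"]  # adjust to match your chosen provider
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise EnvironmentError(f"Missing required environment variables: {missing}")
```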
Copyright (c) 2024. Jet Propulsion Laboratory. All rights reserved.