
feat: Add ability to configure OpenAI base URL in ChatGPTAgentConfig #577

Merged Jul 3, 2024 (8 commits)

Conversation

@celmore25 (Contributor) commented Jun 22, 2024

Motivation

Groq currently offers the highest-throughput publicly available models, which makes extremely low-latency voice interactions with LLMs possible. Allowing a custom OpenAI-API-compatible endpoint lets users take advantage of a broader set of hardware and model providers.

Changes

  • Added base_url parameter to ChatGPTAgentConfig to allow customization of the OpenAI API base URL.
  • Updated instantiate_openai_client function to use the base_url parameter from the configuration.
  • Modified ChatGPTAgent to utilize the updated instantiate_openai_client function.
  • Modified the get_tokenizer_info function to support llama models behind custom base URLs. Token usage is estimated rather than exact; supporting precise token counting for more models would require future work.
  • Added tests to verify the new base_url functionality in tests/streaming/agent/test_base_agent.py.

This enhancement allows users to specify a custom OpenAI API base URL, providing greater flexibility in agent configuration.
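The token estimation mentioned above can be sketched as follows. This is an illustrative approximation only, not the PR's actual implementation: `estimate_tokens` is a hypothetical helper using a crude characters-per-token heuristic, since llama-family tokenizers are not available through tiktoken's model registry.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count for a llama-family model behind a custom base URL.

    Heuristic: roughly 4 characters per token for English text.
    Good enough for latency/throughput budgeting; not exact.
    """
    return max(1, len(text) // 4)
```

An exact count would require running the model's own tokenizer, which is why the PR describes this as an estimate.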

Usage

End-to-end usage example:

Environment variables (set in the shell or in a `.env` file):

```shell
OPENAI_MODEL_NAME="llama3-70b-8192"
OPENAI_BASE_URL="https://api.groq.com/openai/v1"
OPENAI_API_KEY="<key here>"
ELEVENLABS_API_KEY="<key here>"
ELEVENLABS_VOICE_ID="<ID here>"
DEEPGRAM_API_KEY="<key here>"
```
```python
import asyncio
import os
import signal

from pydantic_settings import BaseSettings, SettingsConfigDict

from vocode.helpers import create_streaming_microphone_input_and_speaker_output
from vocode.logging import configure_pretty_logging
from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
from vocode.streaming.models.agent import ChatGPTAgentConfig
from vocode.streaming.models.audio import AudioEncoding
from vocode.streaming.models.message import BaseMessage
from vocode.streaming.models.synthesizer import ElevenLabsSynthesizerConfig
from vocode.streaming.models.transcriber import (
    DeepgramTranscriberConfig,
    PunctuationEndpointingConfig,
)
from vocode.streaming.streaming_conversation import StreamingConversation
from vocode.streaming.synthesizer.eleven_labs_synthesizer import ElevenLabsSynthesizer
from vocode.streaming.transcriber.deepgram_transcriber import DeepgramTranscriber

configure_pretty_logging()


class Settings(BaseSettings):
    """
    Settings for the streaming conversation quickstart.
    These parameters can be configured with environment variables.
    """

    openai_api_key: str = os.environ.get("OPENAI_API_KEY", "")
    openai_model_name: str = os.environ.get("OPENAI_MODEL_NAME", "")
    openai_base_url: str = os.environ.get("OPENAI_BASE_URL", "")
    elevenlabs_api_key: str = os.environ.get("ELEVENLABS_API_KEY", "")
    elevenlabs_voice_id: str = os.environ.get("ELEVENLABS_VOICE_ID", "")
    deepgram_api_key: str = os.environ.get("DEEPGRAM_API_KEY", "")

    # A .env file can be used to override these settings, e.g.
    # "OPENAI_API_KEY=my_key" sets openai_api_key over the default above.
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )


settings = Settings()


async def main():
    (
        microphone_input,
        speaker_output,
    ) = create_streaming_microphone_input_and_speaker_output(
        use_default_devices=False,
        speaker_sampling_rate=16000,
    )

    conversation = StreamingConversation(
        output_device=speaker_output,
        transcriber=DeepgramTranscriber(
            DeepgramTranscriberConfig.from_input_device(
                microphone_input,
                endpointing_config=PunctuationEndpointingConfig(),
                api_key=settings.deepgram_api_key,
            ),
        ),
        agent=ChatGPTAgent(
            ChatGPTAgentConfig(
                openai_api_key=settings.openai_api_key,
                model_name=settings.openai_model_name,
                base_url_override=settings.openai_base_url,
                initial_message=BaseMessage(text="What up"),
                prompt_preamble="""The AI is having a pleasant conversation about life""",
            )
        ),
        synthesizer=ElevenLabsSynthesizer(
            ElevenLabsSynthesizerConfig(
                api_key=settings.elevenlabs_api_key,
                voice_id=settings.elevenlabs_voice_id,
                sampling_rate=16000,
                audio_encoding=AudioEncoding.LINEAR16,
            )
        ),
    )
    await conversation.start()
    print("Conversation started, press Ctrl+C to end")
    signal.signal(
        signal.SIGINT,
        lambda _0, _1: asyncio.create_task(conversation.terminate()),
    )
    while conversation.is_active():
        chunk = await microphone_input.get_audio()
        conversation.receive_audio(chunk)


if __name__ == "__main__":
    asyncio.run(main())
```

@ajar98 (Contributor) left a comment:
@celmore25 this is an awesome change! should be simple to get in my suggestion and then let's get this in

```diff
@@ -115,6 +115,7 @@ class ChatGPTAgentConfig(AgentConfig, type=AgentType.CHAT_GPT.value):  # type: i
     openai_api_key: Optional[str] = None
     prompt_preamble: str
     model_name: str = CHAT_GPT_AGENT_DEFAULT_MODEL_NAME
+    base_url: Optional[str] = None
```
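For context, this config field (later renamed `base_url_override` in the merged version) is plumbed into client construction. A minimal illustrative sketch of that plumbing, using a simplified stand-in config class rather than the real vocode types:

```python
from typing import Optional


class ChatGPTAgentConfig:
    """Simplified stand-in for vocode's real ChatGPTAgentConfig."""

    def __init__(
        self,
        openai_api_key: Optional[str] = None,
        base_url_override: Optional[str] = None,
    ):
        self.openai_api_key = openai_api_key
        self.base_url_override = base_url_override


def instantiate_openai_client(config: ChatGPTAgentConfig):
    # Passing base_url=None makes the openai client fall back to its
    # default endpoint (https://api.openai.com/v1), so existing configs
    # keep working unchanged.
    from openai import AsyncOpenAI

    return AsyncOpenAI(
        api_key=config.openai_api_key,
        base_url=config.base_url_override,
    )
```

Keeping `None` as the default is what makes the override approach less invasive than hard-coding a default URL in the config.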
@ajar98 commented on the diff:
let's either name this base_url_override or default it to "https://api.openai.com/v1" — i'd prefer the former since it changes the code less

@celmore25 (Author) replied:
Hi @ajar98 nice to meet you! Thanks for looking over this.

I just pushed some changes to go to the override option. Let me know if anything else needs to get changed!

@ajar98 ajar98 requested a review from adnaans June 27, 2024 23:23
@ajar98 (Contributor) commented Jul 3, 2024

thanks @celmore25 ! Just had to fix the test

BTW - we have inbuilt support for groq - see

class GroqAgent(RespondAgent[GroqAgentConfig]):

still think it's quite useful to support any vLLM compatible API with this change!

@ajar98 ajar98 merged commit 918412c into vocodedev:main Jul 3, 2024
4 checks passed
@arpagon (Contributor) commented Jul 3, 2024

Great, this will allow you to use Ollama, for example.
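As a concrete illustration, the same override can point at any server that speaks the OpenAI chat API. The sketch below builds config kwargs for Groq and for a local Ollama instance; `openai_compatible_config` is a hypothetical helper, the model names are examples, and Ollama's OpenAI-compatible endpoint is assumed to be at its default `http://localhost:11434/v1`.

```python
def openai_compatible_config(model_name: str, base_url_override: str, api_key: str) -> dict:
    """Build the kwargs you would pass to ChatGPTAgentConfig.

    Any OpenAI-compatible server works: Groq, vLLM, Ollama, etc.
    """
    return {
        "model_name": model_name,
        "base_url_override": base_url_override,
        "openai_api_key": api_key,
    }


groq = openai_compatible_config(
    "llama3-70b-8192", "https://api.groq.com/openai/v1", "<groq key>"
)

# Ollama serves an OpenAI-compatible API locally; it ignores the API key,
# but the openai client still requires a non-empty value.
ollama = openai_compatible_config(
    "llama3", "http://localhost:11434/v1", "ollama"
)
```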

ajar98 added a commit that referenced this pull request Jul 5, 2024
* feat: Add ability to configure OpenAI base URL in ChatGPTAgentConfig (#577)

  * feat: Add ability to configure OpenAI base URL in ChatGPTAgentConfig
  * adding capability to use the openai compatible endpoint with token estimation for llama
  * lint fix
  * changing openai base_url parameter for overall less code changes
  * missed logging update
  * Update vocode/streaming/agent/chat_gpt_agent.py
  * Update tests/streaming/agent/test_base_agent.py
  * fix test

  Co-authored-by: Ajay Raj <[email protected]>