Skip to content

Commit

Permalink
add langchain agent docs (vocodedev#564)
Browse files Browse the repository at this point in the history
  • Loading branch information
adnaans committed Jun 19, 2024
1 parent 479ab54 commit 61e285f
Show file tree
Hide file tree
Showing 4 changed files with 340 additions and 191 deletions.
5 changes: 3 additions & 2 deletions docs/mint.json
Original file line number Diff line number Diff line change
Expand Up @@ -82,13 +82,14 @@
"open-source/sentry",
"open-source/logging-with-loguru",
"open-source/turn-based-conversation",
"open-source/language-support"
"open-source/language-support",
"open-source/langchain-agent"
]
},
{
"group": "Legacy (0.0.111) Guides",
"pages": [
"open-source/langchain-agent",
"open-source/langchain-agent-dep",
"open-source/local-conversation"
]
},
Expand Down
209 changes: 209 additions & 0 deletions docs/open-source/langchain-agent-dep.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,209 @@
---
title: "(Deprecated) Langchain Agent"
description: "Empower Langchain agents to interact with the real world via phone calls"
---

# Introduction

This example shows how to use Vocode as a tool augmenting the abilities of a Langchain
agent. By providing it with access to Vocode, a [Langchain Agent](https://python.langchain.com/en/latest/modules/agents.html)
can now make autonomous phone calls and take action based on the outcome of the calls.

Our demo will walk through how to instruct the agent to lookup a phone number and make the
appropriate call.

<div>
<iframe
style={{
width: "100%",
height: "300px",
paddingLeft: "75px",
paddingRight: "75px",
}}
src="https://www.loom.com/embed/ba80f55524944ccc8c3b437b7a909270"
></iframe>
</div>
<br />

# How to run it

## Requirements

1. Install [Ngrok](https://ngrok.com/)
2. Install [Redis](https://redis.com/)
3. Install [Poetry](https://python-poetry.org/)

## Run the example

Note: `gpt-4` is required for this to work. `gpt-3.5-turbo` or older models are not smart enough to parse JSON responses in Langchain agents reliably out-of-the-box.

To get started, clone the Vocode repo or copy the [Langchain agent app](https://github.com/vocodedev/vocode-python/tree/main/apps/langchain_agent) directory.

```bash
git clone https://github.com/vocodedev/vocode-python.git
```

### Environment

1. Copy the `.env.template` and fill in your API keys. You'll need:

- [Deepgram](https://deepgram.com) (for speech transcription)
- [OpenAI](https://platform.openai.com) (for the underlying agent)
- [Azure](https://azure.microsoft.com/en-us/products/cognitive-services/text-to-speech/) (for speech synthesis)
- [Twilio](https://console.twilio.com/) (for telephony)

2. Tunnel port 3000 to ngrok by running:

```
ngrok http 3000
```

Fill in the `TELEPHONY_SERVER_BASE_URL` environment variable with your ngrok base URL: don't include `https://` so should be something like:

```
TELEPHONY_SERVER_BASE_URL=asdf1234.ngrok.app
```

3. Buy a phone number on Twilio or [verify your caller ID](https://support.twilio.com/hc/en-us/articles/223180048-Adding-a-Verified-Phone-Number-or-Caller-ID-with-Twilio) to use as the outbound phone number.
Set this phone number as the `OUTBOUND_CALLER_NUMBER` environment variable. Include `+` and the area code, so for a US phone number, it would look something like.

```
OUTBOUND_CALLER_NUMBER=+15555555555
```

### Set up self-hosted telephony server

Run the following setups from the `langchain_agent` directory.

#### Running with Docker

1. Build the telephony server Docker image

```
docker build -t vocode-langchain-agent-telephony-app .
```

2. Run the service using `docker-compose`

```
docker-compose up
```

#### Running with Python

1. (optional) Set up a Python environment: we recommend `virtualenv`

```
python3 -m venv venv
source venv/bin/activate
```

2. Install requirements

```bash
poetry install
```

3. Run an instance of Redis at http://localhost:6379. With Docker, this can be done with:

```
docker run -dp 6379:6379 -it redis/redis-stack:latest
```

4. Run the `TelephonyServer`:

```bash
uvicorn telephony_app:app --reload --port 3000
```

### Set up the Langchain agent

With the self-hosted telephony server running:

1. Update the phone numbers in the contact book in `tools/contacts.py`

```python
CONTACTS = [{"name": "Kian", "phone": "+123456789"}]
```

2. Run `main.py`

```bash
poetry install
poetry run python main.py
```

# Code explanation
The Langchain agent is implemented in `main.py`. It uses the Langchain library to initialize an agent that can have a conversation.

## Langchain Agent
`main.py` instantiates the langchain agent and relevant tools. It sets an objective, initializes a Langchain agent, and runs the conversation.
```python
agent = initialize_agent(
tools=[get_all_contacts, call_phone_number, word_of_the_day],
llm=llm,
agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
verbose=verbose,
memory=memory,
)
```

## Langchain tools

### Tool to get all contacts
```python
@tool("get_all_contacts")
def get_all_contacts(placeholder: str) -> List[dict]:
"""Get contacts."""
return CONTACTS
```

### Tool to call phone number
`tools/vocode.py` makes use of the `OutboundCall` class to initiate a phone call
```python
@tool("call phone number")
def call_phone_number(input: str) -> str:
"""calls a phone number as a bot and returns a transcript of the conversation.
the input to this tool is a pipe separated list of a phone number, a prompt, and the first thing the bot should say.
The prompt should instruct the bot with what to do on the call and be in the 3rd person,
like 'the assistant is performing this task' instead of 'perform this task'.
should only use this tool once it has found an adequate phone number to call.
for example, `+15555555555|the assistant is explaining the meaning of life|i'm going to tell you the meaning of life` will call +15555555555, say 'i'm going to tell you the meaning of life', and instruct the assistant to tell the human what the meaning of life is.
"""
phone_number, prompt, initial_message = input.split("|", 2)
call = OutboundCall(
base_url=os.environ["TELEPHONY_SERVER_BASE_URL"],
to_phone=phone_number,
from_phone=os.environ["OUTBOUND_CALLER_NUMBER"],
config_manager=RedisConfigManager(),
agent_config=ChatGPTAgentConfig(
prompt_preamble=prompt,
initial_message=BaseMessage(text=initial_message),
),
logger=logging.Logger("call_phone_number"),
)
LOOP.run_until_complete(call.start())
while True:
maybe_transcript = get_transcript(call.conversation_id)
if maybe_transcript:
delete_transcript(call.conversation_id)
return maybe_transcript
else:
time.sleep(1)
```

## TelephonyServer
`telephony_app.py` instantiates a `TelephonyServer` object to manage the phone call initiated by `OutboundCall`
```python
telephony_server = TelephonyServer(
base_url=BASE_URL,
config_manager=config_manager,
inbound_call_configs=[],
events_manager=EventsManager(),
logger=logger,
)

app.include_router(telephony_server.get_router())
```
Loading

0 comments on commit 61e285f

Please sign in to comment.