add langchain agent docs (vocodedev#564)

rjheeta · Jun 19, 2024 · 61e285f · 61e285f
1 parent 479ab54
commit 61e285f
Show file tree

Hide file tree

Showing 4 changed files with 340 additions and 191 deletions.
diff --git a/docs/mint.json b/docs/mint.json
@@ -82,13 +82,14 @@
         "open-source/sentry",
         "open-source/logging-with-loguru",
         "open-source/turn-based-conversation",
-        "open-source/language-support"
+        "open-source/language-support",
+        "open-source/langchain-agent"
       ]
     },
     {
       "group": "Legacy (0.0.111) Guides",
       "pages": [
-        "open-source/langchain-agent",
+        "open-source/langchain-agent-dep",
         "open-source/local-conversation"
       ]
     },

diff --git a/docs/open-source/langchain-agent-dep.mdx b/docs/open-source/langchain-agent-dep.mdx
@@ -0,0 +1,209 @@
+---
+title: "(Deprecated) Langchain Agent"
+description: "Empower Langchain agents to interact with the real world via phone calls"
+---
+
+# Introduction
+
+This example shows how to use Vocode as a tool augmenting the abilities of a Langchain
+agent. By providing it with access to Vocode, a [Langchain Agent](https://python.langchain.com/en/latest/modules/agents.html)
+can now make autonomous phone calls and take action based on the outcome of the calls.
+
+Our demo will walk through how to instruct the agent to lookup a phone number and make the
+appropriate call.
+
+<div>
+  <iframe
+    style={{
+      width: "100%",
+      height: "300px",
+      paddingLeft: "75px",
+      paddingRight: "75px",
+    }}
+    src="https://www.loom.com/embed/ba80f55524944ccc8c3b437b7a909270"
+  ></iframe>
+</div>
+<br />
+
+# How to run it
+
+## Requirements
+
+1. Install [Ngrok](https://ngrok.com/)
+2. Install [Redis](https://redis.com/)
+3. Install [Poetry](https://python-poetry.org/)
+
+## Run the example
+
+Note: `gpt-4` is required for this to work. `gpt-3.5-turbo` or older models are not smart enough to parse JSON responses in Langchain agents reliably out-of-the-box.
+
+To get started, clone the Vocode repo or copy the [Langchain agent app](https://github.com/vocodedev/vocode-python/tree/main/apps/langchain_agent) directory.
+
+```bash
+git clone https://github.com/vocodedev/vocode-python.git
+```
+
+### Environment
+
+1. Copy the `.env.template` and fill in your API keys. You'll need:
+
+- [Deepgram](https://deepgram.com) (for speech transcription)
+- [OpenAI](https://platform.openai.com) (for the underlying agent)
+- [Azure](https://azure.microsoft.com/en-us/products/cognitive-services/text-to-speech/) (for speech synthesis)
+- [Twilio](https://console.twilio.com/) (for telephony)
+
+2. Tunnel port 3000 to ngrok by running:
+
+```
+ngrok http 3000
+```
+
+Fill in the `TELEPHONY_SERVER_BASE_URL` environment variable with your ngrok base URL: don't include `https://` so should be something like:
+
+```
+TELEPHONY_SERVER_BASE_URL=asdf1234.ngrok.app
+```
+
+3. Buy a phone number on Twilio or [verify your caller ID](https://support.twilio.com/hc/en-us/articles/223180048-Adding-a-Verified-Phone-Number-or-Caller-ID-with-Twilio) to use as the outbound phone number.
+   Set this phone number as the `OUTBOUND_CALLER_NUMBER` environment variable. Include `+` and the area code, so for a US phone number, it would look something like.
+
+```
+OUTBOUND_CALLER_NUMBER=+15555555555
+```
+
+### Set up self-hosted telephony server
+
+Run the following setups from the `langchain_agent` directory.
+
+#### Running with Docker
+
+1. Build the telephony server Docker image
+
+```
+docker build -t vocode-langchain-agent-telephony-app .
+```
+
+2. Run the service using `docker-compose`
+
+```
+docker-compose up
+```
+
+#### Running with Python
+
+1. (optional) Set up a Python environment: we recommend `virtualenv`
+
+```
+python3 -m venv venv
+source venv/bin/activate
+```
+
+2. Install requirements
+
+```bash
+poetry install
+```
+
+3. Run an instance of Redis at http://localhost:6379. With Docker, this can be done with:
+
+```
+docker run -dp 6379:6379 -it redis/redis-stack:latest
+```
+
+4. Run the `TelephonyServer`:
+
+```bash
+uvicorn telephony_app:app --reload --port 3000
+```
+
+### Set up the Langchain agent
+
+With the self-hosted telephony server running:
+
+1. Update the phone numbers in the contact book in `tools/contacts.py`
+
+```python
+CONTACTS = [{"name": "Kian", "phone": "+123456789"}]
+```
+
+2. Run `main.py`
+
+```bash
+poetry install
+poetry run python main.py
+```
+
+# Code explanation
+The Langchain agent is implemented in `main.py`. It uses the Langchain library to initialize an agent that can have a conversation.
+
+## Langchain Agent
+`main.py` instantiates the langchain agent and relevant tools. It sets an objective, initializes a Langchain agent, and runs the conversation.
+```python
+agent = initialize_agent(
+    tools=[get_all_contacts, call_phone_number, word_of_the_day],
+    llm=llm,
+    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
+    verbose=verbose,
+    memory=memory,
+)
+```
+
+## Langchain tools
+
+### Tool to get all contacts
+```python
+@tool("get_all_contacts")
+def get_all_contacts(placeholder: str) -> List[dict]:
+    """Get contacts."""
+    return CONTACTS
+```
+
+### Tool to call phone number
+`tools/vocode.py` makes use of the `OutboundCall` class to initiate a phone call
+```python
+@tool("call phone number")
+def call_phone_number(input: str) -> str:
+    """calls a phone number as a bot and returns a transcript of the conversation.
+    the input to this tool is a pipe separated list of a phone number, a prompt, and the first thing the bot should say.
+    The prompt should instruct the bot with what to do on the call and be in the 3rd person,
+    like 'the assistant is performing this task' instead of 'perform this task'.
+
+    should only use this tool once it has found an adequate phone number to call.
+
+    for example, `+15555555555|the assistant is explaining the meaning of life|i'm going to tell you the meaning of life` will call +15555555555, say 'i'm going to tell you the meaning of life', and instruct the assistant to tell the human what the meaning of life is.
+    """
+    phone_number, prompt, initial_message = input.split("|", 2)
+    call = OutboundCall(
+        base_url=os.environ["TELEPHONY_SERVER_BASE_URL"],
+        to_phone=phone_number,
+        from_phone=os.environ["OUTBOUND_CALLER_NUMBER"],
+        config_manager=RedisConfigManager(),
+        agent_config=ChatGPTAgentConfig(
+            prompt_preamble=prompt,
+            initial_message=BaseMessage(text=initial_message),
+        ),
+        logger=logging.Logger("call_phone_number"),
+    )
+    LOOP.run_until_complete(call.start())
+    while True:
+        maybe_transcript = get_transcript(call.conversation_id)
+        if maybe_transcript:
+            delete_transcript(call.conversation_id)
+            return maybe_transcript
+        else:
+            time.sleep(1)
+```
+
+## TelephonyServer
+`telephony_app.py` instantiates a `TelephonyServer` object to manage the phone call initiated by `OutboundCall`
+```python
+telephony_server = TelephonyServer(
+    base_url=BASE_URL,
+    config_manager=config_manager,
+    inbound_call_configs=[],
+    events_manager=EventsManager(),
+    logger=logger,
+)
+
+app.include_router(telephony_server.get_router())
+```