Merge branch 'main' of https://github.com/katanemo/arch into cotran/hallu-fix
cotran2 committed Oct 15, 2024
2 parents 5049082 + 35c5e30 commit b8c6bd7
Showing 43 changed files with 865 additions and 644 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/checks.yml
@@ -1,6 +1,9 @@
name: Checks

on: pull_request
on:
pull_request:
push:
branches: [main]

jobs:
test:
1 change: 1 addition & 0 deletions .gitignore
@@ -30,3 +30,4 @@ model_server/venv_model_server
model_server/build
model_server/dist
arch_logs/
dist/
17 changes: 8 additions & 9 deletions README.md
@@ -2,16 +2,18 @@
<img src="docs/source/_static/img/arch-logo.png" alt="Arch Gateway Logo" title="Arch Gateway Logo">
</p>

## Build fast, robust, and personalized GenAI apps (agents, assistants, etc.)
## Build fast, robust, and personalized AI agents.

Arch is an intelligent [Layer 7](https://www.cloudflare.com/learning/ddos/what-is-layer-7/) gateway designed for generative AI apps, AI agents, and co-pilots that work with prompts. Engineered with purpose-built LLMs, Arch handles the critical but undifferentiated tasks related to the handling and processing of prompts, including detecting and rejecting [jailbreak](https://github.com/verazuo/jailbreak_llms) attempts, intelligently calling "backend" APIs to fulfill the user's request represented in a prompt, routing to and offering disaster recovery between upstream LLMs, and managing the observability of prompts and LLM interactions in a centralized way.
Arch is an intelligent [Layer 7](https://www.cloudflare.com/learning/ddos/what-is-layer-7/) gateway designed to protect, observe, and personalize LLM applications (agents, assistants, co-pilots) with your APIs.

Engineered with purpose-built LLMs, Arch handles the critical but undifferentiated tasks related to the handling and processing of prompts, including detecting and rejecting [jailbreak](https://github.com/verazuo/jailbreak_llms) attempts, intelligently calling "backend" APIs to fulfill the user's request represented in a prompt, routing to and offering disaster recovery between upstream LLMs, and managing the observability of prompts and LLM interactions in a centralized way.

Arch is built on (and by the core contributors of) [Envoy Proxy](https://www.envoyproxy.io/) with the belief that:

>Prompts are nuanced and opaque user requests, which require the same capabilities as traditional HTTP requests including secure handling, intelligent routing, robust observability, and integration with backend (API) systems for personalization – all outside business logic.*
**Core Features**:
- Built on [Envoy](https://envoyproxy.io): Arch runs alongside application servers, and builds on top of Envoy's proven HTTP management and scalability features to handle ingress and egress traffic related to prompts and LLMs
- Built on [Envoy](https://envoyproxy.io): Arch runs alongside application servers, and builds on top of Envoy's proven HTTP management and scalability features to handle ingress and egress traffic related to prompts and LLMs.
- Function Calling for fast Agentic and RAG apps. Engineered with purpose-built [LLMs](https://huggingface.co/collections/katanemo/arch-function-66f209a693ea8df14317ad68) to handle fast, cost-effective, and accurate prompt-based tasks like function/API calling, and parameter extraction from prompts.
- Prompt [Guard](https://huggingface.co/collections/katanemo/arch-guard-6702bdc08b889e4bce8f446d): Arch centralizes prompt guardrails to prevent jailbreak attempts and ensure safe user interactions without writing a single line of code.
- Traffic Management: Arch manages LLM calls, offering smart retries, automatic cutover, and resilient upstream connections for continuous availability.
@@ -20,7 +22,7 @@ Arch is an intelligent [Layer 7](https://www.cloudflare.com/learning/ddos/what-i
**Jump to our [docs](https://docs.archgw.com)** to learn how you can use Arch to improve the speed, security and personalization of your GenAI apps.

## Contact
To get in touch with us, please join our [discord server](https://discord.gg/rbjqVbpa). We will be monitoring that actively and offering support there.
To get in touch with us, please join our [discord server](https://discord.gg/rSRQ9fv7). We will be monitoring that actively and offering support there.

## Demos
* [Function Calling](demos/function_calling/README.md) - Walk through of critical function calling capabilities
@@ -35,7 +37,7 @@ Follow this guide to learn how to quickly set up Arch and integrate it into your

Before you begin, ensure you have the following:

- `Docker` & `Python` verion 3.10 installed on your system
- `Docker` & `Python` installed on your system
- `API Keys` for LLM providers (if using external LLMs)

### Step 1: Install Arch
@@ -109,15 +111,12 @@ Make outbound calls via Arch
import openai

# Set the OpenAI API base URL to the Arch gateway endpoint
openai.api_base = "http://127.0.0.1:12000/"
openai.api_base = "http://127.0.0.1:51001/v1"

# No need to set openai.api_key since it's configured in Arch's gateway

# Use the OpenAI client as usual
# we set api_key to '--' because the openai client would fail to initiate the request without it. Just pass any
# dummy value here since the arch gateway will properly pass the access key before making the outbound call.
response = openai.Completion.create(
api_key="--",
model="text-davinci-003",
prompt="What is the capital of France?"
)
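
For reference, a minimal sketch of the same outbound call using the newer `openai>=1.0` client interface; the `base_url`, dummy key, and model name below are taken from the snippet above and are illustrative, not part of this commit:

```python
# A minimal sketch using the openai>=1.0 client, pointed at the Arch gateway.
# The base_url and model below are assumptions taken from the README example above.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:51001/v1",  # Arch gateway endpoint (assumed)
    api_key="--",  # dummy value; Arch injects the real provider key on the outbound call
)

response = client.completions.create(
    model="text-davinci-003",  # illustrative model name from the README example
    prompt="What is the capital of France?",
)
print(response.choices[0].text)
```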
2 changes: 2 additions & 0 deletions arch/docker-compose.yaml
@@ -12,3 +12,5 @@ services:
- ~/archgw_logs:/var/log/
env_file:
- stage.env
extra_hosts:
- "host.docker.internal:host-gateway"
13 changes: 11 additions & 2 deletions arch/tools/README.md
@@ -56,9 +56,18 @@ sh build_cli.sh
archgw build
```

## Step 5: start model server in the background
### Step 5: download models
This downloads the models ahead of time so model_server can load faster. This only needs to be done once.

```bash
archgw download-models
```
archgw up --services model_server

### Logs
The `archgw` command can also view logs from the gateway and model_server. Use the following command to view logs:

```bash
archgw logs --follow
```
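
Under the hood (per the `stream_gateway_logs` change to `cli/core.py` later in this diff), the CLI streams these logs by shelling out to docker compose; a minimal sketch of the equivalent direct invocation, assuming the gateway was started under the compose project name `arch`:

```python
# Minimal equivalent of `archgw logs --follow`, mirroring stream_gateway_logs in cli/core.py.
# Assumes the gateway services were started under the compose project name "arch".
import subprocess

subprocess.run(["docker", "compose", "-p", "arch", "logs", "-f"], check=True)
```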

## Uninstall Instructions: archgw CLI
1 change: 1 addition & 0 deletions arch/tools/cli/consts.py
@@ -0,0 +1 @@
KATANEMO_DOCKERHUB_REPO = "katanemo/archgw"
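
A constant like this is typically interpolated into image references by the CLI; a hypothetical sketch of one way it might be used to pull a tagged gateway image (the helper function and the `latest` tag are assumptions, not part of this commit):

```python
# Hypothetical helper showing one way KATANEMO_DOCKERHUB_REPO could be used.
# The function name and the "latest" tag are illustrative assumptions.
import subprocess

KATANEMO_DOCKERHUB_REPO = "katanemo/archgw"

def pull_gateway_image(tag: str = "latest") -> None:
    """Pull the Arch gateway image from Docker Hub."""
    subprocess.run(["docker", "pull", f"{KATANEMO_DOCKERHUB_REPO}:{tag}"], check=True)

pull_gateway_image()
```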
67 changes: 52 additions & 15 deletions arch/tools/cli/core.py
@@ -4,6 +4,37 @@
import pkg_resources
import select
from cli.utils import run_docker_compose_ps, print_service_status, check_services_state
from cli.utils import getLogger
import sys

log = getLogger(__name__)


def stream_gateway_logs(follow):
"""
Stream logs from the arch gateway service.
"""
compose_file = pkg_resources.resource_filename(
__name__, "../config/docker-compose.yaml"
)

log.info("Logs from arch gateway service.")

options = ["docker", "compose", "-p", "arch", "logs"]
if follow:
options.append("-f")
try:
# Run `docker-compose logs` to stream logs from the gateway service
subprocess.run(
options,
cwd=os.path.dirname(compose_file),
check=True,
stdout=sys.stdout,
stderr=sys.stderr,
)

except subprocess.CalledProcessError as e:
log.info(f"Failed to stream logs: {str(e)}")


def start_arch(arch_config_file, env, log_timeout=120):
@@ -14,7 +45,7 @@ def start_arch(arch_config_file, env, log_timeout=120):
path (str): The path where the prompt_confi.yml file is located.
log_timeout (int): Time in seconds to show logs before checking for healthy state.
"""

log.info("Starting arch gateway")
compose_file = pkg_resources.resource_filename(
__name__, "../config/docker-compose.yaml"
)
@@ -35,9 +66,10 @@ def start_arch(arch_config_file, env, log_timeout=120):
), # Ensure the Docker command runs in the correct path
env=env, # Pass the modified environment
check=True, # Raise an exception if the command fails
stderr=subprocess.PIPE,
stdout=subprocess.PIPE,
)
print(f"Arch docker-compose started in detached.")
print("Monitoring `docker-compose ps` logs...")
log.info(f"Arch docker-compose started in detached.")

start_time = time.time()
services_status = {}
@@ -51,14 +83,14 @@ def start_arch(arch_config_file, env, log_timeout=120):

# Check if timeout is reached
if elapsed_time > log_timeout:
print(f"Stopping log monitoring after {log_timeout} seconds.")
log.info(f"Stopping log monitoring after {log_timeout} seconds.")
break

current_services_status = run_docker_compose_ps(
compose_file=compose_file, env=env
)
if not current_services_status:
print(
log.info(
"Status for the services could not be detected. Something went wrong. Please run docker logs"
)
break
@@ -74,11 +106,11 @@ def start_arch(arch_config_file, env, log_timeout=120):
running_states = ["running", "up"]

if check_services_state(current_services_status, running_states):
print("Arch is up and running!")
log.info("Arch gateway is up and running!")
break

if check_services_state(current_services_status, unhealthy_states):
print(
log.info(
"One or more Arch services are unhealthy. Please run `docker logs` for more information"
)
print_service_status(
@@ -92,7 +124,7 @@ def start_arch(arch_config_file, env, log_timeout=120):
services_status[service_name]["State"]
!= current_services_status[service_name]["State"]
):
print(
log.info(
"One or more Arch services have changed state. Printing current state"
)
print_service_status(current_services_status)
@@ -101,7 +133,7 @@ def start_arch(arch_config_file, env, log_timeout=120):
services_status = current_services_status

except subprocess.CalledProcessError as e:
print(f"Failed to start Arch: {str(e)}")
log.info(f"Failed to start Arch: {str(e)}")


def stop_arch():
@@ -115,17 +147,21 @@ def stop_arch():
__name__, "../config/docker-compose.yaml"
)

log.info("Shutting down arch gateway service.")

try:
# Run `docker-compose down` to shut down all services
subprocess.run(
["docker", "compose", "-p", "arch", "down"],
cwd=os.path.dirname(compose_file),
check=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
print("Successfully shut down all services.")
log.info("Successfully shut down arch gateway service.")

except subprocess.CalledProcessError as e:
print(f"Failed to shut down services: {str(e)}")
log.info(f"Failed to shut down services: {str(e)}")


def start_arch_modelserver():
@@ -134,12 +170,13 @@ def start_arch_modelserver():
"""
try:
log.info("archgw_modelserver restart")
subprocess.run(
["archgw_modelserver", "restart"], check=True, start_new_session=True
)
print("Successfull run the archgw model_server")
log.info("Successfull ran model_server")
except subprocess.CalledProcessError as e:
print(f"Failed to start model_server. Please check archgw_modelserver logs")
log.info(f"Failed to start model_server. Please check archgw_modelserver logs")
sys.exit(1)


@@ -153,7 +190,7 @@ def stop_arch_modelserver():
["archgw_modelserver", "stop"],
check=True,
)
print("Successfull stopped the archgw model_server")
log.info("Successfull stopped the archgw model_server")
except subprocess.CalledProcessError as e:
print(f"Failed to start model_server. Please check archgw_modelserver logs")
log.info(f"Failed to start model_server. Please check archgw_modelserver logs")
sys.exit(1)