diff --git a/README.md b/README.md index e142cc0..428b23d 100644 --- a/README.md +++ b/README.md @@ -80,6 +80,13 @@ An estimated 31% of LLM queries are potentially redundant ([source](https://arxi | [/semantic-cache/doc2cache_llama3_1.ipynb](python-recipes/semantic-cache/doc2cache_llama3_1.ipynb) | Build a semantic cache using the Doc2Cache framework and Llama3.1 | | [/semantic-cache/semantic_caching_gemini.ipynb](python-recipes/semantic-cache/semantic_caching_gemini.ipynb) | Build a semantic cache with Redis and Google Gemini | +### Semantic Routing +Routing is a simple and effective way of preventing misuses with your AI application or for creating branching logic between data sources etc. + +| Recipe | Description | +| --- | --- | +| [/semantic-router/00_semantic_routing.ipynb](python-recipes/semantic-router/00_semantic_routing.ipynb) | Simple examples of how to build an allow/block list router in addition to a multi-topic router | + ### Agents | Recipe | Description | diff --git a/python-recipes/semantic-router/00_semantic_routing.ipynb b/python-recipes/semantic-router/00_semantic_routing.ipynb new file mode 100644 index 0000000..0b6f98e --- /dev/null +++ b/python-recipes/semantic-router/00_semantic_routing.ipynb @@ -0,0 +1,692 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "cbba56a9", + "metadata": {}, + "source": [ + "![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)\n", + "# Semantic Routing\n", + "\n", + "RedisVL provides a `SemanticRouter` interface to utilize Redis' built-in search & aggregation in order to perform\n", + "KNN-style classification over a set of `Route` references to determine the best match.\n", + "\n", + "This notebook will go over how to use Redis as a Semantic Router for your applications.\n", + "\n", + "## Let's Begin!\n", + "\"Open\n" + ] + }, + { + "cell_type": "markdown", + "id": "19bdc2a5-2192-4f5f-bd6e-7c956fd0e230", + "metadata": {}, + "source": [ + "# Setup\n", + "\n", + "## Install Packages" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "c620286e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -q redisvl sentence-transformers" + ] + }, + { + "cell_type": "markdown", + "id": "323aec7f", + "metadata": {}, + "source": [ + "## Run a Redis instance\n", + "\n", + "#### For Colab\n", + "Use the shell script below to download, extract, and install [Redis Stack](https://redis.io/docs/getting-started/install-stack/) directly from the Redis package archive." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2cb85a99", + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "%%sh\n", + "curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg\n", + "echo \"deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main\" | sudo tee /etc/apt/sources.list.d/redis.list\n", + "sudo apt-get update > /dev/null 2>&1\n", + "sudo apt-get install redis-stack-server > /dev/null 2>&1\n", + "redis-stack-server --daemonize yes" + ] + }, + { + "cell_type": "markdown", + "id": "7c5dbaaf", + "metadata": {}, + "source": [ + "#### For Alternative Environments\n", + "There are many ways to get the necessary redis-stack instance running\n", + "1. On cloud, deploy a [FREE instance of Redis in the cloud](https://redis.com/try-free/). Or, if you have your\n", + "own version of Redis Enterprise running, that works too!\n", + "2. Per OS, [see the docs](https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/)\n", + "3. With docker: `docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest`" + ] + }, + { + "cell_type": "markdown", + "id": "1d4499ae", + "metadata": {}, + "source": [ + "### Define the Redis Connection URL\n", + "\n", + "By default this notebook connects to the local instance of Redis Stack. **If you have your own Redis Enterprise instance** - replace REDIS_PASSWORD, REDIS_HOST and REDIS_PORT values with your own." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "aefda1d1", + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import warnings\n", + "\n", + "warnings.filterwarnings(\"ignore\")\n", + "\n", + "# Replace values below with your own if using Redis Cloud instance\n", + "REDIS_HOST = os.getenv(\"REDIS_HOST\", \"localhost\") # ex: \"redis-18374.c253.us-central1-1.gce.cloud.redislabs.com\"\n", + "REDIS_PORT = os.getenv(\"REDIS_PORT\", \"6379\") # ex: 18374\n", + "REDIS_PASSWORD = os.getenv(\"REDIS_PASSWORD\", \"\") # ex: \"1TNxTEdYRDgIDKM2gDfasupCADXXXX\"\n", + "\n", + "# If SSL is enabled on the endpoint, use rediss:// as the URL prefix\n", + "REDIS_URL = f\"redis://:{REDIS_PASSWORD}@{REDIS_HOST}:{REDIS_PORT}\"" + ] + }, + { + "cell_type": "markdown", + "id": "fb9ad58b", + "metadata": {}, + "source": [ + "# Allow/block list with router\n", + "\n", + "When ChatGPT first launched, there was a famous example where a car dealership accidentally made one of the latest language models available for free to everyone. They assumed users would only ask questions about cars through their chatbot. However, a group of developers quickly realized that the model was powerful enough to answer coding questions, so they started using the dealership's chatbot for free.
\n", + "\n", + "To prevent this kind of misuse in your system, adding an allow/block router to the front of your application is essential. Fortunately, this is very easy to implement using `redisvl`.
\n", + "\n", + "The code below initializes a vectorizer that will create the vectors that will be stored and initialize the `SemanticRouter` class from `redisvl` that will do the bulk of the configuration required for the router." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "c52d454a", + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.extensions.router import Route, SemanticRouter\n", + "from redisvl.utils.vectorize import HFTextVectorizer\n", + "\n", + "vectorizer = HFTextVectorizer()\n", + "\n", + "# Semantic router\n", + "blocked_references = [\n", + " \"things about aliens\",\n", + " \"corporate questions about agile\",\n", + " \"anything about the S&P 500\",\n", + "]\n", + "\n", + "blocked_route = Route(name=\"block_list\", references=blocked_references)\n", + "\n", + "block_router = SemanticRouter(\n", + " name=\"bouncer\",\n", + " vectorizer=vectorizer,\n", + " routes=[blocked_route],\n", + " redis_url=REDIS_URL,\n", + " overwrite=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "3b0a26f0", + "metadata": {}, + "source": [ + "# Test it out" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "b986bf8d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "RouteMatch(name='block_list', distance=0.375403106213)" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "user_query = \"Why is agile so important?\"\n", + "\n", + "route_match = block_router(user_query)\n", + "\n", + "route_match" + ] + }, + { + "cell_type": "markdown", + "id": "e0086844", + "metadata": {}, + "source": [ + "You can see that we matched to the block list with the example. In an application flow, this value could be checked to short-circuit etc.\n", + "\n", + "This could also be implemented in the positive where examples are provided for topics you'd like to allow." + ] + }, + { + "cell_type": "markdown", + "id": "10f4cb85", + "metadata": {}, + "source": [ + "# Routing with multiple routes\n", + "\n", + "## Define the Routes\n", + "\n", + "Below we define 3 different routes. One for `technology`, one for `sports`, and\n", + "another for `entertainment`. Now for this example, the goal here is\n", + "surely topic \"classification\". But you can create routes and references for\n", + "almost anything.\n", + "\n", + "Each route has a set of references that cover the \"semantic surface area\" of the\n", + "route. The incoming query from a user needs to be semantically similar to one or\n", + "more of the references in order to \"match\" on the route." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "60ad280c", + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.extensions.router import Route\n", + "\n", + "# Define routes for the semantic router\n", + "technology = Route(\n", + " name=\"technology\",\n", + " references=[\n", + " \"what are the latest advancements in AI?\",\n", + " \"tell me about the newest gadgets\",\n", + " \"what's trending in tech?\"\n", + " ],\n", + " metadata={\"category\": \"tech\", \"priority\": 1}\n", + ")\n", + "\n", + "sports = Route(\n", + " name=\"sports\",\n", + " references=[\n", + " \"who won the game last night?\",\n", + " \"tell me about the upcoming sports events\",\n", + " \"what's the latest in the world of sports?\",\n", + " \"sports\",\n", + " \"basketball and football\"\n", + " ],\n", + " metadata={\"category\": \"sports\", \"priority\": 2}\n", + ")\n", + "\n", + "entertainment = Route(\n", + " name=\"entertainment\",\n", + " references=[\n", + " \"what are the top movies right now?\",\n", + " \"who won the best actor award?\",\n", + " \"what's new in the entertainment industry?\"\n", + " ],\n", + " metadata={\"category\": \"entertainment\", \"priority\": 3}\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "9cdbcbff", + "metadata": {}, + "source": [ + "## Initialize the SemanticRouter\n", + "\n", + "Like before the ``SemanticRouter`` class will automatically create an index within Redis upon initialization for the route references." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "e80aaf84", + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "\n", + "os.environ[\"TOKENIZERS_PARALLELISM\"] = \"false\"\n", + "\n", + "# Initialize the SemanticRouter\n", + "multi_topic_router = SemanticRouter(\n", + " name=\"topic-router\",\n", + " vectorizer=HFTextVectorizer(),\n", + " routes=[technology, sports, entertainment],\n", + " redis_url=\"redis://localhost:6379\",\n", + " overwrite=True # Blow away any other routing index with this name\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "8b199505", + "metadata": {}, + "source": [ + "## View the created index" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "3caedb77", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "Index Information:\n", + "╭──────────────┬────────────────┬──────────────────┬─────────────────┬────────────╮\n", + "│ Index Name │ Storage Type │ Prefixes │ Index Options │ Indexing │\n", + "├──────────────┼────────────────┼──────────────────┼─────────────────┼────────────┤\n", + "│ topic-router │ HASH │ ['topic-router'] │ [] │ 0 │\n", + "╰──────────────┴────────────────┴──────────────────┴─────────────────┴────────────╯\n", + "Index Fields:\n", + "╭────────────┬─────────────┬────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮\n", + "│ Name │ Attribute │ Type │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │ Field Option │ Option Value │\n", + "├────────────┼─────────────┼────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼────────────────┼─────────────────┼────────────────┤\n", + "│ route_name │ route_name │ TAG │ SEPARATOR │ , │ │ │ │ │ │ │\n", + "│ reference │ reference │ TEXT │ WEIGHT │ 1 │ │ │ │ │ │ │\n", + "│ vector │ vector │ VECTOR │ algorithm │ FLAT │ data_type │ FLOAT32 │ dim │ 768 │ distance_metric │ COSINE │\n", + "╰────────────┴─────────────┴────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴────────────────┴─────────────────┴────────────────╯\n" + ] + } + ], + "source": [ + "# look at the index specification created for the semantic router\n", + "!rvl index info -i topic-router" + ] + }, + { + "cell_type": "markdown", + "id": "8eb95dde", + "metadata": {}, + "source": [ + "## Simple routing" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "5b0e3208", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "RouteMatch(name='technology', distance=0.119614601135)" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Query the router with a statement\n", + "route_match = multi_topic_router(\"Can you tell me about the latest in artificial intelligence?\")\n", + "route_match" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "ef90a1b8", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "RouteMatch(name=None, distance=None)" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Query the router with a statement and return a miss\n", + "route_match = multi_topic_router(\"are aliens real?\")\n", + "route_match" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "a937b471", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "RouteMatch(name='sports', distance=0.554210424423)" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Toggle the runtime distance threshold\n", + "route_match = multi_topic_router(\"Which basketball team will win the NBA finals?\", distance_threshold=0.7)\n", + "route_match" + ] + }, + { + "cell_type": "markdown", + "id": "c3f8600a", + "metadata": {}, + "source": [ + "We can also route a statement to many routes and order them by distance:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "70335f93", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[RouteMatch(name='sports', distance=0.758580780029),\n", + " RouteMatch(name='entertainment', distance=0.812423845132),\n", + " RouteMatch(name='technology', distance=0.884235262871)]" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Perform multi-class classification with route_many() -- toggle the max_k and the distance_threshold\n", + "route_matches = multi_topic_router.route_many(\"Lebron James\", distance_threshold=1.0, max_k=3)\n", + "route_matches" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "874b80fc", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[RouteMatch(name='sports', distance=0.663254141808),\n", + " RouteMatch(name='entertainment', distance=0.712985336781),\n", + " RouteMatch(name='technology', distance=0.832674443722)]" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Toggle the aggregation method -- note the different distances in the result\n", + "from redisvl.extensions.router.schema import DistanceAggregationMethod\n", + "\n", + "route_matches = multi_topic_router.route_many(\"Lebron James\", aggregation_method=DistanceAggregationMethod.min, distance_threshold=1.0, max_k=3)\n", + "route_matches" + ] + }, + { + "cell_type": "markdown", + "id": "37c834d2", + "metadata": {}, + "source": [ + "Note the different route match distances. This is because we used the `min` aggregation method instead of the default `avg` approach.\n", + "\n", + "## Update the routing config\n" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "86919de5", + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.extensions.router import RoutingConfig\n", + "\n", + "multi_topic_router.update_routing_config(\n", + " RoutingConfig(distance_threshold=1.0, aggregation_method=DistanceAggregationMethod.min, max_k=3)\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "cb883785", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[RouteMatch(name='sports', distance=0.663254141808),\n", + " RouteMatch(name='entertainment', distance=0.712985336781),\n", + " RouteMatch(name='technology', distance=0.832674443722)]" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "route_matches = multi_topic_router.route_many(\"Lebron James\")\n", + "route_matches" + ] + }, + { + "cell_type": "markdown", + "id": "ff885cfe", + "metadata": {}, + "source": [ + "## Router serialization" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "f5ea2e61", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'name': 'topic-router',\n", + " 'routes': [{'name': 'technology',\n", + " 'references': ['what are the latest advancements in AI?',\n", + " 'tell me about the newest gadgets',\n", + " \"what's trending in tech?\"],\n", + " 'metadata': {'category': 'tech', 'priority': '1'}},\n", + " {'name': 'sports',\n", + " 'references': ['who won the game last night?',\n", + " 'tell me about the upcoming sports events',\n", + " \"what's the latest in the world of sports?\",\n", + " 'sports',\n", + " 'basketball and football'],\n", + " 'metadata': {'category': 'sports', 'priority': '2'}},\n", + " {'name': 'entertainment',\n", + " 'references': ['what are the top movies right now?',\n", + " 'who won the best actor award?',\n", + " \"what's new in the entertainment industry?\"],\n", + " 'metadata': {'category': 'entertainment', 'priority': '3'}}],\n", + " 'vectorizer': {'type': 'hf',\n", + " 'model': 'sentence-transformers/all-mpnet-base-v2'},\n", + " 'routing_config': {'distance_threshold': 1.0,\n", + " 'max_k': 3,\n", + " 'aggregation_method': 'min'}}" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "multi_topic_router.to_dict()" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "36ae6f50", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "10:50:18 redisvl.index.index INFO Index already exists, not overwriting.\n" + ] + } + ], + "source": [ + "router2 = SemanticRouter.from_dict(multi_topic_router.to_dict(), redis_url=\"redis://localhost:6379\")\n", + "\n", + "assert router2 == multi_topic_router" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "f601b065", + "metadata": {}, + "outputs": [], + "source": [ + "multi_topic_router.to_yaml(\"router.yaml\", overwrite=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "63e4a847", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "10:50:43 redisvl.index.index INFO Index already exists, not overwriting.\n" + ] + } + ], + "source": [ + "router3 = SemanticRouter.from_yaml(\"router.yaml\", redis_url=\"redis://localhost:6379\")\n", + "\n", + "assert router3 == router2 == multi_topic_router" + ] + }, + { + "cell_type": "markdown", + "id": "06b68393", + "metadata": {}, + "source": [ + "## Clean up the router" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dd68034d", + "metadata": {}, + "outputs": [], + "source": [ + "# Use clear to flush all routes from the index\n", + "multi_topic_router.clear()\n", + "block_router.clear()" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "b74bc6bb", + "metadata": {}, + "outputs": [], + "source": [ + "# Use delete to clear the index and remove it completely\n", + "multi_topic_router.delete()\n", + "block_router.delete()" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "b44ef673", + "metadata": {}, + "outputs": [], + "source": [ + "# remove file\n", + "!rm -rf router.yaml" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}