Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use NVIDIA endpoints in example notebooks and demonstrate using milvus as a vector database #141

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
270 changes: 79 additions & 191 deletions examples/langchain_multimodal_rag.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -10,34 +10,36 @@
},
{
"cell_type": "markdown",
"id": "12e78be0-9abe-4d1a-8aaa-61230b059792",
"id": "91ece9e3-155a-44f4-81e5-2f9492c62a2f",
"metadata": {},
"source": [
"This cookbook shows how to use LangChain to query the table and text extraction results of nv-ingest's pdf extraction tools"
"This notebook shows how to perform RAG on the table, chart, and text extraction results of nv-ingest's pdf extraction tools using LangChain"
]
},
{
"cell_type": "markdown",
"id": "cdfad056-baef-448e-b5b3-4544e6b06472",
"id": "81014734-f765-48fc-8fc2-4c19f5f28eae",
"metadata": {},
"source": [
"To start we'll need to make sure we have some dependencies installed"
"To start we'll need to make sure we have Langchain installed as well as pymilvus so that we can connect to the Milvus vector database (VDB) that NV-Ingest uses to store embeddings"
]
},
{
"cell_type": "raw",
"id": "2dcc96cf-f8e1-4baa-b1ca-b6521d23dec8",
"cell_type": "code",
"execution_count": null,
"id": "bacbe052-4429-4c0a-8b1e-309ac55ad8fb",
"metadata": {},
"outputs": [],
"source": [
"pip install langchain langchain_community langchain_chroma langchain-openai"
"pip install -qU langchain langchain_community langchain-nvidia-ai-endpoints langchain_milvus pymilvus"
]
},
{
"cell_type": "markdown",
"id": "2487f1eb-eea8-46f1-9ab0-0fd3bea77bd5",
"id": "d888ba26-04cf-4577-81a3-5bcd537fc2f6",
"metadata": {},
"source": [
"Then, we'll use nv-ingest to parse an example pdf that contains text, tables, charts, and images. We'll need to make sure to have the nv-ingest microservice up and running at localhost:7670 along with the supporting NIMs. To do this, follow the nv-ingest [quickstart guide](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#quickstart). Once the microservice is ready we can create a job with the nv-ingest python client"
"Then, we'll use NV-Ingest to parse an example pdf that contains text, tables, charts, and images, embed it with the included embedding microservice and store the results in the Milvus vector database. We'll need to make sure to have the NV-Ingest microservice up and running at localhost:7670 along with the supporting NIMs and microservices. To do this, follow the nv-ingest [quickstart guide](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#quickstart). This notebook requires all of the services to be [running](https://github.com/NVIDIA/nv-ingest/blob/main/docs/deployment.md#launch-nv-ingest-micro-services). Once everything is ready, we can create a job with the NV-Ingest python client"
]
},
{
Expand All @@ -50,7 +52,10 @@
"from nv_ingest_client.client import NvIngestClient\n",
"from nv_ingest_client.primitives import JobSpec\n",
"from nv_ingest_client.primitives.tasks import ExtractTask\n",
"from nv_ingest_client.primitives.tasks import SplitTask\n",
"from nv_ingest_client.primitives.tasks import EmbedTask\n",
"from nv_ingest_client.primitives.tasks import VdbUploadTask\n",
"\n",
"\n",
"from nv_ingest_client.util.file_processing.extract import extract_file_content\n",
"import logging, time\n",
"\n",
Expand All @@ -59,21 +64,31 @@
"file_name = \"../data/multimodal_test.pdf\"\n",
"file_content, file_type = extract_file_content(file_name)\n",
"\n",
"client = NvIngestClient(\n",
" message_client_hostname=\"localhost\",\n",
" message_client_port=7670\n",
")\n",
"\n",
"job_spec = JobSpec(\n",
" document_type=file_type,\n",
" payload=file_content,\n",
" source_id=file_name,\n",
" source_name=file_name,\n",
" extended_options={\"tracing_options\": {\"trace\": True, \"ts_send\": time.time_ns()}},\n",
" extended_options={\n",
" \"tracing_options\": {\n",
" \"trace\": True,\n",
" \"ts_send\": time.time_ns()\n",
" }\n",
" },\n",
")"
]
},
{
"cell_type": "markdown",
"id": "5cedc329-e9a1-428e-9fa1-2aee965c2533",
"id": "7fa987e6-8bb1-4ed1-97c2-540e136e2d19",
"metadata": {},
"source": [
"And then we can and submit a task to extract the text and tables from the example pdf"
"And then, we can add and submit tasks to extract the text, tables, and charts from the example pdf, generate embeddings from the results, and store them in the Milvus VDB"
]
},
{
Expand All @@ -88,206 +103,99 @@
" extract_text=True,\n",
" extract_images=False,\n",
" extract_tables=True,\n",
" extract_charts=True,\n",
")\n",
"\n",
"embed_task = EmbedTask(\n",
" text=True,\n",
" tables=True,\n",
")\n",
"\n",
"vdb_upload_task = VdbUploadTask()\n",
"\n",
"job_spec.add_task(extract_task)\n",
"client = NvIngestClient()\n",
"job_spec.add_task(embed_task)\n",
"job_spec.add_task(vdb_upload_task)\n",
"\n",
"job_id = client.add_job(job_spec)\n",
"\n",
"client.submit_job(job_id, \"morpheus_task_queue\")\n",
"\n",
"result = client.fetch_job_result(job_id, timeout=60)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bf097db1-9699-442e-88b7-7b67a8a2fecf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'document_type': 'text',\n",
" 'metadata': {'content': 'TestingDocument\\r\\nA sample document with headings and placeholder text\\r\\nIntroduction\\r\\nThis is a placeholder document that can be used for any purpose. It contains some \\r\\nheadings and some placeholder text to fill the space. The text is not important and contains \\r\\nno real value, but it is useful for testing. Below, we will have some simple tables and charts \\r\\nthat we can use to confirm Ingest is working as expected.\\r\\nTable 1\\r\\nThis table describes some animals, and some activities they might be doing in specific \\r\\nlocations.\\r\\nAnimal Activity Place\\r\\nGira@e Driving a car At the beach\\r\\nLion Putting on sunscreen At the park\\r\\nCat Jumping onto a laptop In a home o@ice\\r\\nDog Chasing a squirrel In the front yard\\r\\nChart 1\\r\\nThis chart shows some gadgets, and some very fictitious costs. Section One\\r\\nThis is the first section of the document. It has some more placeholder text to show how \\r\\nthe document looks like. The text is not meant to be meaningful or informative, but rather to \\r\\ndemonstrate the layout and formatting of the document.\\r\\n• This is the first bullet point\\r\\n• This is the second bullet point\\r\\n• This is the third bullet point\\r\\nSection Two\\r\\nThis is the second section of the document. It is more of the same as we’ve seen in the rest \\r\\nof the document. The content is meaningless, but the intent is to create a very simple \\r\\nsmoke test to ensure extraction is working as intended. This will be used in CI as time goes \\r\\non to ensure that changes we make to the library do not negatively impact our accuracy.\\r\\nTable 2\\r\\nThis table shows some popular colors that cars might come in.\\r\\nCar Color1 Color2 Color3\\r\\nCoupe White Silver Flat Gray\\r\\nSedan White Metallic Gray Matte Gray\\r\\nMinivan Gray Beige Black\\r\\nTruck Dark Gray Titanium Gray Charcoal\\r\\nConvertible Light Gray Graphite Slate Gray\\r\\nPicture\\r\\nBelow, is a high-quality picture of some shapes. Chart 2\\r\\nThis chart shows some average frequency ranges for speaker drivers.\\r\\nConclusion\\r\\nThis is the conclusion of the document. It has some more placeholder text, but the most \\r\\nimportant thing is that this is the conclusion. As we end this document, we should have \\r\\nbeen able to extract 2 tables, 2 charts, and some text including 3 bullet points.',\n",
" 'content_metadata': {'description': 'Unstructured text from PDF document.',\n",
" 'hierarchy': {'block': -1,\n",
" 'line': -1,\n",
" 'nearby_objects': {'images': {'bbox': [], 'content': []},\n",
" 'structured': {'bbox': [], 'content': []},\n",
" 'text': {'bbox': [], 'content': []}},\n",
" 'page': -1,\n",
" 'page_count': 3,\n",
" 'span': -1},\n",
" 'page_number': -1,\n",
" 'subtype': '',\n",
" 'type': 'text'},\n",
" 'debug_metadata': None,\n",
" 'embedding': None,\n",
" 'error_metadata': None,\n",
" 'image_metadata': None,\n",
" 'info_message_metadata': None,\n",
" 'raise_on_failure': False,\n",
" 'source_metadata': {'access_level': 1,\n",
" 'collection_id': '',\n",
" 'date_created': '2024-09-04T15:55:15.103673',\n",
" 'last_modified': '2024-09-04T15:55:15.103335',\n",
" 'partition_id': -1,\n",
" 'source_id': '../data/multimodal_test.pdf',\n",
" 'source_location': '',\n",
" 'source_name': '../data/multimodal_test.pdf',\n",
" 'source_type': 'PDF',\n",
" 'summary': ''},\n",
" 'table_metadata': None,\n",
" 'text_metadata': {'keywords': '',\n",
" 'language': 'en',\n",
" 'summary': '',\n",
" 'text_location': [-1, -1, -1, -1],\n",
" 'text_type': 'document'}}}"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[0][0][0]"
]
},
{
"cell_type": "markdown",
"id": "72d85eb5-2d66-4d6d-95fe-f69b2c6c5489",
"metadata": {},
"source": [
"Now, we have the extraction results in the nv-ingest metadata format which we'll grab the extracted content from and load into Langchain documents"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b0c6d966-9a38-4b1b-b5f2-cbe214d1718d",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.documents import Document\n",
"\n",
"texts = []\n",
"tables = []\n",
"for element in result[0][0]:\n",
" if element['document_type'] == 'text':\n",
" texts.append(Document(element['metadata']['content']))\n",
" elif element['document_type'] == 'structured':\n",
" tables.append(Document(element['metadata']['table_metadata']['table_content']))"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "7bb4b918-9029-4ffc-badd-dddc4022a750",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='TestingDocument\\r\\nA sample document with headings and placeholder text\\r\\nIntroduction\\r\\nThis is a placeholder document that can be used for any purpose. It contains some \\r\\nheadings and some placeholder text to fill the space. The text is not important and contains \\r\\nno real value, but it is useful for testing. Below, we will have some simple tables and charts \\r\\nthat we can use to confirm Ingest is working as expected.\\r\\nTable 1\\r\\nThis table describes some animals, and some activities they might be doing in specific \\r\\nlocations.\\r\\nAnimal Activity Place\\r\\nGira@e Driving a car At the beach\\r\\nLion Putting on sunscreen At the park\\r\\nCat Jumping onto a laptop In a home o@ice\\r\\nDog Chasing a squirrel In the front yard\\r\\nChart 1\\r\\nThis chart shows some gadgets, and some very fictitious costs. Section One\\r\\nThis is the first section of the document. It has some more placeholder text to show how \\r\\nthe document looks like. The text is not meant to be meaningful or informative, but rather to \\r\\ndemonstrate the layout and formatting of the document.\\r\\n• This is the first bullet point\\r\\n• This is the second bullet point\\r\\n• This is the third bullet point\\r\\nSection Two\\r\\nThis is the second section of the document. It is more of the same as we’ve seen in the rest \\r\\nof the document. The content is meaningless, but the intent is to create a very simple \\r\\nsmoke test to ensure extraction is working as intended. This will be used in CI as time goes \\r\\non to ensure that changes we make to the library do not negatively impact our accuracy.\\r\\nTable 2\\r\\nThis table shows some popular colors that cars might come in.\\r\\nCar Color1 Color2 Color3\\r\\nCoupe White Silver Flat Gray\\r\\nSedan White Metallic Gray Matte Gray\\r\\nMinivan Gray Beige Black\\r\\nTruck Dark Gray Titanium Gray Charcoal\\r\\nConvertible Light Gray Graphite Slate Gray\\r\\nPicture\\r\\nBelow, is a high-quality picture of some shapes. Chart 2\\r\\nThis chart shows some average frequency ranges for speaker drivers.\\r\\nConclusion\\r\\nThis is the conclusion of the document. It has some more placeholder text, but the most \\r\\nimportant thing is that this is the conclusion. As we end this document, we should have \\r\\nbeen able to extract 2 tables, 2 charts, and some text including 3 bullet points.')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"texts"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "307fdf5d-587b-448b-8b79-e4fa88fb5b0c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='locations. Animal Activity Place Giraffe Driving a car At the beach Lion Putting on sunscreen At the park Cat Jumping onto a laptop In a home office Dog Chasing a squirrel In the front yard'),\n",
" Document(page_content='This chart shows some gadgets, and some very fictitious costs. >\\\\n7938.758 ext. Print & Maroon Bookshelf Fine Art Poems Collection dla Cemicon Diamtháhn | Gadgets and their cost\\nSollywood for Coasters | 19875.075 t158.281 \\n Hammer | 19871.55 \\n Powerdrill | 12044.625 \\n Bluetooth speaker | 7598.07 \\n Minifridge | 9916.305 \\n Premium desk'),\n",
" Document(page_content='This table shows some popular colors that cars might come in. Car Color1 Color2 Color3 Coupe White Silver Flat Gray Sedan White Metallic Gray Matte Gray Minivan Gray Beige Black Truck Dark Gray Titanium Gray Charcoal Convertible Light Gray Graphite Slate Gray'),\n",
" Document(page_content='This chart shows some average frequency ranges for speaker drivers TITLE | Chart 2 \\n Frequency Range Start (Hz) | Frequency Range Start (Hz) | Frequency Range End (Hz) \\n Twitter | 12800 | 12700 \\n Midrange | 13900 | 13000 \\n Midwoofer | 9600 | 13000 \\n Subwoofer | 0.00 | 13000')]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tables"
]
},
{
"cell_type": "markdown",
"id": "15410af0-fa5c-4aed-b554-f2165c6f482e",
"id": "02131711-31bf-4536-81b7-8c464c7473e3",
"metadata": {},
"source": [
"Next, we'll set our OpenAI API key and create a vector store to embed and store our text and table documents using OpenAI's embedding model"
"Now, the text, table, and chart content is extracted and stored in the Milvus VDB along with the embeddings. Next we'll connect LlamaIndex to Milvus and create a vector store so that we can query our extraction results. The vector store must use the same embedding model as the embedding service in NV-Ingest: `nv-embed-qa-e5-v5`"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 13,
"id": "53957974-c688-4521-8c61-09f2649d5d53",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
"from langchain_milvus import Milvus\n",
"\n",
"# TODO: Add your OpenAI API key here\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR_OPENAI_API_KEY>\"\n",
"embedding = NVIDIAEmbeddings(base_url=\"http://localhost:8012/v1\", model=\"nvidia/nv-embedqa-e5-v5\")\n",
"\n",
"vectorstore = Chroma.from_documents(documents=(texts+tables), embedding=OpenAIEmbeddings())"
"vectorstore = Milvus(\n",
" embedding_function=embedding,\n",
" collection_name=\"nv_ingest_collection\",\n",
" primary_field = \"pk\",\n",
" vector_field = \"vector\",\n",
" text_field=\"text\",\n",
" connection_args={\"uri\": \"http://localhost:19530\"},\n",
")\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "12d14f63-31b8-4d72-a029-a54328ff1c5b",
"id": "b87111b5-e5a8-45a0-9663-2ae6d9ea2ab6",
"metadata": {},
"source": [
"Then, we'll create a retriever from our vector score that will allow us to retrieve our documents by semantic similarity and an llm to synthesize the final answer from the retrieved documents"
"Finally, we'll create an RAG chain using [llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct) that we can use to query our pdf in natural language"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a12f53f7-2407-4890-9586-82956cfccf13",
"execution_count": 3,
"id": "b4c9e109-395c-40e2-a1a5-e0c0ef217e24",
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI\n",
"import os \n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"retriever = vectorstore.as_retriever()\n",
"# TODO: Add your NVIDIA API key\n",
"os.environ[\"NVIDIA_API_KEY\"] = \"<YOUR_NVIDIA_API_KEY>\"\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-4o-mini\")"
]
},
{
"cell_type": "markdown",
"id": "b87111b5-e5a8-45a0-9663-2ae6d9ea2ab6",
"metadata": {},
"source": [
"Finally, we'll create an RAG chain that we can use to query our pdf in natural language"
"llm = ChatNVIDIA(model=\"meta/llama-3.1-405b-instruct\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d16331bb-95dd-46d7-8b71-9d966a8ef3ba",
"execution_count": 16,
"id": "b547a19a-9ada-4a40-a246-6d7bc4d24482",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"'The dog is chasing a squirrel in the front yard.'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
Expand All @@ -310,34 +218,14 @@
" | prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b547a19a-9ada-4a40-a246-6d7bc4d24482",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The dog is chasing a squirrel in the front yard.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
")\n",
"rag_chain.invoke(\"What is the dog doing and where?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dcbcfcb7-c6c6-4112-86c4-39e6975d325b",
"id": "b5b3f079-65a6-4d32-a190-1df96925c5c7",
"metadata": {},
"outputs": [],
"source": []
Expand All @@ -359,7 +247,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
"version": "3.10.15"
}
},
"nbformat": 4,
Expand Down
Loading
Loading