From 4fc704f467c5273207aaa8955d15db8ab02e5bf9 Mon Sep 17 00:00:00 2001 From: Andrei Fajardo Date: Wed, 28 Feb 2024 10:47:18 -0500 Subject: [PATCH 1/4] nb for vector talk --- ...-02-28-rag-bootcamp-vector-institute.ipynb | 414 ++++++++++++++++++ 1 file changed, 414 insertions(+) create mode 100644 docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb diff --git a/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb b/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb new file mode 100644 index 0000000000000..43383871a2f5a --- /dev/null +++ b/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb @@ -0,0 +1,414 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "5ab60f84-39b3-4bdd-ae83-6527acb315e5", + "metadata": {}, + "source": [ + "# RAG Bootcamp ◦ February 2024 ◦ Vector Institute " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c6b62610", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"\n", + "Venue: RAG Bootcamp - Vector Institute Canada\n", + "Talk: RAG Bootcamp: Intro to RAG with the LlamaIndexFramework\n", + "Speaker: Andrei Fajardo\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "id": "3bee89b8-a04d-4326-9392-b9e7e1bcb8af", + "metadata": {}, + "source": [ + "![Title Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/title.excalidraw.svg)" + ] + }, + { + "cell_type": "markdown", + "id": "e4d38b38-ea48-4012-81ae-84e1d1f40a69", + "metadata": {}, + "source": [ + "![Title Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/framework.excalidraw.svg)" + ] + }, + { + "cell_type": "markdown", + "id": "34d1f8e7-f978-4f19-bdfb-37c0d235b5bf", + "metadata": {}, + "source": [ + "#### Notebook Setup & Dependency Installation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f227a52a-a147-4e8f-b7d3-e03f983fd5f1", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install llama-index 
llama-index-vector-stores-qdrant -q" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7bc383fc-19b2-47b5-af61-e83210ea9c37", + "metadata": {}, + "outputs": [], + "source": [ + "import nest_asyncio\n", + "\n", + "nest_asyncio.apply()" + ] + }, + { + "cell_type": "markdown", + "id": "275c00f1-e358-498a-88c3-8e810a5a2546", + "metadata": {}, + "source": [ + "## Motivation\n", + "\n", + "![Motivation Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/motivation.excalidraw.svg)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "25d4ce76-8eea-44cb-aa99-94844dfed9c7", + "metadata": {}, + "outputs": [], + "source": [ + "# query an LLM and ask it about DoRA\n", + "from llama_index.llms.openai import OpenAI\n", + "\n", + "llm = OpenAI(model=\"gpt-4\")\n", + "response = llm.complete(\"What is DoRA?\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c3f18489-4f25-40ce-86e9-697ddea7d6c6", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Without specific context, it's hard to determine what DoRA refers to as it could mean different things in different fields. However, in general, DoRA could refer to:\n", + "\n", + "1. Division of Research and Analysis: In some organizations, this is a department responsible for conducting research and analyzing data.\n", + "\n", + "2. Department of Regulatory Agencies: In some U.S. states, this is a government agency responsible for consumer protection and regulation of businesses.\n", + "\n", + "3. Declaration of Research Assessment: In academia, this could refer to a statement or policy regarding how research is evaluated.\n", + "\n", + "4. 
Digital on-Ramp's Assessment: In the field of digital technology, this could refer to an assessment tool used by the Digital On-Ramps program.\n", + "\n", + "Please provide more context for a more accurate definition.\n" + ] + } + ], + "source": [ + "print(response.text)" + ] + }, + { + "cell_type": "markdown", + "id": "04a0ef8d-d55c-4b64-887b-18d343503a76", + "metadata": {}, + "source": [ + "## Basic RAG in 3 Steps\n", + "\n", + "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/subheading.excalidraw.svg)\n", + "\n", + "\n", + "1. Build external knowledge (i.e., updated data sources)\n", + "2. Retrieve\n", + "3. Augment and Generate" + ] + }, + { + "cell_type": "markdown", + "id": "598a5257-20ae-468e-85d6-d4e8c46b8cb5", + "metadata": {}, + "source": [ + "## 1. Build External Knowledge\n", + "\n", + "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/step1.excalidraw.svg)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a2963f90-9da5-4a0d-8dbe-f16fcb8627a3", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"Load the data.\n", + "\n", + "With llama-index, before any transformations are applied,\n", + "data is loaded in the `Document` abstraction, which is\n", + "a container that holds the text of the document.\n", + "\"\"\"\n", + "\n", + "from llama_index.core import SimpleDirectoryReader\n", + "\n", + "loader = SimpleDirectoryReader(input_dir=\"./data\")\n", + "documents = loader.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "da321e2c-8428-4c04-abf2-b204416e816f", + "metadata": {}, + "outputs": [], + "source": [ + "# if you want to see what the text looks like\n", + "# documents[0].text" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4801e74a-8c52-45c4-967d-7a1a94f54ad3", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"Chunk, Encode, and Store into a Vector Store.\n", + "\n", + "To streamline the process, we can 
make use of the IngestionPipeline\n",
+    "class that will apply your specified transformations to the\n",
+    "`Document` objects.\n",
+    "\"\"\"\n",
+    "\n",
+    "from llama_index.core.ingestion import IngestionPipeline\n",
+    "from llama_index.core.node_parser import SentenceSplitter\n",
+    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
+    "from llama_index.vector_stores.qdrant import QdrantVectorStore\n",
+    "import qdrant_client\n",
+    "\n",
+    "client = qdrant_client.QdrantClient(location=\":memory:\")\n",
+    "vector_store = QdrantVectorStore(client=client, collection_name=\"test_store\")\n",
+    "\n",
+    "pipeline = IngestionPipeline(\n",
+    "    transformations=[\n",
+    "        SentenceSplitter(),\n",
+    "        OpenAIEmbedding(),\n",
+    "    ],\n",
+    "    vector_store=vector_store,\n",
+    ")\n",
+    "_nodes = pipeline.run(documents=documents, num_workers=4)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "02afea25-098b-49c7-a965-21c7576757af",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# if you want to see the nodes\n",
+    "# len(_nodes)\n",
+    "# _nodes[0].text"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "44cd8a86-089d-4329-9484-35b98b3a26f9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "\"\"\"Create a llama-index... wait for it... Index.\n",
+    "\n",
+    "After uploading your encoded documents into your vector\n",
+    "store of choice, you can connect to it with a VectorStoreIndex,\n",
+    "which then gives you access to all of the llama-index functionality.\n",
+    "\"\"\"\n",
+    "\n",
+    "from llama_index.core import VectorStoreIndex\n",
+    "\n",
+    "index = VectorStoreIndex.from_vector_store(vector_store=vector_store)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "286b1827-7547-49c6-aba3-82f08d6d86b8",
+   "metadata": {},
+   "source": [
+    "## 2. Retrieve Against A Query\n",
+    "\n",
+    "![Step2 Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/step2.excalidraw.svg)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "49f86af1-db08-4641-89ad-d60abd04e6b3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "\"\"\"Retrieve relevant documents against a query.\n",
+    "\n",
+    "With our Index ready, we can now query it to\n",
+    "retrieve the most relevant document chunks.\n",
+    "\"\"\"\n",
+    "\n",
+    "retriever = index.as_retriever(similarity_top_k=2)\n",
+    "retrieved_nodes = retriever.retrieve(\"What is DoRA?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "05f9ce3b-a4e3-4862-b58c-2d9fba1f9abc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# to view the retrieved node\n",
+    "# print(retrieved_nodes[0].text)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "978ae2c5-8c2a-41c7-a2eb-85a5562f2db5",
+   "metadata": {},
+   "source": [
+    "## 3. Generate Final Response\n",
+    "\n",
+    "![Step3 Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/step3.excalidraw.svg)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ef33c349-eed4-4e35-9b5d-9473adf2ce01",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "\"\"\"Context-Augmented Generation.\n",
+    "\n",
+    "With our Index ready, we can create a QueryEngine\n",
+    "that handles the retrieval and context augmentation\n",
+    "needed to produce the final response.\n",
+    "\"\"\"\n",
+    "\n",
+    "query_engine = index.as_query_engine()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4139c48a-ece8-4244-b4eb-7cff74cb1325",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Context information is below.\n",
+      "---------------------\n",
+      "{context_str}\n",
+      "---------------------\n",
+      "Given the context information and not prior knowledge, answer the query.\n",
+      "Query: {query_str}\n",
+      "Answer: \n"
+     ]
+    }
+   ],
+   "source": [
+    "# to inspect the default prompt being used\n",
+    "print(\n",
+    "    query_engine.get_prompts()[\n",
+    "        \"response_synthesizer:text_qa_template\"\n",
+    "    ].default_template.template\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6179639d-af96-4a09-b440-b47ad599a26f",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "DoRA is a method that introduces incremental directional updates in a model by replacing them with alternative LoRA variants. It is compatible with other LoRA variants such as VeRA, which suggests freezing a unique pair of random low-rank matrices shared across all layers and employing minimal layer-specific trainable scaling vectors to capture each layer's incremental updates. DoRA effectively reduces the number of trainable parameters significantly while maintaining accuracy, showcasing improvements over other variants like VeRA and LoRA.\n"
+     ]
+    }
+   ],
+   "source": [
+    "response = query_engine.query(\"What is DoRA?\")\n",
+    "print(response)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dae63946-be38-4807-af2a-8113661a806b",
+   "metadata": {},
+   "source": [
+    "## In Summary\n",
+    "\n",
+    "- LLMs, as powerful as they are, don't perform well on knowledge-intensive tasks (domain-specific data, freshly updated data, long-tail knowledge)\n",
+    "- Context augmentation has been shown (in a few studies) to outperform LLMs without augmentation\n",
+    "- In this notebook, we showed one such example that follows that pattern."
+ ] + }, + { + "cell_type": "markdown", + "id": "fc857227-3fed-4bb6-a062-99ea3c55e294", + "metadata": {}, + "source": [ + "# LlamaIndex Has More To Offer\n", + "\n", + "- Data infrastructure that enables production-grade, advanced RAG systems\n", + "- Agentic solutions\n", + "- Newly released: `llama-index-networks`\n", + "- Enterprise offerings (alpha):\n", + " - LlamaParse (proprietary complex PDF parser) and\n", + " - LlamaCloud" + ] + }, + { + "cell_type": "markdown", + "id": "17c1c027-be8b-48f4-87ee-06f3e2c71797", + "metadata": {}, + "source": [ + "### Useful links\n", + "\n", + "[website](https://www.llamaindex.ai/) ◦ [llamahub](https://llamahub.ai) ◦ [github](https://github.com/run-llama/llama_index) ◦ [medium](https://medium.com/@llama_index) ◦ [rag-bootcamp-poster](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/final_poster.excalidraw.svg)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "rag-bootcamp", + "language": "python", + "name": "rag-bootcamp" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From bac11872b8a1750b32d93997bab35bbadb5ac5df Mon Sep 17 00:00:00 2001 From: Andrei Fajardo Date: Wed, 28 Feb 2024 10:52:57 -0500 Subject: [PATCH 2/4] add download of paper --- ...-02-28-rag-bootcamp-vector-institute.ipynb | 21 ++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb b/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb index 43383871a2f5a..4524113a39298 100644 --- a/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb +++ b/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb @@ -15,11 +15,11 @@ "metadata": {}, "outputs": 
[],
    "source": [
-    "\"\"\"\n",
-    "Venue: RAG Bootcamp - Vector Institute Canada\n",
-    "Talk: RAG Bootcamp: Intro to RAG with the LlamaIndexFramework\n",
-    "Speaker: Andrei Fajardo\n",
-    "\"\"\""
+    "##################################################################\n",
+    "# Venue: RAG Bootcamp - Vector Institute Canada\n",
+    "# Talk: RAG Bootcamp: Intro to RAG with the LlamaIndex Framework\n",
+    "# Speaker: Andrei Fajardo\n",
+    "##################################################################"
    ]
   },
   {
@@ -68,6 +68,17 @@
     "nest_asyncio.apply()"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4e4c84ad",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!mkdir data\n",
+    "!wget \"https://arxiv.org/pdf/2402.09353.pdf\" -O \"./data/dorav1.pdf\""
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "275c00f1-e358-498a-88c3-8e810a5a2546",

From 2b8927c56ee723a12f94cd08baa86e6d3e58fba2 Mon Sep 17 00:00:00 2001
From: Andrei Fajardo
Date: Wed, 28 Feb 2024 11:26:55 -0500
Subject: [PATCH 3/4] mv to more appropriate folder

---
 .../materials}/2024-02-28-rag-bootcamp-vector-institute.ipynb | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename docs/{examples/presentations => presentations/materials}/2024-02-28-rag-bootcamp-vector-institute.ipynb (100%)

diff --git a/docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb b/docs/presentations/materials/2024-02-28-rag-bootcamp-vector-institute.ipynb
similarity index 100%
rename from docs/examples/presentations/2024-02-28-rag-bootcamp-vector-institute.ipynb
rename to docs/presentations/materials/2024-02-28-rag-bootcamp-vector-institute.ipynb

From 7538432239d5855f9ad9288376f9c8dee6e06351 Mon Sep 17 00:00:00 2001
From: Andrei Fajardo
Date: Wed, 28 Feb 2024 11:39:06 -0500
Subject: [PATCH 4/4] link in docs

---
 docs/presentations/past_presentations.md | 8 ++++++++
 1 file changed, 8 insertions(+)
 create mode 100644 docs/presentations/past_presentations.md

diff --git a/docs/presentations/past_presentations.md b/docs/presentations/past_presentations.md
new file mode 100644
index 0000000000000..775deb9ea37a2
--- /dev/null
+++ b/docs/presentations/past_presentations.md
@@ -0,0 +1,8 @@
+# List of Past Presentations
+
+```{toctree}
+---
+maxdepth: 1
+---
+/presentations/materials/2024-02-28-rag-bootcamp-vector-institute.ipynb
+```
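Reviewer note: the notebook added by these patches runs its three RAG steps (build external knowledge, retrieve, augment and generate) through llama-index, OpenAI, and Qdrant, so executing it requires an API key and the downloaded paper. For reference, the same pattern can be sketched without any of those dependencies. This is not part of the patch: the corpus is invented, and a bag-of-words `Counter` stands in for `OpenAIEmbedding`, with the prompt template copied from the notebook's printed `text_qa_template`.

```python
import math
import re
from collections import Counter

# 1. Build external knowledge: a toy corpus standing in for the chunked,
#    embedded documents the notebook stores in Qdrant.
CORPUS = [
    "DoRA decomposes pretrained weights into magnitude and direction for fine-tuning.",
    "LoRA injects trainable low-rank matrices into frozen transformer layers.",
    "Qdrant is a vector database used to store and search embeddings.",
]


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# 2. Retrieve: rank chunks by similarity to the query, keeping the top k
#    (mirroring similarity_top_k=2 in the notebook).
def retrieve(query: str, top_k: int = 2) -> list:
    q = embed(query)
    return sorted(CORPUS, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:top_k]


# 3. Augment and generate: splice the retrieved context into the same
#    template shape the notebook prints from the query engine; a real
#    system would pass this prompt to an LLM instead of printing it.
def build_prompt(query: str) -> str:
    context_str = "\n".join(retrieve(query))
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context_str}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, answer the query.\n"
        f"Query: {query}\n"
        "Answer: "
    )


prompt = build_prompt("What is DoRA?")
print(prompt)
```

In the notebook, `OpenAIEmbedding` plus `QdrantVectorStore` play the roles of `embed` and `CORPUS`, and `query_engine.query(...)` both assembles this prompt and sends it to the LLM.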