diff --git a/docs/presentations/materials/2024-02-28-rag-bootcamp-vector-institute.ipynb b/docs/presentations/materials/2024-02-28-rag-bootcamp-vector-institute.ipynb new file mode 100644 index 0000000000000..4524113a39298 --- /dev/null +++ b/docs/presentations/materials/2024-02-28-rag-bootcamp-vector-institute.ipynb @@ -0,0 +1,425 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "5ab60f84-39b3-4bdd-ae83-6527acb315e5", + "metadata": {}, + "source": [ + "# RAG Bootcamp ◦ February 2024 ◦ Vector Institute " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c6b62610", + "metadata": {}, + "outputs": [], + "source": [ + "##################################################################\n", + "# Venue: RAG Bootcamp - Vector Institute Canada\n", + "# Talk: RAG Bootcamp: Intro to RAG with the LlamaIndex Framework\n", + "# Speaker: Andrei Fajardo\n", + "##################################################################" + ] + }, + { + "cell_type": "markdown", + "id": "3bee89b8-a04d-4326-9392-b9e7e1bcb8af", + "metadata": {}, + "source": [ + "![Title Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/title.excalidraw.svg)" + ] + }, + { + "cell_type": "markdown", + "id": "e4d38b38-ea48-4012-81ae-84e1d1f40a69", + "metadata": {}, + "source": [ + "![Framework Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/framework.excalidraw.svg)" + ] + }, + { + "cell_type": "markdown", + "id": "34d1f8e7-f978-4f19-bdfb-37c0d235b5bf", + "metadata": {}, + "source": [ + "#### Notebook Setup & Dependency Installation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f227a52a-a147-4e8f-b7d3-e03f983fd5f1", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install llama-index llama-index-vector-stores-qdrant -q" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7bc383fc-19b2-47b5-af61-e83210ea9c37", + "metadata": {}, + "outputs": [], + "source": [ + "import nest_asyncio\n", + "\n", + "nest_asyncio.apply()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4e4c84ad", + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir data\n", + "!wget \"https://arxiv.org/pdf/2402.09353.pdf\" -O \"./data/dorav1.pdf\"" + ] + }, + { + "cell_type": "markdown", + "id": "275c00f1-e358-498a-88c3-8e810a5a2546", + "metadata": {}, + "source": [ + "## Motivation\n", + "\n", + "![Motivation Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/motivation.excalidraw.svg)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "25d4ce76-8eea-44cb-aa99-94844dfed9c7", + "metadata": {}, + "outputs": [], + "source": [ + "# query an LLM and ask it about DoRA\n", + "from llama_index.llms.openai import OpenAI\n", + "\n", + "llm = OpenAI(model=\"gpt-4\")\n", + "response = llm.complete(\"What is DoRA?\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c3f18489-4f25-40ce-86e9-697ddea7d6c6", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Without specific context, it's hard to determine what DoRA refers to as it could mean different things in different fields. However, in general, DoRA could refer to:\n", + "\n", + "1. Division of Research and Analysis: In some organizations, this is a department responsible for conducting research and analyzing data.\n", + "\n", + "2. Department of Regulatory Agencies: In some U.S.
states, this is a government agency responsible for consumer protection and regulation of businesses.\n", + "\n", + "3. Declaration of Research Assessment: In academia, this could refer to a statement or policy regarding how research is evaluated.\n", + "\n", + "4. Digital on-Ramp's Assessment: In the field of digital technology, this could refer to an assessment tool used by the Digital On-Ramps program.\n", + "\n", + "Please provide more context for a more accurate definition.\n" + ] + } + ], + "source": [ + "print(response.text)" + ] + }, + { + "cell_type": "markdown", + "id": "04a0ef8d-d55c-4b64-887b-18d343503a76", + "metadata": {}, + "source": [ + "## Basic RAG in 3 Steps\n", + "\n", + "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/subheading.excalidraw.svg)\n", + "\n", + "\n", + "1. Build external knowledge (i.e., updated data sources)\n", + "2. Retrieve\n", + "3. Augment and Generate" + ] + }, + { + "cell_type": "markdown", + "id": "598a5257-20ae-468e-85d6-d4e8c46b8cb5", + "metadata": {}, + "source": [ + "## 1. Build External Knowledge\n", + "\n", + "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/step1.excalidraw.svg)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a2963f90-9da5-4a0d-8dbe-f16fcb8627a3", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"Load the data.\n", + "\n", + "With llama-index, before any transformations are applied,\n", + "data is loaded into the `Document` abstraction, which is\n", + "a container that holds the text of the document.\n", + "\"\"\"\n", + "\n", + "from llama_index.core import SimpleDirectoryReader\n", + "\n", + "loader = SimpleDirectoryReader(input_dir=\"./data\")\n", + "documents = loader.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "da321e2c-8428-4c04-abf2-b204416e816f", + "metadata": {}, + "outputs": [], + "source": [ + "# if you want to see what the text looks like\n", + "# documents[0].text" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4801e74a-8c52-45c4-967d-7a1a94f54ad3", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"Chunk, Encode, and Store into a Vector Store.\n", + "\n", + "To streamline the process, we can make use of the IngestionPipeline\n", + "class that will apply your specified transformations to the\n", + "`Document` objects.\n", + "\"\"\"\n", + "\n", + "from llama_index.core.ingestion import IngestionPipeline\n", + "from llama_index.core.node_parser import SentenceSplitter\n", + "from llama_index.embeddings.openai import OpenAIEmbedding\n", + "from llama_index.vector_stores.qdrant import QdrantVectorStore\n", + "import qdrant_client\n", + "\n", + "client = qdrant_client.QdrantClient(location=\":memory:\")\n", + "vector_store = QdrantVectorStore(client=client, collection_name=\"test_store\")\n", + "\n", + "pipeline = IngestionPipeline(\n", + "    transformations=[\n", + "        SentenceSplitter(),\n", + "        OpenAIEmbedding(),\n", + "    ],\n", + "    vector_store=vector_store,\n", + ")\n", + "_nodes = pipeline.run(documents=documents, num_workers=4)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "02afea25-098b-49c7-a965-21c7576757af", + "metadata": {}, + "outputs": [], + "source": [ + "# if you want to see the nodes\n", + "# len(_nodes)\n", + "# _nodes[0].text" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "44cd8a86-089d-4329-9484-35b98b3a26f9", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"Create a llama-index... wait for it...
Index.\n", + "\n", + "After uploading your encoded documents into your vector\n", + "store of choice, you can connect to it with a VectorStoreIndex,\n", + "which then gives you access to all of the llama-index functionality.\n", + "\"\"\"\n", + "\n", + "from llama_index.core import VectorStoreIndex\n", + "\n", + "index = VectorStoreIndex.from_vector_store(vector_store=vector_store)" + ] + }, + { + "cell_type": "markdown", + "id": "286b1827-7547-49c6-aba3-82f08d6d86b8", + "metadata": {}, + "source": [ + "## 2. Retrieve Against A Query\n", + "\n", + "![Step2 Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/step2.excalidraw.svg)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "49f86af1-db08-4641-89ad-d60abd04e6b3", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"Retrieve relevant documents against a query.\n", + "\n", + "With our Index ready, we can now query it to\n", + "retrieve the most relevant document chunks.\n", + "\"\"\"\n", + "\n", + "retriever = index.as_retriever(similarity_top_k=2)\n", + "retrieved_nodes = retriever.retrieve(\"What is DoRA?\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "05f9ce3b-a4e3-4862-b58c-2d9fba1f9abc", + "metadata": {}, + "outputs": [], + "source": [ + "# to view the retrieved node\n", + "# print(retrieved_nodes[0].text)" + ] + }, + { + "cell_type": "markdown", + "id": "978ae2c5-8c2a-41c7-a2eb-85a5562f2db5", + "metadata": {}, + "source": [ + "## 3. Generate Final Response\n", + "\n", + "![Step3 Image](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/step3.excalidraw.svg)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ef33c349-eed4-4e35-9b5d-9473adf2ce01", + "metadata": {}, + "outputs": [], + "source": [ + "\"\"\"Context-Augmented Generation.\n", + "\n", + "With our Index ready, we can create a QueryEngine\n", + "that handles the retrieval and context augmentation\n", + "in order to get the final response.\n", + "\"\"\"\n", + "\n", + "query_engine = index.as_query_engine()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4139c48a-ece8-4244-b4eb-7cff74cb1325", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Context information is below.\n", + "---------------------\n", + "{context_str}\n", + "---------------------\n", + "Given the context information and not prior knowledge, answer the query.\n", + "Query: {query_str}\n", + "Answer: \n" + ] + } + ], + "source": [ + "# to inspect the default prompt being used\n", + "print(\n", + "    query_engine.get_prompts()[\n", + "        \"response_synthesizer:text_qa_template\"\n", + "    ].default_template.template\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6179639d-af96-4a09-b440-b47ad599a26f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "DoRA is a method that introduces incremental directional updates in a model by replacing them with alternative LoRA variants. It is compatible with other LoRA variants such as VeRA, which suggests freezing a unique pair of random low-rank matrices shared across all layers and employing minimal layer-specific trainable scaling vectors to capture each layer's incremental updates.
DoRA effectively reduces the number of trainable parameters significantly while maintaining accuracy, showcasing improvements over other variants like VeRA and LoRA.\n" + ] + } + ], + "source": [ + "response = query_engine.query(\"What is DoRA?\")\n", + "print(response)" + ] + }, + { + "cell_type": "markdown", + "id": "dae63946-be38-4807-af2a-8113661a806b", + "metadata": {}, + "source": [ + "## In Summary\n", + "\n", + "- LLMs, as powerful as they are, don't perform well on knowledge-intensive tasks (domain-specific, recently updated, or long-tail knowledge)\n", + "- Context augmentation has been shown (in a few studies) to outperform LLMs without augmentation\n", + "- In this notebook, we walked through one such example of that pattern." + ] + }, + { + "cell_type": "markdown", + "id": "fc857227-3fed-4bb6-a062-99ea3c55e294", + "metadata": {}, + "source": [ + "# LlamaIndex Has More To Offer\n", + "\n", + "- Data infrastructure that enables production-grade, advanced RAG systems\n", + "- Agentic solutions\n", + "- Newly released: `llama-index-networks`\n", + "- Enterprise offerings (alpha):\n", + "  - LlamaParse (proprietary complex PDF parser) and\n", + "  - LlamaCloud" + ] + }, + { + "cell_type": "markdown", + "id": "17c1c027-be8b-48f4-87ee-06f3e2c71797", + "metadata": {}, + "source": [ + "### Useful links\n", + "\n", + "[website](https://www.llamaindex.ai/) ◦ [llamahub](https://llamahub.ai) ◦ [github](https://github.com/run-llama/llama_index) ◦ [medium](https://medium.com/@llama_index) ◦ [rag-bootcamp-poster](https://d3ddy8balm3goa.cloudfront.net/rag-bootcamp-vector/final_poster.excalidraw.svg)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "rag-bootcamp", + "language": "python", + "name": "rag-bootcamp" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/presentations/past_presentations.md b/docs/presentations/past_presentations.md new file mode 100644 index 0000000000000..775deb9ea37a2 --- /dev/null +++ b/docs/presentations/past_presentations.md @@ -0,0 +1,8 @@ +# List of Past Presentations + +```{toctree} +--- +maxdepth: 1 +--- +/presentations/materials/2024-02-28-rag-bootcamp-vector-institute.ipynb +```