
# Basic RAG Using LlamaIndex

## Example Features

This example deploys a basic RAG pipeline for chat Q&A, serving inference from an NVIDIA API Catalog endpoint. You do not need a GPU on your machine to run this example.

| Model | Embedding | Framework | Vector Database | File Types |
|-------|-----------|-----------|-----------------|------------|
| meta/llama3-8b-instruct | nvidia/nv-embedqa-e5-v5 | LlamaIndex | Milvus | HTML, TXT, PDF, MD, DOCX, PPTX, XLSX |
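
The chain server wires these components into an ingest-and-query pipeline. The following is a minimal sketch of the equivalent wiring using the public llama-index NVIDIA and Milvus integrations, not the chain server's actual source; the Milvus URI and port, the 1024 embedding dimension for nv-embedqa-e5-v5, and the local `./data` folder are assumptions for illustration.

```python
# Minimal sketch of the pipeline this example deploys (an illustration,
# not the chain server's source). Assumed packages:
#   pip install llama-index-llms-nvidia llama-index-embeddings-nvidia \
#               llama-index-vector-stores-milvus
import os

from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.nvidia import NVIDIAEmbedding
from llama_index.llms.nvidia import NVIDIA
from llama_index.vector_stores.milvus import MilvusVectorStore

# Inference and embedding are served from the NVIDIA API Catalog, so only
# an API key is needed (no local GPU).
assert os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-")

llm = NVIDIA(model="meta/llama3-8b-instruct")
embedder = NVIDIAEmbedding(model="nvidia/nv-embedqa-e5-v5")

# Milvus runs in the compose stack; 19530 is its default port and 1024 is
# the output dimension of nv-embedqa-e5-v5 (both assumptions here).
vector_store = MilvusVectorStore(uri="http://localhost:19530", dim=1024)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Ingest a folder of documents (HTML, TXT, PDF, MD, DOCX, PPTX, XLSX, ...).
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, embed_model=embedder
)

# Chat Q&A over the indexed documents.
query_engine = index.as_query_engine(llm=llm)
print(query_engine.query("What do these documents describe?"))
```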

## Prerequisites

Complete the common prerequisites.

## Build and Start the Containers

  1. Export your NVIDIA API key as an environment variable (an optional smoke test for the key is sketched after this list):

    export NVIDIA_API_KEY="nvapi-<...>"
    
  2. Start the containers:

    cd RAG/examples/basic_rag/llamaindex/
    docker compose up -d --build

    Example Output

     ✔ Network nvidia-rag           Created
     ✔ Container rag-playground     Started
     ✔ Container milvus-minio       Started
     ✔ Container chain-server       Started
     ✔ Container milvus-etcd        Started
     ✔ Container milvus-standalone  Started
    
  3. Confirm the containers are running:

    docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"

    Example Output

    CONTAINER ID   NAMES               STATUS
    39a8524829da   rag-playground      Up 2 minutes
    bfbd0193dbd2   chain-server        Up 2 minutes
    ec02ff3cc58b   milvus-standalone   Up 3 minutes
    6969cf5b4342   milvus-minio        Up 3 minutes (healthy)
    57a068d62fbb   milvus-etcd         Up 3 minutes (healthy)
    
  4. Open a web browser and access http://localhost:8090 to use the RAG Playground.

    Refer to Using the Sample Web Application for information about uploading documents and using the web interface.
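
As noted in step 1, you can optionally smoke-test the exported key before starting the stack. The API Catalog exposes an OpenAI-compatible endpoint, so the standard `openai` client can call it directly; this sketch assumes that package is installed and uses the same model the example serves.

```python
# Optional smoke test for NVIDIA_API_KEY (a sketch, not part of the
# deployed example). Assumes:  pip install openai
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # NVIDIA API Catalog
    api_key=os.environ["NVIDIA_API_KEY"],
)

completion = client.chat.completions.create(
    model="meta/llama3-8b-instruct",
    messages=[{"role": "user", "content": "Reply with one short sentence."}],
    max_tokens=32,
)
print(completion.choices[0].message.content)
```

If the key is valid, this prints a short model response; an invalid key fails with an authentication error.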

## Next Steps