Skip to content

wilsones-berkeley/chatGSPP

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ChatGSPP ChatGSPP Logo

GitHub issues GitHub license

Overview

ChatGSPP is a privacy-focused AI chatbot designed to answer questions based on the content of a specific website. Built on top of PrivateGPT, it leverages a Retrieval-Augmented Generation (RAG) pipeline to ensure that responses are contextually relevant to the ingested website content.

Unlike traditional chatbots that rely on external knowledge sources, ChatGSPP is fully self-contained, meaning it primarily references content from the specified website to generate responses. This ensures both accuracy and privacy while enabling organizations to provide a website-specific AI assistant.


How It Works

ChatGSPP enhances PrivateGPT by implementing a website-focused workflow with the following steps:

1️⃣ Document Ingestion for Website Content

  • Parses and ingests data from the target website (e.g., pages, articles, FAQs, etc).
  • Supports web scraping or direct document uploads to convert website content into structured text.
  • Uses embedding generation to transform each content section into vector embeddings for retrieval.

2️⃣ Contextual Search and Retrieval

  • When a user submits a query:
    • The query is converted into an embedding.
    • The system searches for relevant website sections in a vector database (e.g., Qdrant).
    • The most relevant snippets are retrieved to ensure responses are based on the website’s actual content.

3️⃣ Enhanced Prompt Engineering

  • The retrieved information is formatted into a structured prompt.
  • This prompt is fed into the LLM (e.g., Llama3.1, DeepSeek R1, Gemma 2) to generate a response.
  • Responses remain accurate, domain-specific, and context-aware.

4️⃣ PrivateGPT-Powered Response Generation

  • ChatGSPP leverages PrivateGPT’s API to generate responses while ensuring privacy.
  • The LLM processes the contextual prompt and generates a coherent, website-specific answer.
  • The response is displayed in the chatbot interface.

5️⃣ Web Chatbot Interface

  • ChatGSPP features a user-friendly web-based chat interface.
  • The chatbot interacts with the PrivateGPT-powered backend to handle queries dynamically.

Why Use ChatGSPP?

  • Website-Specific Answers → Responses are strictly based on website content, reducing misinformation.
  • Dynamic Updates → As the website evolves, new content is ingested and indexed.
  • Privacy-Focused → Runs locally or in a controlled environment, ensuring sensitive website data is never shared externally.
  • Customizable → Swap out LLMs (e.g., LlamaCPP, OpenAI, GPT4All) and vector databases as needed.
  • Scalable & Efficient → Uses dependency injection and modular abstractions to make component swapping and upgrades seamless.
  • Open-Source Support → Under the Apache 2.0 license, continuing the tradition of open innovation and collaboration.

🏗 Architecture Overview

ChatGSPP builds upon PrivateGPT’s RAG pipeline while tailoring it to website-based question answering.

Core Components:

  1. API Layer (FastAPI)

    • Manages user requests via a router-service architecture.
    • Interfaces with the RAG pipeline and document embeddings.
  2. RAG Pipeline

    • Uses LlamaIndex abstractions for document ingestion, search, and prompt engineering.
    • Retrieves relevant website content based on user queries.
  3. Component Abstractions

    • LLM → Supports models like LlamaCPP or OpenAI.
    • BaseEmbedding → Generates vector embeddings from website content.
    • VectorStore → Stores and retrieves document embeddings (e.g., Qdrant, FAISS).
  4. Dependency Injection

    • Allows swapping out LLMs, embedding models, and vector stores without modifying core logic.
  5. Web Chat Interface

    • Provides a frontend for users to ask questions and receive real-time, website-specific answers.

🚀 Getting Started

Reference the official PrivateGPT docs to get started with a fully local PrivateGPT setup. From there, you can reference chatGSPP's build docs to replicate chatGSPP's configuration on your home computer.


📜 License

ChatGSPP is open-source and licensed under the Apache 2.0 license. Feel free to use, modify, and contribute!


🙌 Contributions & Support

Got ideas? Found a bug? Feel free to submit an issue or a pull request!

Partners & Supporters

PrivateGPT is actively supported by the teams behind:

  • Qdrant: Providing the default vector database.
  • Fern: Offering Documentation and SDKs.
  • LlamaIndex: Supplying the base RAG framework and abstractions.

This project is strongly influenced and supported by other amazing projects such as:

Contact

For any questions, suggestions, or contributions, please reach out to GSPP's IT Team:

About

AI (RAG) chatbot tailored for GSPP's main external website

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 79.4%
  • MDX 19.8%
  • Other 0.8%