ChatGSPP is a privacy-focused AI chatbot designed to answer questions based on the content of a specific website. Built on top of PrivateGPT, it leverages a Retrieval-Augmented Generation (RAG) pipeline to ensure that responses are contextually relevant to the ingested website content.
Unlike traditional chatbots that rely on external knowledge sources, ChatGSPP is self-contained: it grounds its responses primarily in content from the specified website. This keeps answers relevant and data private, and lets organizations offer a website-specific AI assistant.
ChatGSPP enhances PrivateGPT by implementing a website-focused workflow with the following steps:
- Parses and ingests data from the target website (e.g., pages, articles, and FAQs).
- Supports web scraping or direct document uploads to convert website content into structured text.
- Uses embedding generation to transform each content section into vector embeddings for retrieval.
- When a user submits a query:
  - The query is converted into an embedding.
  - The system searches for relevant website sections in a vector database (e.g., Qdrant).
  - The most relevant snippets are retrieved to ensure responses are based on the website’s actual content.
- The retrieved information is formatted into a structured prompt.
- This prompt is fed into the LLM (e.g., Llama3.1, DeepSeek R1, Gemma 2) to generate a response.
- Grounding the prompt in retrieved content keeps responses domain-specific and context-aware.
- ChatGSPP leverages PrivateGPT’s API to generate responses while ensuring privacy.
- The LLM processes the contextual prompt and generates a coherent, website-specific answer.
- The response is displayed in the chatbot interface.
- ChatGSPP features a user-friendly web-based chat interface.
- The chatbot interacts with the PrivateGPT-powered backend to handle queries dynamically.
- Website-Specific Answers → Responses are grounded in website content, reducing misinformation.
- Dynamic Updates → As the website evolves, new content is ingested and indexed.
- Privacy-Focused → Runs locally or in a controlled environment, ensuring sensitive website data is never shared externally.
- Customizable → Swap out LLMs (e.g., LlamaCPP, OpenAI, GPT4All) and vector databases as needed.
- Scalable & Efficient → Uses dependency injection and modular abstractions to make component swapping and upgrades seamless.
- Open-Source Support → Under the Apache 2.0 license, continuing the tradition of open innovation and collaboration.
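
To make the workflow above concrete (ingestion, embedding, retrieval, and prompt construction), here is a minimal, self-contained sketch. It is not ChatGSPP's or PrivateGPT's actual code: the bag-of-words `embed` function stands in for a real embedding model, `InMemoryIndex` is a hypothetical stand-in for a vector database such as Qdrant, and all function names and the sample pages are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split page text into fixed-size word windows (a stand-in for real section parsing)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector. A real setup would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a vector database: stores (embedding, text) pairs."""

    def __init__(self) -> None:
        self._items: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self._items.append((embed(text), text))

    def search(self, query: str, top_k: int = 2) -> list[str]:
        # Embed the query, then rank stored chunks by similarity to it.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]


def build_prompt(question: str, snippets: list[str]) -> str:
    """Format retrieved snippets and the question into a structured prompt for the LLM."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Illustrative website content (invented for this example).
pages = [
    "Admissions: the application deadline for the master's program is in December.",
    "Events: the spring lecture series takes place on campus every Thursday.",
    "Careers: advisors help students prepare for interviews and internships.",
]

index = InMemoryIndex()
for page in pages:
    for chunk in chunk_text(page):
        index.add(chunk)

question = "When is the application deadline?"
top_snippets = index.search(question, top_k=1)
prompt = build_prompt(question, top_snippets)
```

The resulting `prompt` is what would be handed to the LLM in the generation step. Swapping the toy `embed` for a real embedding model and `InMemoryIndex` for a real vector store changes the quality of retrieval but not the overall shape of the pipeline.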
ChatGSPP builds upon PrivateGPT’s RAG pipeline while tailoring it to website-based question answering.
- API Layer (FastAPI)
  - Manages user requests via a router-service architecture.
  - Interfaces with the RAG pipeline and document embeddings.
- RAG Pipeline
  - Uses LlamaIndex abstractions for document ingestion, search, and prompt engineering.
  - Retrieves relevant website content based on user queries.
- Component Abstractions
  - LLM → Supports models like LlamaCPP or OpenAI.
  - BaseEmbedding → Generates vector embeddings from website content.
  - VectorStore → Stores and retrieves document embeddings (e.g., Qdrant, FAISS).
- Dependency Injection
  - Allows swapping out LLMs, embedding models, and vector stores without modifying core logic.
- Web Chat Interface
  - Provides a frontend for users to ask questions and receive real-time, website-specific answers.
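
The idea behind the component abstractions and dependency injection can be illustrated with a small sketch. This uses plain constructor injection with `typing.Protocol` interfaces; the real project's wiring may differ, and the `LLM` and `VectorStore` protocols below are simplified, hypothetical interfaces that only borrow their names from the components listed above. `ChatService`, `EchoLLM`, `ShoutLLM`, and `StaticStore` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class LLM(Protocol):
    """Hypothetical minimal interface for a language model component."""

    def complete(self, prompt: str) -> str: ...


class VectorStore(Protocol):
    """Hypothetical minimal interface for a retrieval component."""

    def search(self, query: str, top_k: int) -> list[str]: ...


@dataclass
class ChatService:
    """Core logic depends only on the interfaces, never on a concrete backend."""

    llm: LLM
    store: VectorStore

    def answer(self, question: str) -> str:
        snippets = self.store.search(question, top_k=2)
        prompt = "Context:\n" + "\n".join(snippets) + f"\n\nQuestion: {question}"
        return self.llm.complete(prompt)


class EchoLLM:
    """Fake model that echoes the last line of the prompt."""

    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt.splitlines()[-1]}"


class ShoutLLM:
    """A second fake model, to show that swapping requires no change to ChatService."""

    def complete(self, prompt: str) -> str:
        return prompt.splitlines()[-1].upper()


class StaticStore:
    """Fake vector store that returns the first top_k documents regardless of the query."""

    def __init__(self, docs: list[str]) -> None:
        self._docs = docs

    def search(self, query: str, top_k: int) -> list[str]:
        return self._docs[:top_k]


store = StaticStore(["Office hours are 9 to 5.", "The office is in Room 101.", "Unused."])
service_a = ChatService(llm=EchoLLM(), store=store)
service_b = ChatService(llm=ShoutLLM(), store=store)  # same core logic, different model

reply_a = service_a.answer("Where is the office?")
reply_b = service_b.answer("Where is the office?")
```

Because `ChatService` only sees the two protocols, replacing `EchoLLM` with a local model wrapper, or `StaticStore` with a Qdrant-backed store, would be a change at construction time only.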
See the official PrivateGPT docs to get started with a fully local PrivateGPT setup. From there, follow ChatGSPP's build docs to replicate ChatGSPP's configuration on your own machine.
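
Once a local PrivateGPT instance is running, a client can query it over HTTP. The sketch below only builds the request so it can be inspected without a server. It assumes, and you should verify against the PrivateGPT API reference for your version, that the server listens on `http://localhost:8001` and exposes a `POST /v1/completions` endpoint accepting `prompt`, `use_context`, `include_sources`, and `stream` fields. The helper names `build_completion_request` and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumed default address of a local PrivateGPT server; adjust to your setup.
BASE_URL = "http://localhost:8001"


def build_completion_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request asking for an answer grounded in ingested content."""
    payload = {
        "prompt": question,
        "use_context": True,      # assumed flag: answer from ingested documents
        "include_sources": True,  # assumed flag: return the chunks that were used
        "stream": False,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL) -> dict:
    """Send the request and return the parsed JSON response (requires a running server)."""
    request = build_completion_request(question, base_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Inspect the request without contacting any server.
example_request = build_completion_request("What programs does the school offer?")
example_body = json.loads(example_request.data)
```

With a server running, calling `ask("What programs does the school offer?")` would send the request; the structure of the returned JSON depends on the PrivateGPT version, so it is not shown here.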
ChatGSPP is open-source and licensed under the Apache 2.0 license. Feel free to use, modify, and contribute!
Got ideas? Found a bug? Feel free to submit an issue or a pull request!
- GitHub Issues: Report Bugs
- Discussions: Join the Community
PrivateGPT is actively supported by the teams behind:
- Qdrant: Providing the default vector database.
- Fern: Offering Documentation and SDKs.
- LlamaIndex: Supplying the base RAG framework and abstractions.
This project is strongly influenced and supported by other amazing projects such as:
For any questions, suggestions, or contributions, please reach out to GSPP's IT Team:
- Email: [email protected]
- GitHub Issues: ChatGSPP Issues
