ChatGSPP

Overview

ChatGSPP is a privacy-focused AI chatbot designed to answer questions based on the content of a specific website. Built on top of PrivateGPT, it leverages a Retrieval-Augmented Generation (RAG) pipeline to ensure that responses are contextually relevant to the ingested website content.

Unlike traditional chatbots that rely on external knowledge sources, ChatGSPP is fully self-contained, meaning it primarily references content from the specified website to generate responses. This ensures both accuracy and privacy while enabling organizations to provide a website-specific AI assistant.

How It Works

ChatGSPP enhances PrivateGPT by implementing a website-focused workflow with the following steps:

1️⃣ Document Ingestion for Website Content

Parses and ingests data from the target website (e.g., pages, articles, FAQs, etc).
Supports web scraping or direct document uploads to convert website content into structured text.
Uses embedding generation to transform each content section into vector embeddings for retrieval.

2️⃣ Contextual Search and Retrieval

When a user submits a query:
- The query is converted into an embedding.
- The system searches for relevant website sections in a vector database (e.g., Qdrant).
- The most relevant snippets are retrieved to ensure responses are based on the website’s actual content.

3️⃣ Enhanced Prompt Engineering

The retrieved information is formatted into a structured prompt.
This prompt is fed into the LLM (e.g., Llama3.1, DeepSeek R1, Gemma 2) to generate a response.
Responses remain accurate, domain-specific, and context-aware.

4️⃣ PrivateGPT-Powered Response Generation

ChatGSPP leverages PrivateGPT’s API to generate responses while ensuring privacy.
The LLM processes the contextual prompt and generates a coherent, website-specific answer.
The response is displayed in the chatbot interface.

5️⃣ Web Chatbot Interface

ChatGSPP features a user-friendly web-based chat interface.
The chatbot interacts with the PrivateGPT-powered backend to handle queries dynamically.

Why Use ChatGSPP?

Website-Specific Answers → Responses are strictly based on website content, reducing misinformation.
Dynamic Updates → As the website evolves, new content is ingested and indexed.
Privacy-Focused → Runs locally or in a controlled environment, ensuring sensitive website data is never shared externally.
Customizable → Swap out LLMs (e.g., LlamaCPP, OpenAI, GPT4All) and vector databases as needed.
Scalable & Efficient → Uses dependency injection and modular abstractions to make component swapping and upgrades seamless.
Open-Source Support → Under the Apache 2.0 license, continuing the tradition of open innovation and collaboration.

🏗 Architecture Overview

ChatGSPP builds upon PrivateGPT’s RAG pipeline while tailoring it to website-based question answering.

Core Components:

API Layer (FastAPI)
- Manages user requests via a router-service architecture.
- Interfaces with the RAG pipeline and document embeddings.
RAG Pipeline
- Uses LlamaIndex abstractions for document ingestion, search, and prompt engineering.
- Retrieves relevant website content based on user queries.
Component Abstractions
- LLM → Supports models like LlamaCPP or OpenAI.
- BaseEmbedding → Generates vector embeddings from website content.
- VectorStore → Stores and retrieves document embeddings (e.g., Qdrant, FAISS).
Dependency Injection
- Allows swapping out LLMs, embedding models, and vector stores without modifying core logic.
Web Chat Interface
- Provides a frontend for users to ask questions and receive real-time, website-specific answers.

🚀 Getting Started

Reference the official PrivateGPT docs to get started with a fully local PrivateGPT setup. From there, you can reference chatGSPP's build docs to replicate chatGSPP's configuration on your home computer.

📜 License

ChatGSPP is open-source and licensed under the Apache 2.0 license. Feel free to use, modify, and contribute!

🙌 Contributions & Support

Got ideas? Found a bug? Feel free to submit an issue or a pull request!

GitHub Issues: Report Bugs
Discussions: Join the Community

Partners & Supporters

PrivateGPT is actively supported by the teams behind:

Qdrant: Providing the default vector database.
Fern: Offering Documentation and SDKs.
LlamaIndex: Supplying the base RAG framework and abstractions.

This project is strongly influenced and supported by other amazing projects such as:

Contact

For any questions, suggestions, or contributions, please reach out to GSPP's IT Team:

Email: [email protected]
GitHub Issues: ChatGSPP Issues

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.docker		.docker
.github		.github
fern		fern
images		images
milvus-config		milvus-config
models		models
private_gpt		private_gpt
scripts		scripts
tests		tests
tiktoken_cache		tiktoken_cache
utils		utils
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.tool-versions		.tool-versions
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
Dockerfile.llamacpp-cpu		Dockerfile.llamacpp-cpu
Dockerfile.ollama		Dockerfile.ollama
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
docker-compose.yaml		docker-compose.yaml
docker-compose.yml		docker-compose.yml
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
settings-azopenai.yaml		settings-azopenai.yaml
settings-docker.yaml		settings-docker.yaml
settings-gemini.yaml		settings-gemini.yaml
settings-local.yaml		settings-local.yaml
settings-mock.yaml		settings-mock.yaml
settings-ollama-pg.yaml		settings-ollama-pg.yaml
settings-ollama.yaml		settings-ollama.yaml
settings-openai.yaml		settings-openai.yaml
settings-sagemaker.yaml		settings-sagemaker.yaml
settings-test.yaml		settings-test.yaml
settings-vllm.yaml		settings-vllm.yaml
settings.yaml		settings.yaml
version.txt		version.txt
web-scraper.py		web-scraper.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

ChatGSPP

Overview

How It Works

1️⃣ Document Ingestion for Website Content

2️⃣ Contextual Search and Retrieval

3️⃣ Enhanced Prompt Engineering

4️⃣ PrivateGPT-Powered Response Generation

5️⃣ Web Chatbot Interface

Why Use ChatGSPP?

🏗 Architecture Overview

Core Components:

🚀 Getting Started

📜 License

🙌 Contributions & Support

Partners & Supporters

Contact

About

Uh oh!

Releases

Packages

Languages

License

wilsones-berkeley/chatGSPP

Folders and files

Latest commit

History

Repository files navigation

ChatGSPP

Overview

How It Works

1️⃣ Document Ingestion for Website Content

2️⃣ Contextual Search and Retrieval

3️⃣ Enhanced Prompt Engineering

4️⃣ PrivateGPT-Powered Response Generation

5️⃣ Web Chatbot Interface

Why Use ChatGSPP?

🏗 Architecture Overview

Core Components:

🚀 Getting Started

📜 License

🙌 Contributions & Support

Partners & Supporters

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages