An LLM-powered chatbot for natural language database queries with extensive observability
A web chatbot interface that lets you query a database with natural language questions through different interaction methods (RAG, TAG) and different LLM providers, with comprehensive observability and tracking built in.
- Multiple interaction methods (RAG, TAG)
- LLM provider selection (OpenAI, Claude)
- Intent classification (Details in Classifier README)
- Vector search with PGVector
- Langfuse Analytics
- Conversation memory (until browser refresh)
- Docker-based deployment
- Docker and Docker Compose
- Python 3.9+
- OpenAI API key
- Anthropic API key
- Langfuse account (optional)
- Clone the repository:
git clone https://github.com/garyzava/chat-to-database-chatbot.git
cd chat-to-database-chatbot
- Configure environment variables: Copy .env.example to .env and fill in your API keys and configurations.
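For illustration, a typical .env might contain entries like the ones below; treat the variable names and values as placeholders and use .env.example as the authoritative reference:

```
# Placeholder values only -- .env.example defines the exact variable names
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Optional: only needed when using Langfuse observability
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=http://localhost:3000
```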
- Build and start the Docker services (one-off):
make run
After the installation, simply run:
make up
- Or run the application in developer mode:
make dev
Developer mode runs the Streamlit app locally, while the databases still run in Docker.
- Shut down the application:
make down
When running in local mode (make dev), open a new terminal, make sure the virtual environment is activated, and go to the chat2dbchatbot directory:
cd chat2dbchatbot
Run the RAG utility:
python -m tools.rag "what is the track with the most revenue" --llm OpenAI --temperature 0.1
Or run the TAG utility:
python -m tools.tag "what is the track with the most revenue" --llm OpenAI --temperature 0.1
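If you want to drive these utilities from your own scripts, one option (a sketch, not part of the project) is to shell out to the same modules. The helper name and defaults below are purely illustrative:

```python
# Hypothetical helper (not part of the repo) that shells out to the CLI
# utilities shown above. Adjust paths and flags to match your setup.
import subprocess

def ask(question: str, method: str = "rag", llm: str = "OpenAI", temperature: float = 0.1) -> str:
    """Run tools.rag or tools.tag as a subprocess and return its stdout."""
    result = subprocess.run(
        ["python", "-m", f"tools.{method}", question,
         "--llm", llm, "--temperature", str(temperature)],
        cwd="chat2dbchatbot",  # same directory as the manual steps above
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(ask("what is the track with the most revenue"))
```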
- Go to http://localhost:8501 for the main chatbot interface
- Select your preferred interaction method (RAG, TAG)
- Choose an LLM provider (OpenAI or Claude)
- Start asking questions about your database
- Go to http://localhost:3000 for the Langfuse interface (when not running in dev mode)
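The chatbot records traces automatically. If you want to log your own test events from a script, a minimal sketch using the v2-style Langfuse Python client is shown below; the environment variable names mirror the .env placeholders above, and the trace contents are made up:

```python
# Illustrative only: logging a trace with the v2-style Langfuse Python SDK.
# The chatbot already records traces itself; this just shows the pattern.
import os
from langfuse import Langfuse

langfuse = Langfuse(
    public_key=os.environ["LANGFUSE_PUBLIC_KEY"],
    secret_key=os.environ["LANGFUSE_SECRET_KEY"],
    host=os.environ.get("LANGFUSE_HOST", "http://localhost:3000"),
)

# Record a simple trace for one question/answer pair.
langfuse.trace(
    name="manual-test",
    input={"question": "what is the track with the most revenue"},
    output={"answer": "..."},
)
langfuse.flush()  # make sure the event is sent before the script exits
```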
- Frontend: Streamlit
- Document Parsing: Docling
- Vector Database: PostgreSQL with pgvector
- Observability: Langfuse
- LLM Framework: LlamaIndex
- Container Orchestration: Docker Compose
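For orientation only, the sketch below shows how LlamaIndex's pgvector integration is typically queried once documents have been ingested; the connection details, table name, and embedding dimension are assumptions, not the project's actual configuration:

```python
# Illustrative only: querying an existing pgvector-backed index with LlamaIndex.
# Assumes embeddings were already ingested and OPENAI_API_KEY is set for the
# default embedding/LLM models; all connection values are placeholders.
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.postgres import PGVectorStore

vector_store = PGVectorStore.from_params(
    database="chatdb",
    host="localhost",
    port="5432",
    user="postgres",
    password="postgres",
    table_name="db_schema_docs",
    embed_dim=1536,  # must match the embedding model used at ingestion time
)

index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine()
print(query_engine.query("Which track generated the most revenue?"))
```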
- RAG (Retrieval-Augmented Generation): Paper by Facebook AI
- TAG (Table-Augmented Generation): Paper by UC Berkeley & Stanford University
This project uses the Chinook database, a media store database, for development and testing purposes. However, it can be easily adapted for any enterprise or domain-specific use case.
- Chinook Database:
- Ownership: Maintained by lerocha
- Licenses and Use: The Chinook Database allows use, distribution, and modification without any warranty of any kind.
- Access: Available on GitHub at https://github.com/lerocha/chinook-database
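To give a sense of the SQL that a question like "what is the track with the most revenue" resolves to on Chinook, the snippet below runs an illustrative query over the Track and InvoiceLine tables; the connection parameters are placeholders, and identifier casing depends on which Chinook load script you used:

```python
# Illustrative only: the kind of SQL such a question maps to on Chinook.
# Connection parameters are placeholders; identifier quoting/casing may differ
# depending on which Chinook PostgreSQL script was used to load the data.
import psycopg2

SQL = """
SELECT t."Name", SUM(il."UnitPrice" * il."Quantity") AS revenue
FROM "Track" t
JOIN "InvoiceLine" il ON il."TrackId" = t."TrackId"
GROUP BY t."TrackId", t."Name"
ORDER BY revenue DESC
LIMIT 1;
"""

with psycopg2.connect(host="localhost", dbname="chinook",
                      user="postgres", password="postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(SQL)
        print(cur.fetchone())
```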
The intent classifier uses data from the following datasets; access to the data is subject to their respective terms:
- GretelAI Synthetic Text-to-SQL:
- Ownership: Gretel.ai
- Licenses and Use: Licensed under the Apache License 2.0, permitting use, distribution, and modification with proper attribution.
- Access: Available on Hugging Face at https://huggingface.co/datasets/gretelai/synthetic_text_to_sql
- Factoid WebQuestions Dataset:
- Ownership: WebQuestions (http://nlp.stanford.edu/software/sempre/ - Berant et al., 2013, CC-BY)
- Licenses and Use: Distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, allowing sharing and adaptation with appropriate credit.
- Access: Available on GitHub at https://github.com/brmson/dataset-factoid-webquestions
- The evaluation framework is located under the eval sub-folder
- See the Evaluation Framework README for details