MageSQL: multi-agent text2sql framework (CIKM 2024 demo)

Repository Structure

Here is a overivew of the repository's structure related to the demo.

├── agents                                  # Classes of agents
│   ├── base_agent.py                       # Abstract agent class
│   └── <agent>.py                          # Implementation class of agent
├── database_routing                        # training scripts for database routing
├── dataset_classes                         # Classes for datasets
├── demonstration_selector                  # Classes for demonstration selectors
├── demo_paper                              # Code for the frontend and backend
│   ├── backend                             # Code for the backend
│   │   ├── agent_center.py                 # Manage & handle agents
│   │   ├── gold_sql_retrieval.py          # LM Embedding based gold sql retrieval
│   │   ├── main.py                         # FastAPI backend 
│   │   └── requirements.txt                # Python packages for the backend
│   ├── public                              # npm resources
│   ├── src                                 # source code for the frontend
│   └── packages.json                       # npm packages.
├── utils                                   # Utility functions
└── README.md                               # The file you are reading now.

Clone the demo_paper branch of this repo

git clone https://github.com/megagonlabs/magesql.git
cd magesql
git checkout -b main
git pull origin main

Environment setup (Backend)

Install packages

conda create -n magesql_demo python=3.10
conda activate magesql_demo 
python -m pip install --upgrade pip
pip install -r demo_paper/backend/requirements.txt

Activate the conda environment

conda activate magesql_demo

Download punkt tokenizer

python -m nltk.downloader punkt
python -m nltk.downloader punkt_tab

Set the Python path with the root folder of this repo

export PYTHONPATH=<path_of_this_repo>

If you are already in the root folder of this repo, you can use the following command to set the Python path

export PYTHONPATH=$PYTHONPATH:$(pwd)

Data Dependancy (Backend)

Spider Dataset You need to download the dataset from Spider, and put it at path <repo_path>/datasets/spider directory.

mkdir datasets/
cd datasets/
gdown 1403EGqzIDoHMdQF4c9Bkyl7dZLZ5Wt6J
unzip spider_data.zip
mv spider_data/ spider/

Insider the spider folder, you will need to merge the train_spider.json and train_others.json in the dataset into a train_spider_and_others.json file, and store in the same folder. The pre-combined file could also be directly downloaded from Megagon Shared Drive. Example commands to copy the pre-combined file to the spider folder.

cd spider # if you followed the above steps, or you can directly cd to the spider folder
cp <your_downloaded_files_directory>/train_spider_and_others.json .

DB Execution agent Copy the database folders in <repo_path>/datasets/spider/database and database folders in <repo_path>/datasets/spider/test_database into directory <repo_path>/datasets/spider/database_all_splits/

cd <repo_path>/datasets/spider # skip this step if you are already in the spider folder
mkdir database_all_splits
cp -r database/* database_all_splits/
cp -r test_database/* database_all_splits/

Schema Fetching agent Download the schema file from Megagon Shared Drive, and move it to directory <repo_path>/datasets/spider/

cd <repo_path>/datasets/spider # skip this step if you are already in the spider folder
cp <your_downloaded_files_directory>/db_id2schema_text.json .

Gold SQL Retrieval Download the gold SQL file from Megagon Shared Drive and question embedding file from Megagon Shared Drive, and move it to directory <repo_path>/datasets/spider/

cd <repo_path>/datasets/spider # skip this step if you are already in the spider folder
cp <your_downloaded_files_directory>/question2sql.json .
cp <your_downloaded_files_directory>/question_embeddings.pt .

Trained database routing model Download the trained database routing model from Megagon Shared Drive, move the downloaded file database_routing_spider_v1.zip to directory <repo_path>/database_routing/saved_models/ and unzip it.

cd <repo_path>/
mkdir -p database_routing/saved_models/
cp <your_downloaded_files_directory>/database_routing_spider_v1.zip .
unzip database_routing_spider_v1.zip

Environment setup (Frontend)

Before install the dependencies, you need to install npm according to npm's official website. Example command to install by 'nvm'

# Install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
source ~/.bashrc

export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"  # This loads nvm bash_completion

nvm install 20.11.1

Now you can install the dependencies

cd <repo_path>/demo_paper
npm install

How to Start the frontend

cd <repo_path>/demo_paper
npm start

How to Start the backend

cd <repo_path>
conda activate magesql_demo
export PYTHONPATH="$PYTHONPATH:$(pwd)"
uvicorn demo_paper.backend.main:app --reload

Change logs

V1.0: developped frontend for certain example
V1.1: developped backend and integrated with frontend
V1.2: support frontend of demonstration selection and database routing agents. A list of updates to improve user's experience.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MageSQL: multi-agent text2sql framework (CIKM 2024 demo)

Repository Structure

Clone the demo_paper branch of this repo

Environment setup (Backend)

Data Dependancy (Backend)

Environment setup (Frontend)

How to Start the frontend

How to Start the backend

Change logs

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
agents		agents
dataset_classes		dataset_classes
demo_paper		demo_paper
demonstration_selector		demonstration_selector
utils		utils
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

License

megagonlabs/magesql

Folders and files

Latest commit

History

Repository files navigation

MageSQL: multi-agent text2sql framework (CIKM 2024 demo)

Repository Structure

Clone the demo_paper branch of this repo

Environment setup (Backend)

Data Dependancy (Backend)

Environment setup (Frontend)

How to Start the frontend

How to Start the backend

Change logs

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages