Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Website • Paper • Data Update Log • Submission Guidance

📰 News

2025-07-13: We update spider2-snow.jsonl to resolve ambiguities, with the previous version renamed to spider2-snow-0713.jsonl for reference.
2025-06-10: We implemented a tool-call-based Spider-Agent for Spider 2.0-Snow that requires no Docker and significantly improves runtime performance.
2025-05-22: We have created a new task setting, Spider2-DBT, and removed the original Spider2 setting. spider2-dbt consists of only 68 tasks, enabling quick and smooth benchmarking with spider-agent-dbt. It is a comprehensive, repository-level text-to-SQL task.
2025-04-20: We provide the ground-truth tables for spider2-lite and spider2-snow to help quick benchmarking and analysis. However, when using this setting, you must indicate that you are using oracle tables.
2025-01-10: Please refer to the data update log to track changes in the evaluation examples. The leaderboard results will also change dynamically accordingly.

2024-12-24: Considering the many evaluation requirements, we have decided to release all examples and gold answers for self-evaluation. However, only a small amount of gold SQL is available. The leaderboard is still active. To have your method officially validated and upload your scores to the leaderboard, please follow the submission guidance.

👋 Overview

Setting	Task Type	#Examples	Databases	Cost
Spider 2.0-Snow	Text-to-SQL task	547	Snowflake(547)	NO COST!😊
Spider 2.0-Lite	Text-to-SQL task	547	BigQuery(214), Snowflake(198), SQLite(135)	Some cost incurred
Spider 2.0-DBT	Code agent task	68	DuckDB (DBT)(68)	NO COST!😊

Data

The questions/instructions are in spider2-lite.jsonl and spider2-snow.jsonl.

We also release some gold SQLs to help users design prompts and methods, note that we do not recommend using the Spider 2.0 Gold SQL we released for fine-tuning.

🚀 Quickstart (Spider2-lite/snow)

Sign Up for Your Own BigQuery and Snowflake Accounts

To sign up for a BigQuery account, please follow this guideline, get your own credentials.
Follow this guideline and fill out this Spider2 Snowflake Access, and we will send you an account sign-up email, which will allow you to access the Snowflake database.

Important Notes:

If you want to access the FULL dataset of Spider 2.0-Lite, you must complete Step1 and Step2.
If you only want access to the FULL dataset of Spider 2.0-Snow, you only need to complete Step2.

Spider 2.0-Snow (Tool-call Format, UPDATE 2025-06-10)

A Docker-free and ultra-fast Spider-Agent implementation for rapid benchmarking of any model.

spider-agent-tool-call

Spider 2.0-Snow and Spider 2.0-Lite (Based on Docker)

We highly recommend that you directly use Spider2-snow and Spider2-lite for benchmarking and research. First, run the Spider-Agent Framework!!

For more details, please refer to the following links:

🚀 Quickstart (Spider2-dbt)

For more details, please refer to the following links:

spider2-dbt (The Data)
spider-agent-dbt (The Method)

📋 Leaderboard Submission

We only release the gold answer of about partial examples of Spider 2.0-Lite, Spider 2.0-Snow and Spider 2.0-DBT. You must follow this submission guidance to upload your score to leaderboard.

🙇‍♂️ Acknowledgement

We thank Snowflake for their generous support in hosting the Spider 2.0 Challenge. We also thank Minghang Deng, Tianbao Xie, Yiheng Xu, Fan Zhou, Yuting Lan, Per Jacobsson, Yiming Huang, Canwen Xu, Zhewei Yao, and Binyuan Hui for their helpful feedback on this work. The website and submission guidelines are greatly inspired by BIRD-SQL, and we thank them for their contributions.

✍️ Citation

If you find our work helpful, please cite as

@misc{lei2024spider2,
      title={Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows}, 
      author={Fangyu Lei and Jixuan Chen and Yuxiao Ye and Ruisheng Cao and Dongchan Shin and Hongjin Su and Zhaoqing Suo and Hongcheng Gao and Wenjing Hu and Pengcheng Yin and Victor Zhong and Caiming Xiong and Ruoxi Sun and Qian Liu and Sida Wang and Tao Yu},
      year={2024},
      eprint={2411.07763},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.07763}, 
}

Name		Name	Last commit message	Last commit date
Latest commit History 221 Commits
assets		assets
methods		methods
spider2-dbt		spider2-dbt
spider2-lite		spider2-lite
spider2-snow		spider2-snow
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

📰 News

👋 Overview

Data

🚀 Quickstart (Spider2-lite/snow)

Sign Up for Your Own BigQuery and Snowflake Accounts

Spider 2.0-Snow (Tool-call Format, UPDATE 2025-06-10)

Spider 2.0-Snow and Spider 2.0-Lite (Based on Docker)

🚀 Quickstart (Spider2-dbt)

📋 Leaderboard Submission

🙇‍♂️ Acknowledgement

✍️ Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 9

Languages

License

xlang-ai/Spider2

Folders and files

Latest commit

History

Repository files navigation

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

📰 News

👋 Overview

Data

🚀 Quickstart (Spider2-lite/snow)

Sign Up for Your Own BigQuery and Snowflake Accounts

Spider 2.0-Snow (Tool-call Format, UPDATE 2025-06-10)

Spider 2.0-Snow and Spider 2.0-Lite (Based on Docker)

🚀 Quickstart (Spider2-dbt)

📋 Leaderboard Submission

🙇‍♂️ Acknowledgement

✍️ Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 9

Languages

Packages