
Commit

update latest exp res
wangzaistone committed Oct 16, 2023
1 parent c46cd24 commit 9dcaf91
Showing 3 changed files with 12 additions and 8 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -25,7 +25,9 @@ DB-GPT-Hub is an experimental project utilizing LLMs (Large Language Models) to

So far, we have successfully integrated multiple large models and established a complete workflow, including data processing, model SFT (Supervised Fine-Tuning) training, prediction output, and evaluation. The code is readily reusable within this project.

As of October 10, 2023, by fine-tuning an open-source 13-billion-parameter model with this project, **execution accuracy on the Spider evaluation dataset has surpassed that of GPT-4!**

Some of the experimental results have been compiled into a [document](docs/eval_llm_result.md) in this project. By combining this project with more related data, execution accuracy on the Spider evaluation set has already reached **0.825**.

## 2. Fine-tuning Text-to-SQL

@@ -204,7 +206,7 @@ Run the following command:
```bash
python dbgpt_hub/eval/evaluation.py --plug_value --input Your_model_pred_file
```
You can find our latest evaluation results, along with part of the experimental results, [here](docs/eval_llm_result.md).
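
For reference, Spider-style execution accuracy counts a prediction as correct when executing it against the database returns the same result set as the gold SQL; the `--plug_value` flag above comes from the Spider test-suite evaluation, which plugs gold values into the predicted query before execution. Below is a minimal sketch of the scoring idea, assuming SQLite databases as in Spider; it is a simplification, not this project's actual evaluation code.

```python
# Minimal sketch of execution-accuracy scoring on Spider-style data:
# a predicted query is correct when it executes and returns the same
# result set as the gold query. Simplified; not this project's eval code.
import sqlite3
from collections import Counter

def execution_match(db_path: str, gold_sql: str, pred_sql: str) -> bool:
    conn = sqlite3.connect(db_path)
    try:
        gold_rows = conn.execute(gold_sql).fetchall()
        try:
            pred_rows = conn.execute(pred_sql).fetchall()
        except sqlite3.Error:
            return False  # a prediction that fails to execute counts as wrong
        # Compare as multisets so row order does not matter.
        return Counter(map(repr, gold_rows)) == Counter(map(repr, pred_rows))
    finally:
        conn.close()

def execution_accuracy(examples) -> float:
    # examples: iterable of (db_path, gold_sql, pred_sql) triples
    results = [execution_match(db, gold, pred) for db, gold, pred in examples]
    return sum(results) / len(results)
```

Real evaluations also add per-example timeouts and value normalization, which this sketch omits.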

## 4. Roadmap

5 changes: 3 additions & 2 deletions README.zh.md
@@ -23,7 +23,8 @@

DB-GPT-Hub is an experimental project that uses LLMs to implement Text-to-SQL parsing. It mainly covers dataset collection, data preprocessing, model selection and construction, and fine-tuning of model weights. This pipeline improves Text-to-SQL capability while lowering model training costs, letting more developers take part in improving Text-to-SQL accuracy, and ultimately enables database-grounded automated question answering, so that users can run complex database queries through natural-language descriptions.
At present, building on multiple large models, we have established the full pipeline covering data processing, model SFT training, prediction output, and evaluation; **the code in this project can be reused directly.**
As of October 10, 2023, by fine-tuning an open-source 13B model with this project, execution accuracy on the Spider evaluation set **has already surpassed GPT-4!**
Some of the experimental results have been compiled into this project's related [document](docs/eval_llm_result.md); by combining this project with more related data, execution accuracy on the Spider evaluation set can already reach **0.825**.

## 2. Fine-tuning Text-to-SQL

@@ -190,7 +191,7 @@ sh ./dbgpt_hub/scripts/export_merge.sh
```bash
python dbgpt_hub/eval/evaluation.py --plug_value --input Your_model_pred_file
```
You can find our latest evaluation and experimental results [here](docs/eval_llm_result.md).
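
As an aside, the `export_merge.sh` step shown in the hunk context above folds trained LoRA adapter weights back into the base model so it can be served standalone. A minimal sketch of that idea with `peft` follows; the paths and model name are hypothetical placeholders, not the script's actual arguments.

```python
# Sketch of merging trained LoRA adapter weights back into the base model
# for standalone inference. Paths are hypothetical placeholders; see
# dbgpt_hub/scripts/export_merge.sh for the project's actual step.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-13b-Instruct-hf")
model = PeftModel.from_pretrained(base, "path/to/adapter_checkpoint")
merged = model.merge_and_unload()  # folds LoRA deltas into the base weights

merged.save_pretrained("path/to/merged_model")
AutoTokenizer.from_pretrained("codellama/CodeLlama-13b-Instruct-hf") \
    .save_pretrained("path/to/merged_model")
```

Merging removes the runtime dependency on `peft` and the adapter checkpoint at inference time.
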
## 4. Roadmap

We divide the whole process into three stages:
9 changes: 5 additions & 4 deletions docs/eval_llm_result.md
@@ -7,13 +7,14 @@ This doc aims to summarize the performance of publicly available big language mo
| ------------------------------ | ------------------ | ---------------------------------------------------------------------------------- |
| **GPT-4** | **0.762** | [numbersstation-eval-res](https://www.numbersstation.ai/post/nsql-llama-2-7b) |
| ChatGPT | 0.728 | [numbersstation-eval-res](https://www.numbersstation.ai/post/nsql-llama-2-7b)|
| **CodeLlama-13b-Instruct-hf_lora** | **0.789** | SFT-trained with this project (LoRA), using only the Spider train dataset; evaluated with this project's evaluation method. |
| **CodeLlama-13b-Instruct-hf_qlora** | **0.825** | SFT-trained with this project (QLoRA) on around 50 thousand text-to-SQL examples, with the Spider eval dataset filtered out of the training set; evaluated with this project's evaluation method. |
| CodeLlama-13b-Instruct-hf_qlora | 0.774 | SFT-trained with this project (QLoRA, NF4, 4-bit), using only the Spider train dataset; evaluated with this project's evaluation method. |
| wizardcoder | 0.610 | [text-to-sql-wizardcoder](https://github.com/cuplv/text-to-sql-wizardcoder/tree/main) |
| CodeLlama-13b-Instruct-hf | 0.556 | evaluated in this project with default parameters. |
| Baichuan2-13B-Chat | 0.392 | evaluated in this project with default parameters. |
| llama2_13b_hf | xxx | run in this project with the default parameter set. |
| llama2_13b_hf_lora_best | 0.744 | SFT-trained with this project, using only the Spider train dataset; evaluated with this project's evaluation method. |
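
For readers unfamiliar with the LoRA/QLoRA rows above: QLoRA fine-tunes low-rank adapters on top of a 4-bit NF4-quantized base model. The sketch below shows what such a setup typically looks like with the Hugging Face `peft` and `bitsandbytes` stack; the hyperparameters and target modules are illustrative assumptions, not this project's exact training configuration.

```python
# Illustrative 4-bit NF4 QLoRA setup with peft + bitsandbytes.
# Hyperparameters and target modules are assumptions for this sketch,
# not this project's exact training configuration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "codellama/CodeLlama-13b-Instruct-hf"

# NF4 4-bit quantization, matching the "QLoRA, NF4, 4-bit" note in the table.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections (assumed choice).
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only adapter weights are trainable
```

Training then proceeds with a standard causal-LM trainer; only the adapter weights, typically well under 1% of total parameters, receive gradients.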



