
Commit

Merge pull request #99 from eosphoros-ai/refactor
update latest exp res
csunny authored Oct 16, 2023
2 parents 6c1b42a + 9dcaf91 commit cbec2a7
Showing 3 changed files with 12 additions and 8 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -25,7 +25,9 @@ DB-GPT-Hub is an experimental project utilizing LLMs (Large Language Models) to

So far, we have successfully integrated multiple large models and established a complete workflow, including data processing, model SFT (Supervised Fine-Tuning) training, prediction output, and evaluation. The code is readily reusable within this project.
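
For orientation, here is a minimal sketch of that workflow as shell commands. Only `dbgpt_hub/scripts/export_merge.sh` and `dbgpt_hub/eval/evaluation.py` actually appear in this document; the data-processing, training, and prediction entry points shown are assumed names, not confirmed paths.

```bash
# Hypothetical end-to-end run of the workflow described above.
# Entry points marked "assumed" are illustrative; check the repository for the real scripts.
python dbgpt_hub/data_process/sql_data_process.py   # assumed: build SFT training data from Spider
sh dbgpt_hub/scripts/train_sft.sh                   # assumed: SFT (LoRA/QLoRA) fine-tuning
sh dbgpt_hub/scripts/export_merge.sh                # merge fine-tuned adapter weights into the base model
sh dbgpt_hub/scripts/predict_sft.sh                 # assumed: generate SQL predictions for the eval set
python dbgpt_hub/eval/evaluation.py --plug_value --input Your_model_pred_file   # score the predictions
```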

As of October 10, 2023, a 13-billion-parameter open-source model fine-tuned with this project has achieved **execution accuracy on the Spider evaluation dataset that surpasses GPT-4!**

Part of the experimental results have been compiled into the [document](docs/eval_llm_result.md) in this project. By combining this project with more related data, execution accuracy on the Spider evaluation set has already reached **0.825**.

## 2. Fine-tuning Text-to-SQL

@@ -204,7 +206,7 @@ Run the following command:
```bash
python dbgpt_hub/eval/evaluation.py --plug_value --input Your_model_pred_file
```
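
For example, assuming the predictions were written to `dbgpt_hub/output/pred/pred_sql.sql` (a hypothetical path, not one named in this document), the call would look like:

```bash
# Hypothetical invocation; substitute the path of your own prediction file.
python dbgpt_hub/eval/evaluation.py --plug_value --input dbgpt_hub/output/pred/pred_sql.sql
```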
You can find the results of our latest evaluation, along with part of the experiment results, [here](docs/eval_llm_result.md)

## 4. RoadMap

5 changes: 3 additions & 2 deletions README.zh.md
@@ -23,7 +23,8 @@

DB-GPT-Hub is an experimental project that uses LLMs to implement Text-to-SQL parsing. It mainly covers dataset collection, data preprocessing, model selection and construction, and fine-tuning of model weights. Through this pipeline, Text-to-SQL capability can be improved while the cost of model training is reduced, letting more developers take part in raising Text-to-SQL accuracy and ultimately enabling automatic question answering over databases, so that users can complete complex database queries through natural-language descriptions.
We have already built, on top of multiple large models, the complete pipeline of data processing, model SFT training, prediction output, and evaluation, **and the code can be reused directly in this project**.
As of October 10, 2023, a 13B open-source model fine-tuned with this project has achieved execution accuracy on the Spider evaluation set that **has already surpassed GPT-4!**
Part of the experimental results have been compiled into this project's [document](docs/eval_llm_result.md). By combining this project with more related data, execution accuracy on the Spider evaluation set has already reached **0.825**.

## 二、Text-to-SQL微调

@@ -190,7 +191,7 @@ sh ./dbgpt_hub/scripts/export_merge.sh
```bash
python dbgpt_hub/eval/evaluation.py --plug_value --input Your_model_pred_file
```
You can find our latest evaluation and experiment results [here](docs/eval_llm_result.md)

## 4. Roadmap

We will divide the whole process into three stages:
9 changes: 5 additions & 4 deletions docs/eval_llm_result.md
@@ -7,13 +7,14 @@ This doc aims to summarize the performance of publicly available big language mo
| ------------------------------ | ------------------ | ---------------------------------------------------------------------------------- |
| **GPT-4** | **0.762** | [numbersstation-eval-res](https://www.numbersstation.ai/post/nsql-llama-2-7b) |
| ChatGPT | 0.728 | [numbersstation-eval-res](https://www.numbersstation.ai/post/nsql-llama-2-7b)|
| **CodeLlama-13b-Instruct-hf_lora** | **0.789** | SFT trained with this project, using only the Spider train dataset; evaluated the same way as in this project, with LoRA SFT. |
| **CodeLlama-13b-Instruct-hf_qlora** | **0.825** | SFT trained with this project on around 50 thousand text-to-SQL samples; evaluated the same way as in this project, with QLoRA SFT; the training set was filtered to exclude the Spider eval dataset. |
| CodeLlama-13b-Instruct-hf_qlora | 0.774 | SFT trained with this project, using only the Spider train dataset; evaluated the same way as in this project, with QLoRA (NF4, 4-bit) SFT. |
| wizardcoder | 0.610 | [text-to-sql-wizardcoder](https://github.com/cuplv/text-to-sql-wizardcoder/tree/main) |
| CodeLlama-13b-Instruct-hf | 0.556 | Evaluated in this project with the default parameters. |
| Baichuan2-13B-Chat | 0.392 | Evaluated in this project with the default parameters. |
| llama2_13b_hf | xxx | Run in this project with the default parameter set. |
| llama2_13b_hf_lora_best | 0.744 | SFT trained with this project, using only the Spider train dataset; evaluated the same way as in this project. |
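
For readers unfamiliar with the metric, the sketch below illustrates what execution accuracy measures on Spider: a predicted query counts as correct when it returns the same result set as the gold query on the target database. This is a conceptual illustration only, not the project's evaluation code; the database path and the queries are hypothetical examples assuming Spider's usual `database/<db_id>/<db_id>.sqlite` layout.

```bash
# Conceptual illustration of execution accuracy (not dbgpt_hub/eval/evaluation.py itself).
DB=spider/database/concert_singer/concert_singer.sqlite        # assumed Spider database layout
sqlite3 "$DB" "SELECT count(*) FROM singer"      > gold.out     # result of the gold query
sqlite3 "$DB" "SELECT count(name) FROM singer"   > pred.out     # result of a predicted query
diff -q gold.out pred.out && echo "execution match"             # equal results => counted as correct
```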



