diff --git a/README.md b/README.md
index 0b8dcaa..5887181 100644
--- a/README.md
+++ b/README.md
@@ -45,8 +45,9 @@
   - [4. RoadMap](#4-roadmap)
   - [5. Contributions](#5-contributions)
   - [6. Acknowledgements](#6-acknowledgements)
-  - [7、Licence](#7licence)
-  - [8、Contact Information](#8contact-information)
+  - [7. Citation](#7-citation)
+  - [8. Licence](#8-licence)
+  - [9. Contact Information](#9-contact-information)
 
 ## 1. What is DB-GPT-Hub
 
@@ -89,9 +90,12 @@ DB-GPT-Hub currently supports the following base models:
 
   - [x] Qwen
   - [x] XVERSE
   - [x] ChatGLM2
+  - [x] ChatGLM3
   - [x] internlm
+
+
 
 The model is fine-tuned with a quantization bit of 4 using Quantized Low-Rank Adaptation (QLoRA). The minimum hardware requirements are as follows:
 
 | Model Parameters | GPU RAM | CPU RAM | DISK |
@@ -181,6 +185,7 @@ In the script, during fine-tuning, different models correspond to key parameters
 | [Qwen](https://github.com/QwenLM/Qwen-7B)          | c_attn          | chatml   |
 | [XVERSE](https://github.com/xverse-ai/XVERSE-13B)  | q_proj,v_proj   | xverse   |
 | [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B)   | query_key_value | chatglm2 |
+| [ChatGLM3](https://github.com/THUDM/ChatGLM3-6B)   | query_key_value | chatglm3 |
 | [LLaMA](https://github.com/facebookresearch/llama) | q_proj,v_proj   | -        |
 | [BLOOM](https://huggingface.co/bigscience/bloom)   | query_key_value | -        |
 | [BLOOMZ](https://huggingface.co/bigscience/bloomz) | query_key_value | -        |
@@ -210,7 +215,7 @@
 poetry run sh ./dbgpt_hub/scripts/predict_sft.sh
 ```
 In the script, by default with the parameter `--quantization_bit`, it predicts using QLoRA. Removing it switches to the LoRA prediction method.
-The value of the parameter `--predicted_out_filename` is the file name of the model's predicted results, which can be found in the `dbgpt_hub/output/pred` directory.
+The parameter `--predicted_input_filename` specifies the test dataset file to predict on, and `--predicted_out_filename` is the file name for the model's prediction results, saved in the `dbgpt_hub/output/pred` directory by default.
 
 ### 3.5 Model Weights
 You can find the model weights of the second release on Huggingface [hg-eosphoros-ai
@@ -250,6 +255,7 @@ The whole process we will divide into three phases:
   - [x] Qwen
   - [x] XVERSE
   - [x] ChatGLM2
+  - [x] ChatGLM3
   - [x] internlm
 
 * Stage 2:
diff --git a/README.zh.md b/README.zh.md
index f29666f..4b49606 100644
--- a/README.zh.md
+++ b/README.zh.md
@@ -44,8 +44,9 @@
   - [4. Roadmap](#四发展路线)
   - [5. Contributions](#五贡献)
   - [6. Acknowledgements](#六感谢)
-  - [7. Licence](#七licence)
-  - [8. Contact Information](#八contact-information)
+  - [7. Citation](#七引用)
+  - [8. Licence](#八licence)
+  - [9. Our Contact Information](#九我们的联系方式)
 
 ## 1. Introduction
 
@@ -84,6 +85,7 @@ The base models currently supported by DB-GPT-HUB:
 
   - [x] Qwen
   - [x] XVERSE
   - [x] ChatGLM2
+  - [x] ChatGLM3
   - [x] internlm
   - [x] Falcon
@@ -200,7 +202,7 @@ deepspeed --num_gpus 2 dbgpt_hub/train/sft_train.py \
 poetry run sh ./dbgpt_hub/scripts/predict_sft.sh
 ```
 By default, the script includes the parameter `--quantization_bit` for QLoRA prediction; removing it switches to the LoRA prediction method.
-The value of the parameter `--predicted_out_filename` is the file name of the model's prediction results, which can be found in the `dbgpt_hub/output/pred` directory.
+The parameter `--predicted_input_filename` is the dataset file to predict on, and `--predicted_out_filename` is the file name of the model's prediction results, saved in the `dbgpt_hub/output/pred` directory by default.
 
 ### 3.5 Model Weights
 
@@ -237,6 +239,7 @@ poetry run python dbgpt_hub/eval/evaluation.py --plug_value --input Your_model_
   - [x] Qwen
   - [x] XVERSE
   - [x] ChatGLM2
+  - [x] ChatGLM3
   - [x] internlm
 
 * Stage 2:
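
For context, the new ChatGLM3 table row and the 4-bit QLoRA note above combine into a fine-tuning call along the following lines. This is a minimal sketch, not the repository's actual script: the path `dbgpt_hub/train/sft_train.py` and the flags `--quantization_bit`, `--lora_target`, and `--template` appear in this diff, while `--model_name_or_path`, `--dataset`, `--output_dir`, and the concrete values are illustrative assumptions.

```bash
# Hedged sketch: QLoRA (4-bit) fine-tuning of ChatGLM3 using the
# lora_target/template values from the table above. Flags other than
# --quantization_bit, --lora_target, and --template are assumed names
# and may not match the real sft_train.py interface.
poetry run python dbgpt_hub/train/sft_train.py \
    --model_name_or_path THUDM/chatglm3-6b \
    --dataset example_text2sql_train \
    --lora_target query_key_value \
    --template chatglm3 \
    --quantization_bit 4 \
    --output_dir dbgpt_hub/output/adapter/chatglm3-6b-qlora
```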
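
Similarly, the prediction parameters documented in both READMEs might be wired inside `./dbgpt_hub/scripts/predict_sft.sh` roughly as sketched below. Only the three flag names are taken from this diff; the Python entry point and the file names are hypothetical placeholders.

```bash
# Hedged sketch of the relevant lines inside predict_sft.sh.
# Keeping --quantization_bit selects QLoRA prediction; delete that line
# to fall back to plain LoRA. Per the READMEs, results land under
# dbgpt_hub/output/pred by default. Entry point and file names are assumed.
poetry run python dbgpt_hub/predict/predict.py \
    --quantization_bit 4 \
    --predicted_input_filename dbgpt_hub/data/example_dev.json \
    --predicted_out_filename pred_sql.sql
```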