docs:add about chatglm3 and the predict description
wangzaistone committed Dec 7, 2023
1 parent f26c22a commit 69e8275
Showing 2 changed files with 15 additions and 6 deletions.
12 changes: 9 additions & 3 deletions README.md
@@ -45,8 +45,9 @@
- [4. RoadMap](#4-roadmap)
- [5. Contributions](#5-contributions)
- [6. Acknowledgements](#6-acknowledgements)
- [7、Licence](#7licence)
- [8、Contact Information](#8contact-information)
- [7. Citation](#7-citation)
- [8. Licence](#8-licence)
- [9. Contact Information](#9-contact-information)

## 1. What is DB-GPT-Hub

@@ -89,9 +90,12 @@ DB-GPT-Hub currently supports the following base models:
- [x] Qwen
- [x] XVERSE
- [x] ChatGLM2
- [x] ChatGLM3
- [x] internlm

The model is fine-tuned with a quantization bit of 4 using Quantized Low-Rank Adaptation (QLoRA). The minimum hardware requirements are as follows:

| Model Parameters | GPU RAM | CPU RAM | DISK |
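The hardware table is truncated in this diff, but the weights-only footprint at a given quantization width can be sketched as a back-of-the-envelope helper (illustrative only, not code from the repo; real GPU RAM usage is higher because of activations, the KV cache, and LoRA adapter state):

```python
def quantized_weight_gib(n_params: float, bits: int = 4) -> float:
    """Rough memory footprint (GiB) of the model weights alone at a given bit width."""
    return n_params * bits / 8 / (1024 ** 3)

# A 7B-parameter model at 4 bits needs roughly 3.26 GiB for the weights alone,
# versus about 13 GiB at 16-bit precision.
print(round(quantized_weight_gib(7e9), 2))
print(round(quantized_weight_gib(7e9, bits=16), 2))
```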
@@ -181,6 +185,7 @@ In the script, during fine-tuning, different models correspond to key parameters
| [Qwen](https://github.com/QwenLM/Qwen-7B) | c_attn | chatml |
| [XVERSE](https://github.com/xverse-ai/XVERSE-13B) | q_proj,v_proj | xverse |
| [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) | query_key_value | chatglm2 |
| [ChatGLM3](https://github.com/THUDM/ChatGLM3-6B) | query_key_value | chatglm3 |
| [LLaMA](https://github.com/facebookresearch/llama) | q_proj,v_proj | - |
| [BLOOM](https://huggingface.co/bigscience/bloom) | query_key_value | - |
| [BLOOMZ](https://huggingface.co/bigscience/bloomz) | query_key_value | - |
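The rows above can be captured as a small lookup when wiring up a training script. The mapping values are transcribed from the table; the helper itself is a hypothetical illustration, not code from the repo:

```python
# Maps base model -> (lora_target modules, prompt template); None means the default template.
LORA_TARGETS = {
    "qwen": ("c_attn", "chatml"),
    "xverse": ("q_proj,v_proj", "xverse"),
    "chatglm2": ("query_key_value", "chatglm2"),
    "chatglm3": ("query_key_value", "chatglm3"),
    "llama": ("q_proj,v_proj", None),
    "bloom": ("query_key_value", None),
    "bloomz": ("query_key_value", None),
}

def lora_target_for(model: str) -> str:
    """Return the comma-separated lora_target string for a base model name."""
    target, _template = LORA_TARGETS[model.lower()]
    return target

print(lora_target_for("ChatGLM3"))  # query_key_value
```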
@@ -210,7 +215,7 @@ poetry run sh ./dbgpt_hub/scripts/predict_sft.sh
```

By default the script includes the `--quantization_bit` parameter, so prediction uses QLoRA; removing it switches to LoRA prediction.
The value of the parameter `--predicted_out_filename` is the file name of the model's predicted results, which can be found in the `dbgpt_hub/output/pred` directory.
The parameter `--predicted_input_filename` specifies the test dataset file to run predictions on, and `--predicted_out_filename` is the file name for the model's predicted results, which can be found in the `dbgpt_hub/output/pred` directory.
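For concreteness, a QLoRA prediction run might look like the following. The flag names come from this README; the input and output file names are placeholders, and the actual defaults live inside `predict_sft.sh`:

```shell
# Run prediction with 4-bit QLoRA; remove --quantization_bit inside the
# script to fall back to plain LoRA prediction.
poetry run sh ./dbgpt_hub/scripts/predict_sft.sh
# Inside the script, the key parameters look roughly like:
#   --quantization_bit 4 \
#   --predicted_input_filename <your_test_dataset>.json \
#   --predicted_out_filename pred_sql.sql
```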

### 3.5 Model Weights
You can find the corresponding model weights on Hugging Face: [hg-eosphoros-ai
@@ -250,6 +255,7 @@ We divide the whole process into three phases:
- [x] Qwen
- [x] XVERSE
- [x] ChatGLM2
- [x] ChatGLM3
- [x] internlm

* Stage 2:
9 changes: 6 additions & 3 deletions README.zh.md
@@ -44,8 +44,9 @@
- [4. Roadmap](#四发展路线)
- [5. Contributions](#五贡献)
- [6. Acknowledgements](#六感谢)
- [7. Licence](#七licence)
- [8. Contact Information](#八contact-information)
- [7. Citation](#七引用)
- [8. Licence](#八licence)
- [9. Contact Information](#九我们的联系方式)

## 1. Introduction

@@ -84,6 +85,7 @@ DB-GPT-HUB currently supports the following base models:
- [x] Qwen
- [x] XVERSE
- [x] ChatGLM2
- [x] ChatGLM3
- [x] internlm
- [x] Falcon

@@ -200,7 +202,7 @@ deepspeed --num_gpus 2 dbgpt_hub/train/sft_train.py \
poetry run sh ./dbgpt_hub/scripts/predict_sft.sh
```
By default the script includes the `--quantization_bit` parameter for QLoRA prediction; removing it switches to LoRA prediction.
The parameter `--predicted_out_filename` is the file name of the model's predicted results, which can be found in the `dbgpt_hub/output/pred` directory.
The parameter `--predicted_input_filename` is the dataset file to run predictions on, and `--predicted_out_filename` is the file name of the model's predicted results. By default, the results are saved in the `dbgpt_hub/output/pred` directory.


### 3.5 Model Weights
@@ -237,6 +239,7 @@ poetry run python dbgpt_hub/eval/evaluation.py --plug_value --input Your_model_
- [x] Qwen
- [x] XVERSE
- [x] ChatGLM2
- [x] ChatGLM3
- [x] internlm

* Stage 2:
