Some fixes in Chapter 7 #75

Open · wants to merge 3 commits into base: `main`
33 changes: 18 additions & 15 deletions Chinese_Version/ch_7_Finetune/7_1_Finetune_Llama2-7B.md
@@ -45,6 +45,7 @@ source /opt/intel/oneapi/setvars.sh
For Intel GPUs, you should specifically set `optimize_model=False` in the `from_pretrained` function. Once you have obtained the low-precision model, move it to the device with `to('xpu')`.

```python
import torch
model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = "meta-llama/Llama-2-7b-hf",
load_in_low_bit="nf4",
optimize_model=False,
@@ -89,34 +90,34 @@ model = get_peft_model(model, config)
>
> More explanation of the `LoraConfig` parameters can be found in the [Transformer LoRA Guides](https://huggingface.co/docs/peft/conceptual_guides/lora#common-lora-parameters-in-peft).
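
For context, the `model = get_peft_model(model, config)` call shown in the hunk header above is normally preceded by a `LoraConfig` definition. The sketch below is illustrative only: the `bigdl.llm.transformers.qlora` import path and all hyperparameter values are assumptions based on typical BigDL-LLM QLoRA examples, not part of this diff.

```python
# Illustrative sketch, not part of the diff: a typical QLoRA setup for Llama 2.
# The import path and all hyperparameter values here are assumptions.
from bigdl.llm.transformers.qlora import get_peft_model, prepare_model_for_kbit_training
from peft import LoraConfig

model = prepare_model_for_kbit_training(model)      # prepare the quantized model for training
config = LoraConfig(
    r=8,                                            # LoRA rank
    lora_alpha=32,                                  # scaling factor
    target_modules=["q_proj", "k_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)               # wrap the base model with LoRA adapters
```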

### 7.1.2.3 Load Dataset
### 7.1.2.3 Load Tokenizer

We load the common dataset [english quotes](https://huggingface.co/datasets/Abirate/english_quotes) to fine-tune our model on famous English quotes.
A tokenizer enables the tokenization and detokenization processes in LLM training and inference. You can use the [Huggingface Transformers](https://huggingface.co/docs/transformers/index) API to load the tokenizer required for LLM inference; it works seamlessly with models loaded by BigDL-LLM. For Llama 2, the corresponding tokenizer class is `LlamaTokenizer`.

```python
from datasets import load_dataset
data = load_dataset("Abirate/english_quotes")
data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
from transformers import LlamaTokenizer
tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
tokenizer.pad_token_id = 0
tokenizer.padding_side = "left"
```

> **Note**
>
> If you have already downloaded the `.jsonl` file from [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes/blob/main/quotes.jsonl), you can use `data = load_dataset("json", data_files="path/to/your/.jsonl/file")` to specify the local path instead of loading from the Huggingface repo id with `data = load_dataset("Abirate/english_quotes")`.
> If you have already downloaded the Llama 2 (7B) model, you can specify `pretrained_model_name_or_path` as the local model path.

### 7.1.2.4 Load Tokenizer
### 7.1.2.4 Load Dataset

A tokenizer enables the tokenization and detokenization processes in LLM training and inference. You can use the [Huggingface Transformers](https://huggingface.co/docs/transformers/index) API to load the tokenizer required for LLM inference; it works seamlessly with models loaded by BigDL-LLM. For Llama 2, the corresponding tokenizer class is `LlamaTokenizer`.
We load the common dataset [english quotes](https://huggingface.co/datasets/Abirate/english_quotes) to fine-tune our model on famous English quotes.

```python
from transformers import LlamaTokenizer
tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
tokenizer.pad_token_id = 0
tokenizer.padding_side = "left"
from datasets import load_dataset
data = load_dataset("Abirate/english_quotes")
data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
```

> **Note**
>
> If you have already downloaded the Llama 2 (7B) model, you can specify `pretrained_model_name_or_path` as the local model path.
> If you have already downloaded the `.jsonl` file from [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes/blob/main/quotes.jsonl), you can use `data = load_dataset("json", data_files="path/to/your/.jsonl/file")` to specify the local path instead of loading from the Huggingface repo id with `data = load_dataset("Abirate/english_quotes")`.
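
As a quick, optional check (not part of the diff), you can inspect one mapped record to confirm that the tokenizer added `input_ids` alongside the original `quote` field; the field names follow from the `load_dataset` and `map` calls above.

```python
# Optional check, assuming the dataset has been mapped with the tokenizer as above.
sample = data["train"][0]
print(sample["quote"])           # the original quote text
print(sample["input_ids"][:10])  # first ten token ids produced by the tokenizer
```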

### 7.1.2.5 Run the Training

@@ -175,8 +176,10 @@ result = trainer.train()
### 7.1.3.1 Load the Pre-trained Model

```python
from bigdl.llm.transformers import AutoModelForCausalLM
base_model_path = "meta-llama/Llama-2-7b-hf"
base_model = AutoModelForCausalLM.from_pretrained(
base_model,
base_model_path,
torch_dtype=torch.float16,
device_map={"": "cpu"},
)
@@ -234,7 +237,7 @@ Using pad_token, but it is not set yet.
Finally, we can save the merged model to a specified local path (in our case, `./outputs/checkpoint-200-merged`).

```python
output_path = ./outputs/checkpoint-200-merged
output_path = "./outputs/checkpoint-200-merged"
lora_model_sd = lora_model.state_dict()
deloreanized_sd = {
k.replace("base_model.model.", ""): v
34 changes: 18 additions & 16 deletions ch_7_Finetune/7_1_Finetune_Llama2-7B.md
@@ -46,6 +46,7 @@ With BigDL-LLM optimization, you can load the model with `bigdl.llm.transformers
For Intel GPUs, once you have the model in low precision, **move it to the device with `to('xpu')`**.

```python
import torch
from bigdl.llm.transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = "meta-llama/Llama-2-7b-hf",
load_in_low_bit="nf4",
@@ -93,7 +94,20 @@ model = get_peft_model(model, config)
> More explanation about `LoraConfig` parameters can be found in [Transformer LoRA Guides](https://huggingface.co/docs/peft/conceptual_guides/lora#common-lora-parameters-in-peft).
>

### 7.1.2.3 Load Dataset
### 7.1.2.3 Load Tokenizer
A tokenizer enables the tokenization and detokenization processes in LLM training and inference. You can use the [Huggingface transformers](https://huggingface.co/docs/transformers/index) API to load the tokenizer directly. It can be used seamlessly with models loaded by BigDL-LLM. For Llama 2, the corresponding tokenizer class is `LlamaTokenizer`.

```python
from transformers import LlamaTokenizer
tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
tokenizer.pad_token_id = 0
tokenizer.padding_side = "left"
```
> **Note**
>
> If you have already downloaded the Llama 2 (7B) model, you could set `pretrained_model_name_or_path` to the local model path.
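
As an optional sanity check (not part of the original tutorial), the snippet below assumes the tokenizer configured above and simply confirms that left padding with `pad_token_id = 0` behaves as expected on a small batch; the two example quotes are illustrative.

```python
# Optional check, assuming the tokenizer configured above: confirm left padding.
batch = tokenizer(
    ["Be yourself; everyone else is already taken.", "So many books, so little time."],
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)   # (2, longest_sequence_length_in_batch)
print(batch["input_ids"])         # the shorter quote is left-padded with id 0
```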

### 7.1.2.4 Load Dataset

A common dataset, [english quotes](https://huggingface.co/datasets/Abirate/english_quotes), is loaded to fine-tune our model on famous quotes.
```python
@@ -107,19 +121,6 @@ data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
> The dataset path here defaults to the Huggingface repo id.
> If you have already downloaded the `.jsonl` file from [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes/blob/main/quotes.jsonl), you could use `data = load_dataset("json", data_files= "path/to/your/.jsonl/file")` to specify the local path instead of `data = load_dataset("Abirate/english_quotes")`.

### 7.1.2.4 Load Tokenizer
A tokenizer enables the tokenization and detokenization processes in LLM training and inference. You can use the [Huggingface transformers](https://huggingface.co/docs/transformers/index) API to load the tokenizer directly. It can be used seamlessly with models loaded by BigDL-LLM. For Llama 2, the corresponding tokenizer class is `LlamaTokenizer`.

```python
from transformers import LlamaTokenizer
tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
tokenizer.pad_token_id = 0
tokenizer.padding_side = "left"
```
> **Note**
>
> If you have already downloaded the Llama 2 (7B) model, you could set `pretrained_model_name_or_path` to the local model path.

### 7.1.2.5 Run the Training

You can then start the training process by setting up the `trainer` with existing tools in the HF ecosystem. Here we set `warmup_steps` to 20 to accelerate the training process.
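
The `TrainingArguments` themselves are collapsed in this diff. The sketch below shows one typical way to wire up the `Trainer`; apart from `warmup_steps=20`, which is stated above, every value here is an illustrative assumption rather than the tutorial's exact setting.

```python
# Illustrative sketch; hyperparameters other than warmup_steps are assumptions.
import transformers

trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=20,        # stated in the tutorial text above
        max_steps=200,          # assumed; consistent with the checkpoint-200 path used later
        learning_rate=2e-4,
        output_dir="outputs",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # avoid a warning while gradient checkpointing is enabled
result = trainer.train()
```
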
@@ -177,8 +178,9 @@ After finetuning the model, you could merge the QLoRA weights back into the base

```python
from bigdl.llm.transformers import AutoModelForCausalLM
base_model_path = "meta-llama/Llama-2-7b-hf"
base_model = AutoModelForCausalLM.from_pretrained(
base_model,
base_model_path,
torch_dtype=torch.float16,
device_map={"": "cpu"},
)
@@ -238,7 +240,7 @@ Using pad_token, but it is not set yet.
```
Finally, we can save the fine-tuned model to a specified local path (in our case, `./outputs/checkpoint-200-merged`).
```python
output_path = ./outputs/checkpoint-200-merged
output_path = "./outputs/checkpoint-200-merged"
lora_model_sd = lora_model.state_dict()
deloreanized_sd = {
k.replace("base_model.model.", ""): v
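for k, v in lora_model_sd.items()
if "lora" not in k
}
# NOTE: the three lines just above and the save call below are an illustrative
# completion of the lines collapsed in this diff, based on common LoRA-merge
# export scripts (an assumption, not the file's exact contents): keep only the
# non-LoRA tensors and write the merged weights to the output path.
base_model.save_pretrained(output_path, state_dict=deloreanized_sd)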