diff --git a/Chinese_Version/ch_7_Finetune/7_1_Finetune_Llama2-7B.md b/Chinese_Version/ch_7_Finetune/7_1_Finetune_Llama2-7B.md
index 4509db3..1fca089 100644
--- a/Chinese_Version/ch_7_Finetune/7_1_Finetune_Llama2-7B.md
+++ b/Chinese_Version/ch_7_Finetune/7_1_Finetune_Llama2-7B.md
@@ -45,6 +45,7 @@ source /opt/intel/oneapi/setvars.sh
 对于英特尔 GPU,您应在`from_pretrained`函数中特别设置 `optimize_model=False`。一旦获得低精度模型,请将其设置为`to('xpu')`。
 
 ```python
+import torch
 model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = "meta-llama/Llama-2-7b-hf",
                                              load_in_low_bit="nf4",
                                              optimize_model=False,
@@ -89,34 +90,34 @@ model = get_peft_model(model, config)
 >
 > 有关 LoraConfig 参数的更多说明可以在 [Transformer LoRA 指南](https://huggingface.co/docs/peft/conceptual_guides/lora#common-lora-parameters-in-peft)中查看。
 
-### 7.1.2.3 加载数据集
+### 7.1.2.3 加载Tokenizer
 
-我们加载通用数据集 [english quotes](https://huggingface.co/datasets/Abirate/english_quotes) 来根据英语名言来微调我们的模型。
+分词器可以在 LLM 训练和推理中实现分词和去分词过程。您可以使用 [Huggingface Transformers](https://huggingface.co/docs/transformers/index) API来加载 LLM 推理需要的分词器,它可以与 BigDL-LLM 加载的模型无缝配合使用。对于Llama 2,对应的tokenizer类为`LlamaTokenizer`。
 
 ```python
-from datasets import load_dataset
-data = load_dataset("Abirate/english_quotes")
-data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
+from transformers import LlamaTokenizer
+tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
+tokenizer.pad_token_id = 0
+tokenizer.padding_side = "left"
 ```
 
 > **注意**
 >
-> 如果您已经从 [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes/blob/main/quotes.jsonl) 下载了 `.jsonl` 文件,您可以使用 `data = load_dataset( "json", data_files= "path/to/your/.jsonl/file")` 指定本地路径,以替代从 huggingface repo id 的加载方法 `data = load_dataset("Abirate/english_quotes")`。
+> 如果您已经下载了 Llama 2 (7B) 模型,您可以将 `pretrained_model_name_or_path` 指定为本地模型路径。
 
-### 7.1.2.4 加载Tokenizer
+### 7.1.2.4 加载数据集
 
-分词器可以在 LLM 训练和推理中实现分词和去分词过程。您可以使用 [Huggingface Transformers](https://huggingface.co/docs/transformers/index) API来加载 LLM 推理需要的分词器,它可以与 BigDL-LLM 加载的模型无缝配合使用。对于Llama 2,对应的tokenizer类为`LlamaTokenizer`。
+我们加载通用数据集 [english quotes](https://huggingface.co/datasets/Abirate/english_quotes) 来根据英语名言来微调我们的模型。
 
 ```python
-from transformers import LlamaTokenizer
-tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
-tokenizer.pad_token_id = 0
-tokenizer.padding_side = "left"
+from datasets import load_dataset
+data = load_dataset("Abirate/english_quotes")
+data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
 ```
 
 > **注意**
 >
-> 如果您已经下载了 Llama 2 (7B) 模型,您可以将 `pretrained_model_name_or_path` 指定为本地模型路径。
+> 如果您已经从 [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes/blob/main/quotes.jsonl) 下载了 `.jsonl` 文件,您可以使用 `data = load_dataset( "json", data_files= "path/to/your/.jsonl/file")` 指定本地路径,以替代从 huggingface repo id 的加载方法 `data = load_dataset("Abirate/english_quotes")`。
 
 ### 7.1.2.5 进行训练
 
@@ -175,8 +176,10 @@ result = trainer.train()
 ### 7.1.3.1 加载预训练模型
 
 ```python
+from bigdl.llm.transformers import AutoModelForCausalLM
+base_model_path = "meta-llama/Llama-2-7b-hf"
 base_model = AutoModelForCausalLM.from_pretrained(
-    base_model,
+    base_model_path,
     torch_dtype=torch.float16,
     device_map={"": "cpu"},
 )
@@ -234,7 +237,7 @@ Using pad_token, but it is not set yet.
 ```
 最后，我们可以将合并的模型保存在指定的本地路径中(在我们的例子中是`./outputs/checkpoint-200-merged`)。
 ```python
-output_path = ./outputs/checkpoint-200-merged
+output_path = "./outputs/checkpoint-200-merged"
 lora_model_sd = lora_model.state_dict()
 deloreanized_sd = {
     k.replace("base_model.model.", ""): v
diff --git a/ch_7_Finetune/7_1_Finetune_Llama2-7B.md b/ch_7_Finetune/7_1_Finetune_Llama2-7B.md
index faf6377..5c4ee5d 100644
--- a/ch_7_Finetune/7_1_Finetune_Llama2-7B.md
+++ b/ch_7_Finetune/7_1_Finetune_Llama2-7B.md
@@ -46,6 +46,7 @@ With BigDL-LLM optimization, you can load the model with `bigdl.llm.transformers
 For Intel GPUs, once you have the model in low precision, **set it to `to('xpu')`**.
 
 ```python
+import torch
 from bigdl.llm.transformers import AutoModelForCausalLM
 model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = "meta-llama/Llama-2-7b-hf",
                                              load_in_low_bit="nf4",
@@ -93,7 +94,20 @@ model = get_peft_model(model, config)
 > More explanation about `LoraConfig` parameters can be found in [Transformer LoRA Guides](https://huggingface.co/docs/peft/conceptual_guides/lora#common-lora-parameters-in-peft).
 >
 
-### 7.1.2.3 Load Dataset
+### 7.1.2.3 Load Tokenizer
+A tokenizer enables tokenizing and detokenizing process in LLM training and inference. You can use [Huggingface transformers](https://huggingface.co/docs/transformers/index) API to load the tokenizer directly. It can be used seamlessly with models loaded by BigDL-LLM. For Llama 2, the corresponding tokenizer class is `LlamaTokenizer`.
+
+```python
+from transformers import LlamaTokenizer
+tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
+tokenizer.pad_token_id = 0
+tokenizer.padding_side = "left"
+```
+> **Note**
+>
+> If you have already downloaded the Llama 2 (7B) model, you could specify `pretrained_model_name_or_path` to the local model path.
+
+### 7.1.2.4 Load Dataset
 A common dataset, [english quotes](https://huggingface.co/datasets/Abirate/english_quotes), is loaded to fine tune our model on famous quotes.
 
 ```python
@@ -107,19 +121,6 @@ data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)
 > The dataset path here is default to be Huggingface repo id.
 > If you have already downloaded the `.jsonl` file from [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes/blob/main/quotes.jsonl), you could use `data = load_dataset("json", data_files= "path/to/your/.jsonl/file")` to specify the local path instead of `data = load_dataset("Abirate/english_quotes")`.
 
-### 7.1.2.4 Load Tokenizer
-A tokenizer enables tokenizing and detokenizing process in LLM training and inference. You can use [Huggingface transformers](https://huggingface.co/docs/transformers/index) API to load the tokenizer directly. It can be used seamlessly with models loaded by BigDL-LLM. For Llama 2, the corresponding tokenizer class is `LlamaTokenizer`.
-
-```python
-from transformers import LlamaTokenizer
-tokenizer = LlamaTokenizer.from_pretrained(pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf", trust_remote_code=True)
-tokenizer.pad_token_id = 0
-tokenizer.padding_side = "left"
-```
-> **Note**
->
-> If you have already downloaded the Llama 2 (7B) model, you could specify `pretrained_model_name_or_path` to the local model path.
-
 ### 7.1.2.5 Run the Training
 You can then start the training process by setting the `trainer` with existing tools on the HF ecosystem.
 Here we set `warmup_steps` to be 20 to accelerate the process of training.
@@ -177,8 +178,9 @@ After finetuning the model, you could merge the QLoRA weights back into the base
 
 ```python
 from bigdl.llm.transformers import AutoModelForCausalLM
+base_model_path = "meta-llama/Llama-2-7b-hf"
 base_model = AutoModelForCausalLM.from_pretrained(
-    base_model,
+    base_model_path,
     torch_dtype=torch.float16,
     device_map={"": "cpu"},
 )
@@ -238,7 +240,7 @@ Using pad_token, but it is not set yet.
 ```
 Finally we can save the fine-tuned model in a specified local path (in our case is `./outputs/checkpoint-200-merged`).
 ```python
-output_path = ./outputs/checkpoint-200-merged
+output_path = "./outputs/checkpoint-200-merged"
 lora_model_sd = lora_model.state_dict()
 deloreanized_sd = {
     k.replace("base_model.model.", ""): v