This document provides instructions for supervised finetuning (SFT) of Yuan2.0.
An example script to run Yuan-102B SFT is:

```bash
bash examples/pretrain_yuan2.0_102B_sft.sh
```
Before running the script, the relevant arguments should be set correctly.
Firstly, make any desired modifications, including setting the environment variables `CHECKPOINT_PATH`, `DATA_PATH`, `TOKENIZER_MODEL_PATH`, and `TENSORBOARD_PATH`.
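For example, the variables at the top of the script could be pointed at your own files like this (all paths below are placeholders, not defaults shipped with the repo):

```bash
# Placeholder paths -- substitute the locations on your own system.
CHECKPOINT_PATH=/workspace/checkpoints/yuan2.0-102b
DATA_PATH='1 /workspace/data/sft_dataset'
TOKENIZER_MODEL_PATH=/workspace/tokenizer/tokenizer.model
TENSORBOARD_PATH=/workspace/logs/tensorboard
```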
The following arguments are relevant for SFT:

- `--train-reset` allows you to begin your training iterations from 0.
- `--sft-stage` is highly recommended to be enabled, since it controls the calculation of the loss mask during SFT.
- `--override-opt-param-scheduler` allows you to set your own scheduler instead of the one stored in the checkpoint.
- `--finetune` loads the model for finetuning: the optimizer and RNG state are not loaded from the checkpoint, and the iteration count is set to 0. This is assumed when loading a release checkpoint.
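To confirm which of these flags the example script already sets (assuming it passes them literally on its command line), a quick check is:

```bash
# List the SFT-related flags that appear in the example script.
grep -E -- '--(train-reset|sft-stage|override-opt-param-scheduler|finetune)' \
    examples/pretrain_yuan2.0_102B_sft.sh
```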
If the dataset path is `/path/dataset.bin`, the `DATA_PATH` can be set as:

```bash
DATA_PATH='1 /path/dataset'
```
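Here the leading `1` is the dataset's sampling weight, and the `.bin` extension is dropped because the loader works from the common prefix shared by `dataset.bin` and `dataset.idx`. As a minimal sketch, assuming the Megatron-style weighted blending of multiple datasets is also supported here (the weights and paths below are hypothetical):

```bash
# Hypothetical blend of two datasets sampled with relative weights 0.7
# and 0.3; assumes Megatron-style weighted blending is supported.
DATA_PATH='0.7 /path/dataset_a 0.3 /path/dataset_b'
```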
For dataset preprocessing, please refer to the documentation.
Further command-line arguments are described in the source file `arguments.py` and in `README.md`.