
# Yuan2.0 Supervised Finetuning

## Introduction

This document provides instructions for supervised finetuning (SFT) of Yuan2.0.

## Usage

An example script to run Yuan2.0-102B SFT is:

```bash
bash examples/pretrain_yuan2.0_102B_sft.sh
```

## Argument settings

Before running the script, the relevant arguments should be set correctly.

First, make any desired modifications, including setting the environment variables `CHECKPOINT_PATH`, `DATA_PATH`, `TOKENIZER_MODEL_PATH`, and `TENSORBOARD_PATH`.
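A minimal sketch of this setup, assuming placeholder locations (every path below is hypothetical; substitute your own):

```bash
# Hypothetical paths -- replace with your own locations.
CHECKPOINT_PATH=/path/to/yuan2.0-102b/checkpoints
DATA_PATH='1 /path/dataset'              # weight/prefix format, explained below
TOKENIZER_MODEL_PATH=/path/to/tokenizer
TENSORBOARD_PATH=/path/to/tensorboard/logs
```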

The following flags matter for SFT (a sketch of how they combine follows this list):

- `--train-reset` lets you restart the training iteration count from 0.
- `--sft-stage` is highly recommended to be on, since it controls the calculation of the loss mask during SFT.
- `--override-opt-param-scheduler` allows you to set your own learning-rate scheduler.
- `--finetune` loads the model for finetuning: the optimizer and RNG state are not loaded from the checkpoint, and the iteration count is set to 0. This is assumed when loading a release checkpoint.
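As a rough sketch, these flags are appended to the training options inside the launch script, together with the path variables set above. The entry-point name and the path option names below are assumptions in the Megatron-LM style, not verbatim from Yuan2.0's script; check `examples/pretrain_yuan2.0_102B_sft.sh` for the exact form:

```bash
# Illustrative only: the SFT flags documented above, plus Megatron-style
# path options. Script and option names are assumptions -- verify them
# against the actual launch script.
SFT_FLAGS="--finetune --train-reset --sft-stage --override-opt-param-scheduler"

python pretrain_yuan.py $SFT_FLAGS \
    --load "$CHECKPOINT_PATH" \
    --data-path $DATA_PATH \
    --tensorboard-dir "$TENSORBOARD_PATH"
# $DATA_PATH is deliberately left unquoted so the weight/prefix pair
# splits into separate arguments.
```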

If the dataset path is:

```
/path/dataset.bin
```

then `DATA_PATH` can be set as:

```bash
DATA_PATH='1 /path/dataset'
```
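The leading `1` is the dataset's sampling weight, and the `.bin`/`.idx` extension is dropped from the prefix. Assuming Yuan2.0 follows the Megatron-LM convention of weight/prefix pairs, multiple datasets could be blended like this (weights and paths are hypothetical):

```bash
# Hypothetical blend: sample dataset_a and dataset_b at a 7:3 ratio.
DATA_PATH='0.7 /path/dataset_a 0.3 /path/dataset_b'
```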

For dataset preprocessing, please refer to the documentation.

Further command-line arguments are described in the source file `arguments.py` and in `README.md`.