
python sftune, qmerge and dpo scripts with unsloth

Mistral 7B chat fine tuning

SFT with unsloth

git clone git@github.com:toranb/sloth.git
cd sloth
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
## add data.json with instruction, output pairs for supervised fine tune
python3 sftune.py
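sftune.py reads data.json with instruction, output pairs. The exact schema the script expects is an assumption here, but a minimal file matching the description above might be created like this:

```python
import json

# Hypothetical example records: the "instruction" and "output" field
# names follow the description above; the exact schema sftune.py
# expects may differ.
records = [
    {"instruction": "What is the capital of France?", "output": "Paris"},
    {"instruction": "Name a primary color.", "output": "Blue"},
]

# Write the supervised fine tune dataset next to sftune.py
with open("data.json", "w") as f:
    json.dump(records, f, indent=2)
```

For a real run you would replace these toy records with your own 21k-style instruction, output dataset.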

Merge from checkpoint (optional)

This command merges a given checkpoint into the base model, creating a new model directory

rm -rf model
python3 zmerge.py --peft /home/toranb/sloth/workspace/checkpoint-2600
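zmerge.py takes the checkpoint path via a --peft flag. A minimal sketch of that argument handling is below; the merge itself (loading the base model, applying the PEFT adapter, and calling merge_and_unload) is omitted, and the flag name is taken from the command above:

```python
import argparse

# Sketch of the CLI surface zmerge.py exposes; only the --peft flag
# shown in the command above is assumed here.
parser = argparse.ArgumentParser(
    description="merge a PEFT checkpoint into the base model"
)
parser.add_argument("--peft", required=True, help="path to the checkpoint directory")

# Parse an explicit argv list so the sketch is self-contained.
args = parser.parse_args(["--peft", "/home/toranb/sloth/workspace/checkpoint-2600"])
print(args.peft)
```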

DPO alignment (optional)

mkdir fin
export DPO=/home/toranb/sloth/model
export JSON=/home/toranb/sloth/dpo.json
export OUTPUTDIR=/home/toranb/sloth/fin
## add dpo.json with prompt, chosen, rejected
python3 dpo.py --base $DPO --out $OUTPUTDIR --json $JSON
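dpo.py reads dpo.json with prompt, chosen, rejected fields. The exact schema is an assumption, but a minimal preference file matching that description might look like this:

```python
import json

# Hypothetical preference records; the field names follow the comment
# above, though the exact schema dpo.py expects may differ.
pairs = [
    {
        "prompt": "Explain recursion in one sentence.",
        "chosen": "Recursion is when a function calls itself on smaller subproblems.",
        "rejected": "Recursion is a loop.",
    },
]

with open("dpo.json", "w") as f:
    json.dump(pairs, f, indent=2)
```

Each record pairs one preferred completion with one rejected completion for the same prompt, which is the shape DPO training consumes.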

Dataset note

I'm having success with this SFT configuration using a dataset of 21k instruction, output pairs totaling roughly 3 million tokens. The 21k examples combine 10k drawn from a subset of airoboros with 11k from a proprietary dataset.
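As a quick sanity check on example length, the numbers above work out to a short average completion per pair:

```python
# Back-of-the-envelope arithmetic using the dataset note above.
pairs = 21_000
total_tokens = 3_000_000

avg_tokens_per_pair = total_tokens / pairs
print(round(avg_tokens_per_pair))  # roughly 143 tokens per pair
```

That average is a useful reference point when sizing max_seq_length for your own dataset.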

Installation note

Ideally pip install would work from the requirements.txt included here, but in practice it rarely does, so skip that file and start from the unsloth installation instructions to be sure you have a solid installation.

As of April 2024, flash-attn has an installer problem, so I'm pinning the version shown below to work around it, like so

python3 -m venv env
source env/bin/activate
pip install --upgrade pip
pip install scipy trl xformers wandb ninja einops peft accelerate bitsandbytes
pip install flash-attn==2.5.8
pip install "unsloth[cu118-ampere-torch211] @ git+https://github.com/unslothai/unsloth.git"
