
[Update] loader.py: evaluate will run separate evaluations on each eval_dataset #5522

Open: SrWYG wants to merge 1 commit into main

Conversation


SrWYG commented on Sep 24, 2024

If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation

Seq2SeqTrainer now supports eval_dataset as a Dict.
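
For illustration, a minimal sketch of the resulting usage (the model and dataset variables are placeholders, not code from this PR). With a dict, the Trainer runs one evaluation pass per entry and prefixes each metric with the dataset name:

    from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

    # Placeholder tokenized datasets; any datasets.Dataset works here.
    eval_sets = {
        "alpaca_en_demo": en_eval_set,
        "alpaca_zh_demo": zh_eval_set,
    }

    trainer = Seq2SeqTrainer(
        model=model,  # placeholder
        args=Seq2SeqTrainingArguments(output_dir="out", eval_strategy="steps", eval_steps=10),
        train_dataset=train_set,  # placeholder
        eval_dataset=eval_sets,   # dict of name -> dataset
    )
    trainer.train()
    # Metrics are logged per dataset, e.g. eval_alpaca_en_demo_loss
    # and eval_alpaca_zh_demo_loss.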

What does this PR do?

Fixes # (issue)

Before submitting

  • [ ✅ ] Did you read the contributor guideline?
  • [ ✅ ] Did you write any new necessary tests?
    • I tested it with Alpaca-format data, stage sft, model Qwen2.5-7B-Instruct, on both a single GPU and 2 GPUs using FSDP.
    • The per-dataset losses are printed and logged to the TensorBoard run logs, where they can be filtered by _loss in the TensorBoard web UI.

hiyouga added the pending (This problem is yet to be addressed) label on Sep 24, 2024
if merge:
    return merge_dataset([data for _, data in datasets.items()], data_args, seed=training_args.seed)
else:
    return datasets
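
For context, a sketch of how this hunk might sit in loader.py (the surrounding function and the _load_single_dataset helper are assumptions based on LLaMA-Factory's data module, not the exact diff):

    # Sketch only: names follow LLaMA-Factory's loader.py, but the exact
    # signature in the PR may differ.
    def _get_merged_dataset(dataset_names, model_args, data_args, training_args, stage, merge=True):
        # Load each named dataset, keeping its name as the key.
        datasets = {name: _load_single_dataset(name, model_args, data_args, training_args)
                    for name in dataset_names}
        if merge:
            # Training path: concatenate everything into one dataset.
            return merge_dataset([data for _, data in datasets.items()], data_args, seed=training_args.seed)
        else:
            # Evaluation path: hand the dict straight to the trainer so
            # each entry is evaluated separately.
            return datasets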
A contributor commented on the diff:

Does a dict match the declared return type Optional[Union["Dataset", "IterableDataset"]]?
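
One way to keep the annotation honest (an assumed fix for illustration, not code from the PR) is to widen the return type to cover the dict case:

    from typing import Dict, Optional, Union

    from datasets import Dataset, IterableDataset

    # Widened alias covering both the merged and the per-dataset return shapes.
    EvalDatasets = Optional[Union[Dataset, IterableDataset, Dict[str, Union[Dataset, IterableDataset]]]]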

chengchengpei (Contributor) commented:

Can we add some tests? The change is pretty big.

SrWYG (Author) commented on Oct 12, 2024

Test case YAML:

### model
## Testing on the 72b model is more convincing than the 7b model
model_name_or_path: /data3/models/Qwen2.5-72B-Instruct

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: q_proj,v_proj

### dataset
dataset: identity,alpaca_en_demo
packing: true
dataset_dir: data/
template: qwen
cutoff_len: 8000
## for test
max_samples: 100
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/Qwen2.5-72B-Instruct/lora/test
logging_steps: 10
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 16
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
# bf16: true
fp16: true
ddp_timeout: 180000000
## It's OK with or without quantization
quantization_bit: 4
# neftune_noise_alpha: 5
lora_rank: 16
flash_attn: fa2


### eval
## You can specify val_size to split a validation set off the training set.
# val_size: 0.05
## Alternatively, specify eval_dataset to evaluate separately on each set
eval_dataset: alpaca_en_demo,alpaca_zh_demo
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 10

save_steps: 10

# low_cpu_mem_usage: False
# ddp_find_unused_parameters: False

You can test it on multiple GPUs with FSDP:

CUDA_VISIBLE_DEVICES=0,1 accelerate launch \
    --config_file examples/accelerate/fsdp_config.yaml \
    src/train.py examples/train_lora/qwen2_lora_sft.yaml

or on a single GPU with the 7B model:

CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/train_lora/qwen2_lora_sft.yaml

It's only a change to the eval dataset handling, so I didn't test more models.

The separate evaluation losses will be printed in the training log and shown in the TensorBoard web UI:

tensorboard --logdir=saves/Qwen2.5-72B-Instruct/lora/test/runs --bind_all --port=6006
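
To verify the per-dataset tags programmatically, a small sketch using TensorBoard's event reader (the run directory below assumes the output_dir from the YAML above):

    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    # Point at a specific run subdirectory under .../runs if there are several.
    ea = EventAccumulator("saves/Qwen2.5-72B-Instruct/lora/test/runs")
    ea.Reload()
    loss_tags = [tag for tag in ea.Tags()["scalars"] if "_loss" in tag]
    print(loss_tags)  # expected to include entries like eval_alpaca_en_demo_loss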
