
Maybe memory leak occurs after evaluation when using enable_liger_kernel. #6085

Open
1 task done
upskyy opened this issue Nov 20, 2024 · 0 comments
Labels
pending This problem is yet to be addressed

Comments


upskyy commented Nov 20, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

llamafactory==0.7.2.dev0
transformers==4.46.1
python==3.10.14

Reproduction

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml  # or gemma
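For context, the Liger kernel is switched on in the training YAML itself. A minimal sketch of the relevant portion of such a config (field values are illustrative and follow the LLaMA-Factory example configs from memory; only `enable_liger_kernel` is the option under discussion here):

```yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # or a Gemma2 checkpoint
enable_liger_kernel: true   # the option that cuts training memory

### method
stage: sft
do_train: true
finetuning_type: lora

### eval -- evaluation is where the reported memory jump appears
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
```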

Expected behavior

Thank you for sharing such an amazing project, @hiyouga.

When I enabled `enable_liger_kernel: true` for training, the training memory usage of the Gemma2 model dropped from around 60 GiB to 7 GiB.

However, after the evaluation step ran, memory usage jumped to 60 GiB, and even after training resumed it did not return to the previous level, staying at 60 GiB instead. It seems like there might be a memory leak somewhere.
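One way to narrow down whether something allocated during evaluation is being kept alive is the generic snapshot-diff pattern: measure allocated memory before the step, run it, force garbage collection, and measure again. The sketch below uses the stdlib `tracemalloc` so it stays dependency-free; for the GPU case one would apply the same before/after pattern with `torch.cuda.memory_allocated()` around the trainer's evaluation call. The `leaky_eval` / `clean_eval` functions are hypothetical stand-ins, not LLaMA-Factory code.

```python
import gc
import tracemalloc


def measure_retained(fn):
    """Run fn() and return roughly how many bytes its leftovers still hold.

    Snapshot allocations before and after, with a gc pass in between, so
    anything still referenced after the call shows up as retained memory.
    """
    gc.collect()
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    fn()
    gc.collect()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


_cache = []  # stands in for a module-level cache that outlives the call


def leaky_eval():
    # Simulates an evaluation step whose buffer survives in a global cache.
    _cache.append(bytearray(1_000_000))


def clean_eval():
    # Simulates an evaluation step whose buffer is freed on return.
    _ = bytearray(1_000_000)


retained_leaky = measure_retained(leaky_eval)   # ~1 MB stays referenced
retained_clean = measure_retained(clean_eval)   # allocation is released
```

If the retained figure stays high across repeated evaluation calls (here, `retained_leaky` is large while `retained_clean` is near zero), the leak is in objects that survive the call, not in transient peak usage.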

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Nov 20, 2024
@upskyy upskyy changed the title Memory leak occurs during evaluation when using enable_liger_kernel. Maybe memory leak occurs after evaluation when using enable_liger_kernel. Nov 20, 2024