High memory consumption without dataflow-based fusion #1762

Open
riccardofelluga opened this issue Feb 11, 2025 · 1 comment
Assignees
Labels
fusion logic thunderfx for things that could be applicable to the dynamo+thunder frontend

Comments

@riccardofelluga
Collaborator

riccardofelluga commented Feb 11, 2025

⏱️ Perf regression

Disabling the dataflow-based fusion logic introduced a substantial performance regression: peak memory usage increased in the models we are tracking.

To Reproduce

There are a few steps to follow:

pip install peft
pip uninstall bitsandbytes

pytest "thunder/benchmarks/targets.py::test_hf_transformers[mistralai/Mistral-Nemo-Base-2407-BS1-4096-PEFT-forward-thunderfx]" --benchmark-json "out.json"

cat out.json | grep max_allocated_memory_MB

git revert f2d715240555ca787fbdbba6f42c6f9d422ae0c3

pytest "thunder/benchmarks/targets.py::test_hf_transformers[mistralai/Mistral-Nemo-Base-2407-BS1-4096-PEFT-forward-thunderfx]" --benchmark-json "out.json"

cat out.json | grep max_allocated_memory_MB

This prints the peak memory consumption for each run. On an H100, it reports ~71GB before the revert and ~67GB after reverting.
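To compare the two runs without eyeballing grep output, something like the following sketch could help. It assumes each run's JSON was saved under a different name (the file names `out_before.json` and `out_after.json` are hypothetical; the steps above write both runs to `out.json`). Because the exact location of `max_allocated_memory_MB` inside the benchmark JSON is not stated here, the helper searches for the key at any nesting depth rather than assuming a path.

```python
import json
from typing import Any, Iterator

KEY = "max_allocated_memory_MB"


def find_values(node: Any, key: str = KEY) -> Iterator[float]:
    """Yield every value stored under `key`, at any nesting depth."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            else:
                yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def peak_memory_mb(path: str) -> float:
    """Return the largest `max_allocated_memory_MB` found in a JSON file."""
    with open(path) as f:
        values = list(find_values(json.load(f)))
    if not values:
        raise ValueError(f"{KEY!r} not found in {path}")
    return max(values)


def relative_increase(baseline_mb: float, regressed_mb: float) -> float:
    """Fractional increase of `regressed_mb` over `baseline_mb`."""
    return (regressed_mb - baseline_mb) / baseline_mb


if __name__ == "__main__":
    # Hypothetical file names: one JSON per run, saved separately.
    regressed = peak_memory_mb("out_before.json")  # current main
    baseline = peak_memory_mb("out_after.json")    # after the git revert
    print(f"regressed: {regressed:.0f} MB, baseline: {baseline:.0f} MB")
    print(f"increase: {relative_increase(baseline, regressed):.1%}")
```

With the H100 numbers reported above (~71GB versus ~67GB), the relative increase works out to roughly 6%.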

Additional info

With this issue, I am not looking to revert the commit. The goal is to track the work of bringing back the dataflow-based fusion, and to use the opportunity to brainstorm improvements over the state it was in before it was deleted.

Mentioning PR #1763 for tracking.

cc @riccardofelluga

@riccardofelluga riccardofelluga self-assigned this Feb 11, 2025
@tfogal tfogal added the thunderfx for things that could be applicable to the dynamo+thunder frontend label Feb 12, 2025
@IvanYashchuk IvanYashchuk changed the title from "High memory consumption without fusion rematerialization" to "High memory consumption without dataflow-based fusion" Feb 19, 2025
@IvanYashchuk
Collaborator

How do fusions change for mistralai/Mistral-Nemo-Base with and without f2d7152?


3 participants