Enable torch.compile with ZeRO (Experimental) #4878
Conversation
@stas00, FYI

Amazing work, @tohtana! I'm looking forward to trying it out. Here is some quick feedback: could we please flip `disable` to `enable`?
Tried it out, and the compiled engine doesn't seem to forward some (all?) custom methods to the unwrapped model; e.g. it's failing. This method is just part of the normal model.
I hacked around it. This is just training Llama-2 on a single node using Accelerate with torch-nightly from last night. The Llama model is the same as HF Transformers with some additional methods: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py
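For illustration only (toy classes below are ours, not the actual DeepSpeed engine or `torch.compile` wrapper), the kind of attribute forwarding that can restore custom methods on a wrapped model looks like this:

```python
# Sketch of the forwarding problem: a wrapper that only proxies __call__
# loses the model's custom methods; a __getattr__ fallback is one way to
# "hack around" it. Names here are toy stand-ins.

class ToyModel:
    def __call__(self, x):
        return x + 1

    def custom_method(self):          # extra method on the unwrapped model
        return "hello"

class CompiledWrapper:
    def __init__(self, model):
        self._model = model

    def __call__(self, x):
        return self._model(x)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so this forwards
        # custom_method (and anything else) to the wrapped model.
        return getattr(self._model, name)

wrapped = CompiledWrapper(ToyModel())
print(wrapped.custom_method())  # forwarded to ToyModel
```

Recent `torch._dynamo.OptimizedModule` versions do forward attributes similarly, but whether that survives the additional DeepSpeed wrapping is exactly what the failure above suggests it does not.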
If I disable the ds profiler, then it runs despite the compilation errors/warnings; same log as in the previous comment, other than the last traceback where it crashes.
I'm also observing very strange cyclic performance behavior: the TFLOPS per iteration go 196, 196, 192, 196, 196, 192, 196, 196, 192, ... two fast, one slower, very regularly. Without compile it was a consistent 194, so this tells me something gets recompiled every 3 iterations.
@stas00 Thank you for your feedback! This PR is still experimental. Let me address the issues one by one. The `disable` configuration is what I specifically sought feedback on. Currently, all configuration items under `compile` are passed to `torch.compile`, which accepts `disable`, not `enable`. This design was chosen for its simplicity, given the uncertainty of future changes in `torch.compile`. But we could define our own `enable` field and invert it before passing it on. Do you have any further comments on this? If not, I will switch it to `enable` as you suggested. Actually, it is also my personal preference.
That's totally understandable, Masahiro. Tunji made that clear when he tagged me. If it's too early to provide feedback, please ping me when you're ready for it.
Ideally, Deepspeed users will never need to know anything about `torch.compile`'s own configuration conventions. Since most (all?) Deepspeed config sections use `enabled`, this section should follow the same convention. But this is an opinion of a single person, so please seek out opinions of others.
@stas00 Thank you for your quick reply. Probably it is difficult to reach a clear conclusion for now. I will simply switch it to `enable` as you suggested.
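If the section is switched to an `enabled` flag, the translation to `torch.compile`'s `disable` keyword is a one-liner. A minimal sketch (the helper name and the exact field handling are our assumption, not the merged design):

```python
def to_compile_kwargs(compile_config: dict) -> dict:
    """Translate a DeepSpeed-style `compile` section that uses the
    conventional `enabled` flag into kwargs for torch.compile, which
    expects `disable` instead. Sketch only."""
    kwargs = dict(compile_config)               # don't mutate caller's config
    enabled = kwargs.pop("enabled", True)       # DeepSpeed convention
    kwargs["disable"] = not enabled             # torch.compile convention
    return kwargs

print(to_compile_kwargs({"enabled": False, "backend": "inductor"}))
# -> {'backend': 'inductor', 'disable': True}
```

All other fields still pass through untouched, preserving the original "forward everything to `torch.compile`" design.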
2a. I don't know if the current config file allows for a non-predefined dict, so perhaps this could be possible:
this should definitely work:
but I don't know if that covers all cases; providing a programmatic API for power users would be the most fool-proof:
Tests running an older version of torch will fail the compile tests added in #4878.
This PR enables `torch.compile` with ZeRO stages 1/2/3. You need to add a `compile` section to your DeepSpeed config. The fields in the section are passed to `torch.compile`.

```json
"compile": {
    "disable": false,
    "backend": "inductor"
}
```

To enable a custom backend, you can pass the fully qualified name of the backend function. For example, if you have a backend function `my_backend` in `my_backend.py` in the current directory, you can enable it with `"backend": "my_backend.my_backend"`. You can find an example in [a unit test](https://github.com/microsoft/DeepSpeed/blob/eb9d4e06e9596f391aea305a6a5c6ec70cc28b58/tests/unit/runtime/compile/test_config.py#L116).

Currently we have validated the results with Megatron-DeepSpeed. See the [example](https://github.com/microsoft/Megatron-DeepSpeed/tree/tohtana/enable_compile/examples_deepspeed/compile) for the details.

NOTICE: This PR is a draft. We will need to validate the coverage and accuracy with many more examples.

---------

Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Michael Wyatt <[email protected]>
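A custom backend of the kind the description refers to can be sketched as follows (our own toy backend, assuming a recent torch; it just reports the captured graph size and runs it eagerly):

```python
import torch

def my_backend(gm: torch.fx.GraphModule, example_inputs):
    # A Dynamo backend receives the captured FX graph plus example
    # inputs and must return a callable; returning gm.forward simply
    # runs the captured graph eagerly (sketch only).
    print(f"captured {len(list(gm.graph.nodes))} FX nodes")
    return gm.forward

def f(x):
    return torch.relu(x) + 1

compiled = torch.compile(f, backend=my_backend)
out = compiled(torch.ones(4))
```

Saved as `my_backend.py` in the working directory, this is the shape of function that `"backend": "my_backend.my_backend"` would resolve to.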
The review comment below refers to this excerpt of the backend-resolution code:

```python
        return backend
    elif isinstance(backend, str):
        if backend in torch._dynamo.list_backends():
```
@tohtana The default `list_backends` call will exclude debug and experimental backends, e.g. `eager`. I think it's better to use `list_backends(exclude_tags=())` here.
Thank you for the comment. I opened #5191.
As mentioned at #4878 (comment), we are currently unable to enable debug or experimental backends for the compiler. This PR enables users to utilize these backends.