
Fix: ensure deepspeed.initialize can only be called once. #6848

Closed

Conversation

traincheck-team

Partially fixes #6772, #6771, and #6770 by forbidding repeated initialization.

  • Introduced a module-level variable _deepspeed_initialized to track initialization state.
  • Raised a RuntimeError if initialize is called more than once.
  • Ensured initialization logic is executed only on the first call.
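A minimal sketch of the guard described above, assuming the module-level flag is named `_deepspeed_initialized`; the real `deepspeed.initialize` signature and body are elided:

```python
_deepspeed_initialized = False

def initialize(*args, **kwargs):
    """Guarded entry point; real argument handling elided."""
    global _deepspeed_initialized
    if _deepspeed_initialized:
        raise RuntimeError(
            "deepspeed.initialize has already been called in this interpreter"
        )
    _deepspeed_initialized = True
    # ... the original initialization logic runs only on this first call ...
```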

This is a draft PR, as we have not yet included unit tests for it.

@tjruwase
We added a global variable to enforce that deepspeed.initialize can only be called once per interpreter.
However, this seems to interfere with existing tests that intentionally call this API multiple times:
(screenshot of the resulting test failures)

One solution might be to record the ids of the models and optimizers that have already been used, so the semantics are enforced only for individual models and optimizers. That could get trickier and more complicated, though, so we'd like feedback on how this should be implemented before proceeding; a rough sketch of the idea follows.
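A hypothetical sketch of that id-based alternative, for discussion only (`_seen_ids` and `_check_not_reinitialized` are illustrative names, not the actual implementation):

```python
# Hypothetical alternative: remember the ids of objects already passed in.
# Note that id() values can be reused once an object is garbage-collected,
# which is part of why this approach gets tricky.
_seen_ids = set()

def _check_not_reinitialized(model=None, optimizer=None):
    for kind, obj in (("model", model), ("optimizer", optimizer)):
        if obj is None:
            continue
        if id(obj) in _seen_ids:
            raise RuntimeError(
                f"this {kind} was already passed to deepspeed.initialize"
            )
        _seen_ids.add(id(obj))
```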

@tjruwase
Contributor

@traincheck-team, thanks for creating this PR so quickly. Unfortunately, it seems there is some miscommunication of the expectation. We want to maintain support for multiple deepspeed.initialize() calls, as it is a widely used feature, for example in DeepSpeed-Chat and Accelerate. What we want to prevent instead is repeated deepspeed.initialize() on a specific model, optimizer, lr_scheduler, etc. instance. So the deepspeed-specific attribute indicating a prior deepspeed.initialize() should be attached to those objects.

What do you think?
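A hedged sketch of that per-object check, assuming an attribute name of `_deepspeed_initialized` (the helper name and attribute are illustrative, not the final API):

```python
def _mark_initialized(obj, kind):
    # Attach a deepspeed-specific flag to the model / optimizer /
    # lr_scheduler instance itself, so the guard is per-object rather
    # than per-interpreter.
    if obj is None:
        return
    if getattr(obj, "_deepspeed_initialized", False):
        raise RuntimeError(
            f"deepspeed.initialize was already called on this {kind} instance"
        )
    obj._deepspeed_initialized = True
```

This works for `torch.nn.Module` and `torch.optim.Optimizer` instances, since both allow arbitrary attributes, and it leaves multiple deepspeed.initialize() calls on distinct objects untouched.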

@traincheck-team
Author

That makes sense and is clear. We will update the code soon.

Successfully merging this pull request may close these issues.

[BUG] [Fix-Suggested] ZeRO Stage 3 Overwrites Module ID Attribute Causing Incorrect Expert Placement on GPUs