Trainer execute() method uses strategy clean_up() before program is finished #223

Open
jarlsondre opened this issue Oct 10, 2024 · 0 comments
Assignees
Labels
enhancement New feature or request

Comments

@jarlsondre
Collaborator

Currently the standard TorchTrainer class calls self.strategy.clean_up() at the end of execute(). For some use cases, such as profiling, this is a problem because the strategy's methods can no longer be used once execute() has returned. In addition, multiple processes are still running after clean_up() has been called, so calling it only gives up control of the strategy while the processes themselves keep running.
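To make the failure mode concrete, here is a minimal sketch using stand-in classes. `FakeStrategy` and `FakeTrainer` are hypothetical and are not the real TorchTrainer or strategy classes; in particular, the assumption that strategy methods raise an error after clean_up() is only for illustration.

```python
class FakeStrategy:
    """Hypothetical stand-in for a distributed strategy."""

    def __init__(self):
        self.is_initialized = False

    def init(self):
        self.is_initialized = True

    def global_rank(self):
        # Assumption: strategy methods are unusable once cleaned up.
        if not self.is_initialized:
            raise RuntimeError("strategy has been cleaned up")
        return 0

    def clean_up(self):
        self.is_initialized = False


class FakeTrainer:
    """Hypothetical trainer mirroring the behaviour described above."""

    def __init__(self, strategy):
        self.strategy = strategy

    def execute(self):
        self.strategy.init()
        # ... training would happen here ...
        self.strategy.clean_up()  # strategy is torn down before returning


trainer = FakeTrainer(FakeStrategy())
trainer.execute()

try:
    # For example, a profiler asking for the rank after training.
    trainer.strategy.global_rank()
    failed = False
except RuntimeError:
    failed = True
```

In this sketch, any code that runs after execute() and still needs the strategy fails, which is the situation the profiling use case runs into.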

A solution would probably involve either moving some of the strategy logic out of the TorchTrainer class, or changing clean_up() so that it kills the processes. Killing the processes could also be undesirable, though, since you might want to keep using them after the train() function has finished (as in the profiling case).
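One way the first option could look is sketched below, assuming the caller rather than the trainer owns the strategy's lifecycle. All names here (`FakeStrategy`, `FakeTrainer`, `managed_strategy`) are hypothetical and do not come from the project; this is a sketch of the idea, not a proposed patch.

```python
from contextlib import contextmanager


class FakeStrategy:
    """Hypothetical stand-in for a distributed strategy."""

    def __init__(self):
        self.is_initialized = False

    def init(self):
        self.is_initialized = True

    def clean_up(self):
        self.is_initialized = False


class FakeTrainer:
    """Hypothetical trainer that no longer cleans up the strategy itself."""

    def __init__(self, strategy):
        self.strategy = strategy
        self.trained = False

    def execute(self):
        if not self.strategy.is_initialized:
            raise RuntimeError("strategy must be initialized by the caller")
        # ... training would happen here ...
        self.trained = True
        # Note: no clean_up() call; the caller decides when to tear down.


@contextmanager
def managed_strategy(strategy):
    """Initialize the strategy and guarantee clean_up() when the block exits."""
    strategy.init()
    try:
        yield strategy
    finally:
        strategy.clean_up()


strategy = FakeStrategy()
with managed_strategy(strategy):
    trainer = FakeTrainer(strategy)
    trainer.execute()
    # The strategy is still usable here, e.g. for profiling.
    usable_after_execute = strategy.is_initialized

usable_after_block = strategy.is_initialized
```

With this structure the strategy stays available after execute() and is only cleaned up when the surrounding block ends. Whether clean_up() should also terminate the worker processes is a separate question that this sketch does not address.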

@jarlsondre jarlsondre added the enhancement New feature or request label Oct 10, 2024
@jarlsondre jarlsondre self-assigned this Oct 10, 2024