Trainer execute() method uses strategy cleanup() before program is finished #223

jarlsondre · 2024-10-10T14:06:49Z

Currently the standard TorchTrainer class calls self.strategy.clean_up() at the end of execute(). For some use cases, such as profiling, this is problematic because the strategy's methods can no longer be used afterwards. Additionally, multiple processes are still running after clean_up() has been called, so calling it really just removes the strategy's control over those processes while they keep running.

The solution would probably involve either moving some of the strategy logic out of the TorchTrainer class or changing clean_up() so that it kills the processes. Killing the processes could also be bad, though, as you might want them to keep running after the train() function has finished (such as in the profiling case).
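
A minimal sketch of the first option, where the caller owns the strategy lifecycle and the trainer only uses it. All names here (ToyStrategy, ToyTrainer, strategy_session, init(), global_rank()) are hypothetical stand-ins for illustration, not the actual itwinai classes or their API:

```python
from contextlib import contextmanager


class ToyStrategy:
    """Hypothetical stand-in for a distributed strategy (assumption)."""

    def __init__(self):
        self.is_initialized = False

    def init(self):
        self.is_initialized = True

    def global_rank(self):
        # Any strategy method becomes unusable once clean_up() has run.
        if not self.is_initialized:
            raise RuntimeError("strategy has been cleaned up")
        return 0

    def clean_up(self):
        self.is_initialized = False


class ToyTrainer:
    """Hypothetical trainer that uses the strategy but does not own its lifecycle."""

    def __init__(self, strategy):
        self.strategy = strategy

    def execute(self):
        # Training would happen here. Note that clean_up() is not called.
        return f"trained on rank {self.strategy.global_rank()}"


@contextmanager
def strategy_session(strategy):
    """Initialize the strategy on entry and clean it up only on exit."""
    strategy.init()
    try:
        yield strategy
    finally:
        strategy.clean_up()


strategy = ToyStrategy()
with strategy_session(strategy):
    result = ToyTrainer(strategy).execute()
    # Post-training work, such as gathering profiling data, can still
    # call strategy methods because clean-up has not happened yet.
    rank_after_training = strategy.global_rank()
```

With this layout, clean-up happens exactly once when the with block exits, so profiling code that runs after execute() still has a usable strategy. Whether clean_up() should also terminate the worker processes is a separate decision that this sketch does not address.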


jarlsondre added the enhancement (New feature or request) label Oct 10, 2024

jarlsondre self-assigned this Oct 10, 2024
