
Caching should only be done on rank 0 #284

Open
shatz01 opened this issue Feb 26, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@shatz01
Collaborator

shatz01 commented Feb 26, 2023

Describe the bug
Caching a dataset breaks when the training script runs as a DDP job, since multiple processes attempt to write the cache at the same time.

Solution
Guard any caching so it runs only on rank 0, e.g. by checking the os.environ["RANK"] environment variable.
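A minimal sketch of what such a guard could look like. The helper names (`is_rank_zero`, `maybe_cache`, `cache_fn`) are hypothetical, not part of this repo's API; in a real DDP job you would also call `torch.distributed.barrier()` after caching so the other ranks wait for the cache to exist before reading it.

```python
import os


def is_rank_zero() -> bool:
    # torchrun/DDP launchers set a RANK env var per process;
    # default to 0 so single-process runs still cache.
    return int(os.environ.get("RANK", "0")) == 0


def maybe_cache(dataset, cache_fn) -> None:
    # cache_fn is a hypothetical callable that writes the cache to disk.
    if is_rank_zero():
        cache_fn(dataset)
    # In a real DDP job, call torch.distributed.barrier() here so
    # non-zero ranks block until rank 0 has finished writing the cache.
```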

@shatz01 shatz01 added the bug Something isn't working label Feb 26, 2023
@shatz01
Collaborator Author

shatz01 commented Feb 26, 2023

An interim fix is to first run your training script on a single GPU; once the dataset has been cached, run it with DDP and reset_cache=False.
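The workaround might look like the following two-pass launch. The script name and its flags are hypothetical placeholders; only `reset_cache=False` comes from this issue.

```shell
# Pass 1: single process builds the dataset cache (no write races).
python train.py

# Pass 2: DDP run reuses the existing cache instead of rebuilding it.
torchrun --nproc_per_node=4 train.py reset_cache=False
```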
