Lock for writing to the cache files #228

Open
eugene-yang opened this issue Mar 26, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@eugene-yang
Contributor

I sometimes have multiple processes, or multiple machines, accessing the same storage cluster that hosts the ir_datasets cache directory. If several processes decide to download the same dataset at the same time, they all start writing to the same file and eventually crash.
It would be nice to have a locking mechanism that prevents more than one process from writing to the same file and makes the other processes wait.
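
For illustration only, here is a minimal sketch of what such a lock could look like. It is not how ir_datasets works today: `cache_write_lock` and `write_once` are hypothetical names, and the sketch assumes the shared file system honors atomic exclusive creation (`O_CREAT | O_EXCL`), which some network file systems do not guarantee. It also does not handle stale locks left behind by a process that is killed while holding one.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def cache_write_lock(target_path, timeout=600.0, poll_interval=0.5):
    """Hypothetical helper: hold an exclusive lock via a sibling '.lock' file."""
    lock_path = target_path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the lock file already exists,
            # so only one process can succeed in creating it.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)  # another process holds the lock; wait
    try:
        # Record the owner's PID to help with manual debugging.
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write(str(os.getpid()))
        yield
    finally:
        os.remove(lock_path)  # release the lock


def write_once(target_path, produce_bytes):
    """Hypothetical helper: write target_path only if it is still missing."""
    with cache_write_lock(target_path):
        if os.path.exists(target_path):
            return False  # another process finished the write while we waited
        tmp_path = target_path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(produce_bytes())
        # Rename into place so readers never see a partially written file.
        os.replace(tmp_path, target_path)
        return True
```

The idea is that a process that loses the race blocks in `cache_write_lock`, and once it gets the lock it finds the finished file and skips the download. Writing to a temporary name and renaming it keeps a half-written file from ever appearing under the final name.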

eugene-yang added the enhancement label Mar 26, 2023
@seanmacavaney
Collaborator

Thanks for reporting! I’ll look into it.

@bpiwowar
Contributor

bpiwowar commented Jul 7, 2023

Yes, I have the same issue with downloading, and also with processes like building the docstore.


3 participants