-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Fix wheel cache concurrency when temp and cache are not on the same filesystem #13541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Is this writing temporary files to the cache directory even if the user has explicitly set If so I'm not comfortable with this, while generally having cache directory and temporary directory on the same mount is usually correct, there may be valid reasons users don't want to do that. In fact I'm not sure why the solution isn't to advise users to set the temporary directory and the cache directory to the same mount? |
I don't think pip will write temporary files there, so it depends on the build backend. I had a quick look at PEP 517 and it does not seem to say anything about whether the build backend can use I started with an alternative solution which is passing a
I am considering this typical use case of a Dockerfile where the pip cache is mounted: RUN --mount=type=cache,target=/root/.cache/pip,id=pip-distro \
pip install stuff In that case I don't think it is possible to force the temp dir to be on the same filesystem as the cache mount. |
I can't validate right but my instinct is you should be able to update this to something like: RUN --mount=type=cache,target=/root/.cache/pip,id=pip-distro \
TEMPDIR=/root/.cache/pip/tmp pip install stuff Or set TEMPDIR as an ENV var earlier on the docker build script. I can try a few approaches later tonight and validate, or invalidate, whether an approach like this works. |
Well, okay that will work, but that sounds like a hack right? And I'm not quite convinced that using the docker cache mount as temp location is good practice. Unless I am missing something this temp_dir I'm changing here is passed directly as |
How temp env vars work is inherited from Python's behavior: https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir I don't think this is a hack? Or at least I've relied on it in the past.
Wait, isn't that the purpose of this PR? You want the temp location to be the same as the cache mount? Maybe I got confused what this is doing.
I'm not clear on this, I will need to read through the code and specs later tonight. |
Sorry I was not clear. Apologies if did sound offensive, that was not my intent. Setting TEMPDIR is totally fine of course. It's just that using a Docker cache mount as a generic temp location does not look quite right to me.
Kind of. But this temp location is used only for the result of the build which is immediately moved to the cache. So it is not for all temporary files created during the build. I just pushed another commit renaming that variable to make that more explicit. |
59f863e
to
e4b8ef0
Compare
No issues from my side! I didn't perceive any offense.
Ahh! This makes way more sense to me now. I'll refrain from reviewing any further until I can read more context and validate it on my own machine. |
No behavior change.
Avoid concurrency issues when storing wheels into the cache of locally built wheels, when the temporary build directory is on a different filesystem than the cache directory. This commit also improves performance in such situations, by directly building the wheel into the cache filesystem.
e4b8ef0
to
c44bf1b
Compare
I'm still a bit uneasy on this, this is creating a temporary directory in a cache directory where we weren't before without give the user anyway to override. Maybe I'm overthinking things, but I'm imagining some user depending on their temporary directory to be ephemeral and their cache directory generally being write only (no deletes). Or the cache directory is regularly synced to an archive, and now those temporary directories could get caught up in that. Though I agree this is almost certainly a good default, and maybe we see if anyone complains before considering user customization for this. |
# We can do a rename here because wheel_path is guaranteed to be | ||
# in the same filesystem as output_dir. An atomic is rename | ||
# to avoid concurrency issues when populating the cache. | ||
os.rename(wheel_path, dest_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why switch from shutil.move
to os.rename
? shutil.move
uses rename first and falls back to copy: https://docs.python.org/3/library/shutil.html#shutil.move
While the temporary path probably supports os.rename
, the cache directory could be on an esoteric filesystem that does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the cache filesystem does not support os.rename
, then letting shutil.move
fall back to a copy risks re-creating the concurrency issue I'm fixing here. I don't know if such esoteric file systems exist but perhaps we could address that if someone complains? I can commit to following up on any issue raised about this post release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair.
For reference this is the PR that changed it from os.rename
to shutil.move
: #2749
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking it should be os.replace()
instead of rename, though.
The alternative I initially considered to fix the concurrency issue, was keeping the current temp dir location, and then do a |
I agree this approach is better
FWIW uv explicitly supports concurrency, and I believe uses lock files, but I haven't spent any time looking at their design. |
Avoid concurrency issues when storing wheels into the cache of locally built wheels, when the temporary directory into which the wheel is built is on a different filesystem than the cache directory.
We do this by building the wheel to a temporary subdirectory next to the target then doing an atomic rename to the target destination.
This also improves performance in this situation by avoiding a copy from one filesystem to the other.
Fixes #13540