uv dependency resolution upper-bound problem leading to a suboptimal resolution #8128
Comments
Hey. I'm happy to look into why that version of datasets is selected, but it's likely that this is an equally valid resolution to whatever you're seeing with |
(I'll try to figure out the dependencies that are leading to this resolution, though. It's typically because something else you depend on is applying an upper-bound on datasets.) |
@charliermarsh however the key is it's not. Using |
Can you clarify your comment? There are many equally-valid resolutions for that set of requirements. Separately, do you have a reference to pip's resolution that I can use for comparison (e.g., a comparable CI run)? |
@charliermarsh sure. Currently working run: https://github.com/huggingface/accelerate/actions/runs/11285497593/job/31412054466?pr=3154 Broken run: https://github.com/huggingface/accelerate/actions/runs/11293796747/job/31412771583 (See the Show Installed Libraries steps) Otherwise phrased as:
Everything is the same between both runs except using |
Ok... So it looks like the uv resolution ends up choosing a more recent version of fsspec ( pip's resolution is more intuitive in that case (and we're working on adding better heuristics for these kinds of upper-bound cases -- it also happens often with Numba, which puts an upper-bound on NumPy), but formally both are correct -- they're just different solutions given the set of constraints. That's why I'd typically recommend adding a \cc @konstin as another example of this upper-bound problem leading to a suboptimal resolution |
Sadly doing such a large pin like that is a bit of a non-negotiable with our testing and resolutions since we build our frameworks to be backwards compatible as much as possible. I'd rather wait to upgrade to |
To clarify, I consider the resolution here to be formally correct, but unintuitive for users (and want to change it -- the same pattern has come up before). |
Another option is you can reorder your dependencies to instruct uv to prioritize datasets over other packages. |
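To illustrate why both resolutions can be formally valid, and why dependency order can change which one is picked, here is a minimal toy sketch. It is not uv's or pip's actual algorithm: the version numbers, the `FSSPEC_CAP` table, and the `resolve` helper are all hypothetical, chosen only to mimic the shape of the problem (a newer datasets release capping fsspec while an older one does not).

```python
from itertools import product

# Hypothetical index: made-up version numbers, newest first.
AVAILABLE = {
    "fsspec": [3, 2, 1],
    "datasets": [3, 2],
}

# Hypothetical metadata: datasets version -> highest fsspec it allows
# (None means no upper bound). These are NOT the real packages' constraints.
FSSPEC_CAP = {3: 2, 2: None}


def is_valid(pick):
    """A full assignment is valid if fsspec respects the chosen datasets' cap."""
    cap = FSSPEC_CAP[pick["datasets"]]
    return cap is None or pick["fsspec"] <= cap


def all_valid():
    """Enumerate every valid combination by brute force."""
    names = list(AVAILABLE)
    combos = (dict(zip(names, c)) for c in product(*AVAILABLE.values()))
    return [pick for pick in combos if is_valid(pick)]


def resolve(priority):
    """Greedy toy resolver: for each package in priority order, take the
    newest version that still leaves at least one valid full solution."""
    chosen = {}
    for name in priority:
        for version in AVAILABLE[name]:
            trial = {**chosen, name: version}
            if any(all(sol[k] == v for k, v in trial.items()) for sol in all_valid()):
                chosen = trial
                break
    return chosen


fsspec_first = resolve(["fsspec", "datasets"])    # newest fsspec, older datasets
datasets_first = resolve(["datasets", "fsspec"])  # newest datasets, capped fsspec
```

Both results satisfy every constraint; they differ only in which package got to claim its newest version first, which is the effect reordering dependencies aims for.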
I want to offer a more general perspective on version constraints: By using the APIs of your dependencies, you implicitly require at least a certain version of each dependency: There's a specific version under which your code works, and lower versions that don't have the APIs you use yet. By specifying lower bounds on your dependencies, you make those bounds explicit, and you avoid your users accidentally installing an older, incompatible version due to other (transitive) dependencies they have: In your case, datasets v3 is selectable, but in other cases, a user may depend on a library foo that has You can test your lower version bounds with |
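The point about explicit lower bounds can be sketched with a toy version picker. Everything here is an assumption made for illustration: the release list, the cap imposed by the hypothetical library foo, and the `newest_allowed` helper are invented and do not reflect any real package metadata or resolver.

```python
# Toy illustration only: hypothetical versions and bounds.
AVAILABLE = [(1, 0), (2, 0), (3, 0)]  # made-up releases of one dependency


def newest_allowed(available, lower=None, upper=None):
    """Return the newest version v with lower <= v < upper (both optional)."""
    candidates = [
        v for v in available
        if (lower is None or v >= lower) and (upper is None or v < upper)
    ]
    if not candidates:
        raise LookupError("no version satisfies the combined bounds")
    return max(candidates)


# Assumption: a hypothetical library "foo" caps the dependency below 2.0.
FOO_UPPER = (2, 0)

# Without a lower bound of our own, the picker silently falls back to 1.0,
# which (by assumption) lacks the APIs our code calls.
silent_pick = newest_allowed(AVAILABLE, upper=FOO_UPPER)

# With an explicit lower bound of 2.0, the conflict surfaces at install time
# instead of as a runtime failure.
try:
    newest_allowed(AVAILABLE, lower=(2, 0), upper=FOO_UPPER)
    conflict_reported = False
except LookupError:
    conflict_reported = True
```

In this sketch the explicit lower bound turns a silent downgrade into an immediate, visible conflict.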
As a user to whom uv was advertised as a drop-in replacement, I came with the expectation that behaviors, including resolutions, will be the same between uv and standard pip. The point of this issue is that they're not, and if my choice is between running a debugging script to figure out sudden lower bounds and using the tool that doesn't break, I will stick with what doesn't break. (Until it's fixed) |
As someone who contributes to pip's resolution code, I should note that pip makes no guarantees it will produce the same resolution between pip versions; in fact, pip makes no guarantees it would produce the same resolution on the same version of pip when run twice in a row. Pip only guarantees that if it produces a resolution, it will be a valid resolution, so if uv is doing that then I would consider that compatible with pip, as pip can just as easily break your assumptions about what it should resolve to between versions. That said, I have long been an advocate that Python package resolution algorithms should prefer requirements with upper bounds. I plan to open a PR this weekend to solve pypa/pip#12993 on the pip side. It is a less extreme version of my previous suggestion, but in my testing it seems to bring a significant real-world improvement, both solving faster and giving a resolution more intuitive to a user. I think uv should probably do the same, but they would need to do their own testing. But, like, if uv could hold off for a bit, I might be able to brag I made a PR that made pip resolve faster than uv in a handful of extreme edge cases 😉 . |
I've confirmed that more recent versions of uv correctly resolve to |
Hi! I'm trying to upgrade accelerate to use uv pip install and we're hitting a weird issue where the dependency resolution wants to only install a very old version of datasets and as a result our suite fails. I'm specifying the exact same environment and everything as the working old installation via just pip install & python 3.8. As a result it should be installing datasets 3.0.0 like the original workflow does, however we're finding it's installing an old 2.x.x version instead.
Any help would be appreciated.
Here is the old workflow and here is the new one.
A broken run log is available here, which is showing that the wrong version of datasets is being installed (should be 3.0.0+)