-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve handling of constraints on requirements with extras #12095
Changes from 8 commits
5bebe85
937d8f0
5f8f40e
d09431f
49027d7
cb0f97f
3160293
1038f15
8aa1758
faa3289
7e8da61
ff9aeae
3fa373c
e569017
cc6a2bd
fc86308
4ae829c
dc01a40
292387f
39e1102
e6333bb
1207389
6663b89
314d7c1
cc909e8
6ed231a
55e9762
504485c
f4a7c0c
32e95be
ba761cd
3f3ae6f
21bfe40
0de374e
5a01679
9041602
4e73e3e
50cd318
ff9e15d
f5602fa
449522a
952ab6d
fbda0a2
ce94946
46707a4
89b68c6
0f543e3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -240,7 +240,7 @@ def _prepare(self) -> BaseDistribution: | |
def iter_dependencies(self, with_requires: bool) -> Iterable[Optional[Requirement]]: | ||
requires = self.dist.iter_dependencies() if with_requires else () | ||
for r in requires: | ||
yield self._factory.make_requirement_from_spec(str(r), self._ireq) | ||
yield from self._factory.make_requirements_from_spec(str(r), self._ireq) | ||
yield self._factory.make_requires_python_requirement(self.dist.requires_python) | ||
|
||
def get_install_requirement(self) -> Optional[InstallRequirement]: | ||
|
@@ -392,7 +392,7 @@ def iter_dependencies(self, with_requires: bool) -> Iterable[Optional[Requiremen | |
if not with_requires: | ||
return | ||
for r in self.dist.iter_dependencies(): | ||
yield self._factory.make_requirement_from_spec(str(r), self._ireq) | ||
yield from self._factory.make_requirements_from_spec(str(r), self._ireq) | ||
|
||
def get_install_requirement(self) -> Optional[InstallRequirement]: | ||
return None | ||
|
@@ -427,9 +427,19 @@ def __init__( | |
self, | ||
base: BaseCandidate, | ||
extras: FrozenSet[str], | ||
ireq: Optional[InstallRequirement] = None, | ||
) -> None: | ||
""" | ||
:param ireq: the InstallRequirement that led to this candidate, if it | ||
differs from the base's InstallRequirement. This will often be the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel maybe we should remove the None case and make this always an ireq instance (i.e. if we didn’t do extra manipulation this would just be It might be useful for this argument (and attribute) to have a different name to indicate it’s only for reporting. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fine by me. I personally prefer to be explicit when reverting to defaults but it's a thin line in this case. I moved it up so that the attribute is not optional (50cd318), but I think it would be cleaner to leave the constructor argument optional for two reasons:
With this context, how would you like me to proceed? |
||
case in the sense that this candidate's requirement has the extras | ||
while the base's does not. Unlike the InstallRequirement backed | ||
candidates, this requirement is used solely for reporting purposes, | ||
it does not do any leg work. | ||
pradyunsg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
""" | ||
self.base = base | ||
self.extras = extras | ||
self._ireq = ireq | ||
|
||
def __str__(self) -> str: | ||
name, rest = str(self.base).split(" ", 1) | ||
|
@@ -502,11 +512,11 @@ def iter_dependencies(self, with_requires: bool) -> Iterable[Optional[Requiremen | |
) | ||
|
||
for r in self.base.dist.iter_dependencies(valid_extras): | ||
requirement = factory.make_requirement_from_spec( | ||
str(r), self.base._ireq, valid_extras | ||
yield from factory.make_requirements_from_spec( | ||
str(r), | ||
self._ireq if self._ireq is not None else self.base._ireq, | ||
valid_extras, | ||
) | ||
if requirement: | ||
yield requirement | ||
|
||
def get_install_requirement(self) -> Optional[InstallRequirement]: | ||
# We don't return anything here, because we always | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
import contextlib | ||
import functools | ||
import logging | ||
import os | ||
|
@@ -11,6 +12,7 @@ | |
from pip._internal.cache import WheelCache | ||
from pip._internal.index.package_finder import PackageFinder | ||
from pip._internal.operations.prepare import RequirementPreparer | ||
from pip._internal.req.constructors import install_req_extend_extras | ||
from pip._internal.req.req_install import InstallRequirement | ||
from pip._internal.req.req_set import RequirementSet | ||
from pip._internal.resolution.base import BaseResolver, InstallRequirementProvider | ||
|
@@ -19,6 +21,7 @@ | |
PipDebuggingReporter, | ||
PipReporter, | ||
) | ||
from pip._internal.utils.packaging import get_requirement | ||
|
||
from .base import Candidate, Requirement | ||
from .factory import Factory | ||
|
@@ -101,9 +104,19 @@ def resolve( | |
raise error from e | ||
|
||
req_set = RequirementSet(check_supported_wheels=check_supported_wheels) | ||
for candidate in result.mapping.values(): | ||
# sort to ensure base candidates come before candidates with extras | ||
for candidate in sorted(result.mapping.values(), key=lambda c: c.name): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, I’m not sure if sorting the entire list would break some things (we are generally quite delibrate to keep the input order in most cases). But I guess if it’s otherwise too cumbersome to sort a base candidate before extras we can just do this for now until (if) anyone complains. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You make a good point. I don't think the order is very important at this late stage in the process, but best not to mess with it too much just in case. I changed this to a more conservative sort key in 21bfe40. I think this could be a nice middle ground: it only moves the extras to the rear, which should hardly affect anything. What do you think? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds like a good middle ground if it does not affect performance too much. (Another approach I was considering is to loop through the list once to process the non-extras ones, while collecting extras to be processed in a second loop.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was going to argue with respect to readability but now that I look it over once more I actually prefer the two-pass approach you suggest. Since we already branch on the presence of the extra, there's really no reason to keep it in the same loop. So I'll make the change. Out of curiosity, I did run a quick test on the performance difference. I would have expected it to be negligible but this snippet seems to indicate that there is a slight performance hit for the sorting approach, though it's difficult to make it completely representative. It's still in the same order of magnitude though, which is all that matters for many practical cases. import random
import timeit
nb_elements: int = 10_000
nb_iterations: int = 10_000
l = random.choices([0, 1], weights=[9, 1], k=nb_elements)
def with_sort():
for i in sorted(l, key=lambda i: i):
pass
def pass_twice():
for i in l:
if i == 0:
pass
for i in l:
if i == 1:
pass
def pass_twice_cache():
second_pass: list[int] = []
for i in l:
if i == 0:
pass
else:
second_pass.append(i)
for i in second_pass:
pass
print("with_sort", timeit.timeit(with_sort, number=nb_iterations))
print("pass_twice", timeit.timeit(pass_twice, number=nb_iterations))
print("pass_twice_cache", timeit.timeit(pass_twice_cache, number=nb_iterations)) result:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This is not right, sorry about that. We branch on the presence of the extra iff there is no ireq. Extras candidates with an ireq require the same logic as base candidates with an ireq, so unless you think the performance difference is sufficiently significant to warrant the change I'm inclined to leave it as is after all. This also means that the "middle ground" still reorders a bit more than I initially thought (because not all extras candidates have a corresponding base, meaning their processing may shift to the rear). Let me know if you think this is unacceptable, then I'll make an effort to implement smarter (but more complex) sorting. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don’t think performance is the issue here, the thing I’m worried about is mostly doing to much reordering may affect semantics, as mentioned above. I’m willing to opt for the easiest logic possible and see if it really affects anyone in practice. |
||
ireq = candidate.get_install_requirement() | ||
if ireq is None: | ||
if candidate.name != candidate.project_name: | ||
# extend existing req's extras | ||
with contextlib.suppress(KeyError): | ||
req = req_set.get_requirement(candidate.project_name) | ||
req_set.add_named_requirement( | ||
install_req_extend_extras( | ||
req, get_requirement(candidate.name).extras | ||
) | ||
) | ||
continue | ||
|
||
# Check if there is already an installation under the same name, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of trying to modify a Requirement in-place (which IIRC is not documented to be an officially supported operation), it may be better to strip the extras in the requirement string instead. This should not be too dificult with some regex since both the project and extra names are quite predictable.
(I’m not committed to this though and wouldn’t object if packaging maintainers are OK with this cc @pradyunsg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the change though I don't like the fact that it feels a bit more brittle. On the other hand I do agree with your sentiment that I shouldn't something that isn't part of
packaging.Requirement
's contract. @pradyunsg do you have an opinion on this (faa3289)?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy if
packaging
made it a supported operation, but the existing docs state that theRequirement
class is for parsing the string form of a requirement, which to me doesn't imply the result is mutable. You could read it as saying that it parses the string into a structured form, which you are allowed to them mutate if you want, but I don't think that automatically follows.So as things stand, I agree with @uranusjr.
As @pradyunsg is a
packaging
maintainer, if he says it's OK, then I'm fine with that (although I'd ask that the docs be changed to explicitly state what's supported).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, don't modify this in place. It doesn't break things as implemented right now but making this a frozen dataclass is something I want to do in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your input. A frozen dataclass would be nice, that would then allow for
dataclasses.replace
, which I'd always prefer over in-place modifications.