Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pdm: support additional index/source URLs #998

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

abathur
Copy link

@abathur abathur commented Jun 26, 2024

I doubt this is an ideal implementation, but I'm hoping that an imperfect incremental improvement to a WIP module is acceptable :)

When I took a look at switching a project that uses a private package from the pip to WIP pdm module, I noticed that pdm did pick up the alternate index when I imported the requirements. They look something like:

[[tool.pdm.source]]
name = "some hash or random identifier pdm generated"
url = "https://some-other-index.example.com/"
verify_ssl = true

I looked into how to leverage it in dream2nix and saw at least two paths:

  1. Let pdm deal with this at lock time by using the static_urls locking strategy so that it locks a full index-specific url for each ~file.
  2. Since fetchFromLegacy already supports a single url or a list of urls, just aggregate extra sources and pass them to the fetcher.

The branch name here betrays that I started with the first approach, but I backed off of that for two reasons:

  • I was already unsure whether locking the files down to a single index is broadly desirable for our use of pdm or if it will just end up trading one set of problems for another.
  • I initially thought I might be able to just drop either the file or url arguments to fetchFromLegacy and pass the fully resolved source.url to the other argument, but that didn't work. Since the fetcher's in another repo, I was either going to need to go change it upstream first or introduce a variant there or figure out where to put a variant of it local to this repo. Working around it started to smell like a yak shave--especially since I'm not sure locking the files down to each index is even acceptable.

Falling back to the second approach was straightforward and I've confirmed it works for my fairly simple case--getting a package that is not on pypi.org from some other index--but I imagine there are more complex cases that will require refining this support.

(But I'm not sure if that'll be by doing the second approach in a more rigorous way, or adopting the first approach?)

@purepani
Copy link
Contributor

We should definitely do the second approach I think, since it would be better if we didn't force people to refresh their lock file if they already have a pdm repo.

@abathur
Copy link
Author

abathur commented Jun 27, 2024

Force pushed with reformat and a rebase against main that GH was nagging about :)

@phaer
Copy link
Member

phaer commented Jun 28, 2024

@abathur I am afraid you'd need to re-base again, sorry.

but I imagine there are more complex cases that will require refining this support.

On that end, could you maybe add a test case? Either add a non-pypi index to one of the examples and/or to the nix-unit testsuite?

pdm supports extra indexes in tool.pdm.source blocks and we can also
pass a list of urls to fetchFromLegacy, so this aggregates URLs from
the source blocks and passes them.

I think this will unblock simple cases like getting a package that is
not on pypi.org from some other index, but I imagine that there are
more complex cases that will require refining this support.
@abathur
Copy link
Author

abathur commented Jun 28, 2024

On that end, could you maybe add a test case? Either add a non-pypi index to one of the examples and/or to the nix-unit testsuite?

@phaer I'm not strictly opposed if it's simple/straightforward (and I did ponder it briefly), but I'm... not convinced it will be? :)

Fair warning that this is more of a jumble of thoughts than a full narrative:

  • I'm not aware of any public non-pypi index that we could trust at least as much as pypi not to inject malware and not just contain the same packages. There are mirrors AFAIK, but they won't help with the most obvious test--depending on something not present on pypi.org.

  • I imagine the ~safe way to do it is to run a real or mock package index local to the test with a randomly-named package (i.e., random within the test run so there's no stable identifier for someone to squat on pypi).

  • End-users will be exposed to this risk as well, but this has long been the case with extra indexes and is hopefully known to most people who'd be in this position. The full URL locking approach would probably help mitigate the risk this goes unnoticed, but end users should have a chance to notice an unexpected hash shift via either approach. I'm a little more ~concerned about the risk of using a stable package name for the test in the context of a test file that may get regenerated by automation or without much attention and leave someone nefarious with a chance to squat on the package up at pypi.

  • The full URL locking approach might not really entail extra testing beyond whatever applies to the fetcher changes it'd need, since it's really just leaning on the underlying pdm behavior to support extra indexes without special handling in dream2nix.

  • It looks like pdm2nix crossed this bridge in:

    That said, it wasn't immediately clear to me from the above or from https://github.com/search?q=repo%3Aadisbladis%2Fpdm2nix+resolvelib&type=code how/whether that test is affirming that the package wasn't just resolved from the default pypi, since the "extra" source they add is just pypi.org/simple.

@phaer
Copy link
Member

phaer commented Jun 29, 2024

I'm not aware of any public non-pypi index that we could trust at least as much as pypi not to inject malware and not just contain the same packages.

But wouldn't we wan't to check a hash to see whether it's the same package, anywhere? Not sure I see the problem here.

Otherwise a simple unit-test is good enough IMO.

@abathur
Copy link
Author

abathur commented Jun 29, 2024

I'm not aware of any public non-pypi index that we could trust at least as much as pypi not to inject malware and not just contain the same packages.

But wouldn't we wan't to check a hash to see whether it's the same package, anywhere? Not sure I see the problem here.

Otherwise a simple unit-test is good enough IMO.

I feel like we might be talking past each other. I have no sense of what the simple unit-test that actually exercises this feature looks like.

Without the static_urls locking strategy, nothing about the additional index will end up in the lockfile.

If i just added some public pypi mirror to a pyproject.toml as an extra source and called it a test, the only thing we'd be testing is that adding the extra source doesn't break anything. The lockfile will be unchanged and the fetcher will still just get the package from the first source (hardcoded pypi.org).

If you see a simple way to test it, I may need a little more direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants