Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

experimental_index_url in dependency tree breaks other pip.parse includes #2268

Closed
keith opened this issue Oct 2, 2024 · 4 comments · Fixed by #2325 · May be fixed by #2407
Closed

experimental_index_url in dependency tree breaks other pip.parse includes #2268

keith opened this issue Oct 2, 2024 · 4 comments · Fixed by #2325 · May be fixed by #2407
Assignees

Comments

@keith
Copy link
Member

keith commented Oct 2, 2024

In our project we don't use experimental_index_url, but through pulling in rules_mypy which does use it here https://github.com/theoremlp/rules_mypy/blob/c7c113c3608bc7569493c2abbff9aaa18119e145/MODULE.bazel#L29, our project's root pip.parse uses stopped working, because a platform specific variant of a dependency was included in the MODULE.bazel.lock.

I would expect experimental_index_url to only apply to the pip.parse that sets it, or that it only be settable by the root module. The only workaround I found is to patch that out of rules_mypy, which requires an unreleased version of bazel.

Here is a repro project repro.zip

In this project run bazel fetch ... on a non-linux x86_64 machine to see the breakage. The specific issue is that the requirements.txts that we use vary per platform, because pytorch releases different wheel types on different platforms. On linux x86_64, torch+cpu exists, but on macOS or linux aarch64 it doesn't exist, which leads to this download failure on those platforms:

ERROR: Could not find a version that satisfies the requirement torch==2.4.1+cpu (from versions: 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.4.0, 2.4.1)
ERROR: No matching distribution found for torch==2.4.1+cpu

Since in the case of rules_mypy it was only trying to set this for its internal dependency, it was a big surprise that it could break us in this way.

@aignas
Copy link
Collaborator

aignas commented Oct 3, 2024

So if I understand correctly this breaks when trying to use the MODULE.bazel.lock file, right?

I think the only solution, that I haven't gotten to doing yet is to implement a pypi.install extension that is using only using experimental_index_url and the old pip.parse would not have that feature at all. This is a backwards incompatible change, but it was on my to do list. I think the list of things we could do is:

  • Add a new bzlmod extension so that rules_mypy can have an alternative.
  • Ignore experimental_index_url on non-root modules.

@aignas aignas self-assigned this Oct 3, 2024
@keith
Copy link
Member Author

keith commented Oct 3, 2024

So if I understand correctly this breaks when trying to use the MODULE.bazel.lock file, right?

yes

Ignore experimental_index_url on non-root modules.

I haven't read extensively about what the purpose of experimental_index_url is, but if it's specifically about download behavior like this, I can definitely see that the root module should be the one deciding this, not random transitive dependencies

@aignas
Copy link
Collaborator

aignas commented Oct 7, 2024

The flag purpose is to support #260, but there were many things why it was not deemed as done and not enabled by default.

The TLDR of what it (the feature) does:

  • Contact the PyPI servers (or private indexes) to get the URLs for the packages.
  • Download them using the bazel downloader.
  • Setup config settings based on whl filenames so that everything still works when host != target platform.

aignas added a commit to aignas/rules_python that referenced this issue Oct 8, 2024
With this PR we are introducing a new extension for managing PyPI dependencies
and it seems that we would fix bazelbuild#2268 in doing so. The code has been sitting
around for long enough so that we have ironed out most of the bigger bugs
already and the maintainers have agreed for long enough that pip.parse is
not the best named, so I have chosen pypi.install.

This should not actively break the code and will make some of the experimental
parameters noop. This could yield to weird failures in cases where people are
building docker containers, so I am thinking that we should potentially add
a helpful error message prompting the users to migrate. However I am not
sure how that works out when multiple module versions are at play in the same
module graph.
@aignas
Copy link
Collaborator

aignas commented Oct 21, 2024

I think I gave it enough thought and the best way I could think of to solve this is to move this select_requirement call from the module_extension evaluation into a regular select statement in the hub repository. This will introduce a change in the spoke repo names, but I think it is a good enough thing to do. If someone is depending on the names, they will have to adapt.

The list of changes are:

I think the pypi.install proposal in #2278 is just too much work and the following patch to rules_python could disable experimental_index_urls for your root module until this issue is resolved:

diff --git a/python/private/pypi/extension.bzl b/python/private/pypi/extension.bzl
index 36fb20e0..4cf2d196 100644
--- a/python/private/pypi/extension.bzl
+++ b/python/private/pypi/extension.bzl
@@ -150,7 +150,7 @@ def _create_whl_repos(module_ctx, pip_attr, whl_map, whl_overrides, group_map, s
     # Create a new wheel library for each of the different whls

     get_index_urls = None
-    if pip_attr.experimental_index_url:
+    if False:
         get_index_urls = lambda ctx, distributions: simpleapi_download(
             ctx,
             attr = struct(

@aignas aignas mentioned this issue Oct 21, 2024
10 tasks
github-merge-queue bot pushed a commit that referenced this issue Oct 21, 2024
This PR introduces a new `parse_modules` function in the
`pypi/extension.bzl`
code to mimic the structure of the `python` extension and to make it
easier to
write unit tests against the extension itself. I have also written a few
unit
tests to verify the generic structure.

Work towards #2268.
aignas added a commit to aignas/rules_python that referenced this issue Nov 7, 2024
…azelbuild#2325)

With this change we finally generate the same lock file within the
legacy code `pip.parse` code path and it allows to slowly transition to
using the new code path as much as possible without user doing anything.

This moves the selection of the host-compatible lock file from the
extension evaluation to the build phase - note, we will generate extra
repositories here which might not work on the host platform, however, if
the users are consuming the `whl_library` repos through the hub repo
only, then everything should work. A known issue is that it may break
`bazel query` and in these usecases it is advisable to use `cquery`
until we have `sdist` cross-building from source fully working.

Summary:
- feat: reuse the `render_pkg_aliases` for when filename is not known
  but platform is known
- feat: support generating the extra config settings required
- feat: `get_whl_flag_versions` now generates extra args for the rules
- feat: make lock file generation the same irrespective of the host
  platform
- test: add an extra test with multiple requirements files
- feat: support cross-platform builds using `download_only = True` in
  legacy setups

Note, that users depending on the naming of the whl libraries will need
to start using `extra_hub_aliases` attribute instead to keep their
setups not relying on this implementation detail.

Fixes bazelbuild#2268
Work towards bazelbuild#260

---------

Co-authored-by: Richard Levasseur <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment