Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider removing duplicate code in PyPI package name normalization #1330

Closed
aignas opened this issue Jul 19, 2023 · 2 comments
Closed

Consider removing duplicate code in PyPI package name normalization #1330

aignas opened this issue Jul 19, 2023 · 2 comments

Comments

@aignas
Copy link
Collaborator

aignas commented Jul 19, 2023

This is a followup to @chrislovecnm comment in #1329.

If #1329 lands, then there are the following parts where we have name sanitisation and normalization.

First are the requirement, whl_requirement, data_requirement, dist_info_requirement and entry_point macros are using a vendored copy of the function named _clean_name. We could just use the function introduced in #1329 most of the time, but we have to expose it either in //python/pip_install:normalize_name.bzl or somewhere else because as it is right now, it's invisible from the pip_parse hub repos.

Second is the Python implementation in //python/pip_install/tools/lib:bazel.py has a copy and we cannot remove it that easily. The only (semi) sensible way would be to change the implementation of the whl_library:

  • Starlark implementation declares a metadata.json output file that gets written by the wheel_installer.py script.
  • The script extracts or builds the wheel and outputs the metadata to the metadata.json file as opposed to creating a BUILD.bazel file.
  • Starlark implementation reads the metadata.json file and then constructs the BUILD.bazel file.

The benefit of the whl_library refactor would be that it would make the wheel installer a single purpose tool and we could pass the extra BUILD.bazel content or do patching within the starlark context and it would make annotation handling easier. The whole argument parsing for the wheel_installer.py is likely to also become simpler.

aignas added a commit to aignas/rules_python that referenced this issue Jul 22, 2023
Before this PR, the `wheel_installer` was doing three things:
1. Downloading the right wheel.
2. Extracting it into the output directory.
3. Generating BUILD.bazel files based on the extracted contents.

This PR is moving the third part into the `whl_library` repository rule
and it has the following benefits:
* We can reduce code duplication and label sanitization functions in
  rules_python.
* There are many things that the `wheel_installer` does not care anymore
  and we don't need to change less code when extending `whl_library` as
  we can now do many things in starlark directly.
* It becomes easier to change the API of how we expose the generated
  BUILD.bazel patching because we only need to change the Starlark
  functions.

Work towards bazelbuild#1330.
aignas added a commit to aignas/rules_python that referenced this issue Aug 1, 2023
Before this PR, the `wheel_installer` was doing three things:
1. Downloading the right wheel.
2. Extracting it into the output directory.
3. Generating BUILD.bazel files based on the extracted contents.

This PR is moving the third part into the `whl_library` repository rule
and it has the following benefits:
* We can reduce code duplication and label sanitization functions in
  rules_python.
* There are many things that the `wheel_installer` does not care anymore
  and we don't need to change less code when extending `whl_library` as
  we can now do many things in starlark directly.
* It becomes easier to change the API of how we expose the generated
  BUILD.bazel patching because we only need to change the Starlark
  functions.

Work towards bazelbuild#1330.
github-merge-queue bot pushed a commit that referenced this issue Aug 2, 2023
Before this PR, the `wheel_installer` was doing three things:
1. Downloading the right wheel.
2. Extracting it into the output directory.
3. Generating BUILD.bazel files based on the extracted contents.

This PR is moving the third part into the `whl_library` repository rule
and it has the following benefits:
* We can reduce code duplication and label sanitization functions in
  rules_python.
* There are many things that the `wheel_installer` does not care anymore
  and we don't need to change less code when extending `whl_library` as
  we can now do many things in starlark directly.
* It becomes easier to change the API of how we expose the generated
  BUILD.bazel patching because we only need to change the Starlark
  functions.

Work towards #1330.
@aignas
Copy link
Collaborator Author

aignas commented Aug 5, 2023

The only remaining part for this is to use the normalize_name in the hub macros, but I'll leave this for a later time because there is a discussion in #979 on the potential replacement for the repository macros.

@aignas
Copy link
Collaborator Author

aignas commented Nov 20, 2023

Completed via #1542

@aignas aignas closed this as completed Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant