Consider adding internal archive manifest #17

Open
ncoghlan opened this issue Oct 1, 2024 · 1 comment
Labels
Category: Enhancement New feature or request

Comments

@ncoghlan
Collaborator

ncoghlan commented Oct 1, 2024

Python's wheel format (and package installation records in general) supports recording full internal archive manifests, along with the expected hashes of the included files. That internal manifest can optionally be signed with a JSON Web Signature (although publicly available wheel files are almost never signed - the feature is intended more for privately built wheel archives with very specific deployment environments).

venvstacks intentionally removes these RECORD files, mostly for reproducibility reasons (since some of the hashes may relate to files that contain absolute paths to the build environment), but also to make it less likely that regular Python package management tools will attempt to manipulate the environment contents.

To replace these removed files, venvstacks could create its own installation manifest at share/venv/metadata/RECORD.

To minimise the RECORD file size, an adjacent JSON file would be used to specify the relative base path for record entries (since base runtime environments would want to use the root folder, while layered environments would want to use the site-packages folder).
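
As a rough illustration of the idea (not an actual venvstacks implementation), the sketch below writes a wheel-style RECORD (CSV rows of relative path, `sha256=` urlsafe base64 digest without padding, and size) together with an adjacent JSON file naming the base path. The `RECORD.json` file name, the `base_path` key, and the `build_record` and `write_manifest` helpers are all hypothetical names chosen for this example.

```python
import base64
import csv
import hashlib
import io
import json
from pathlib import Path


def record_hash(data: bytes) -> str:
    """Format a SHA-256 digest as wheel RECORD files do (urlsafe base64, no padding)."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_record(base_dir: Path, exclude: frozenset = frozenset()) -> str:
    """Return RECORD-style CSV text, with entry paths relative to base_dir."""
    rel_paths = sorted(
        p.relative_to(base_dir).as_posix()
        for p in base_dir.rglob("*")
        if p.is_file() and p not in exclude
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for rel in rel_paths:
        data = (base_dir / rel).read_bytes()
        writer.writerow([rel, record_hash(data), len(data)])
    return buffer.getvalue()


def write_manifest(env_dir: Path, base_path: str) -> Path:
    """Write share/venv/metadata/RECORD plus an adjacent JSON base path file."""
    metadata_dir = env_dir / "share" / "venv" / "metadata"
    metadata_dir.mkdir(parents=True, exist_ok=True)
    record_path = metadata_dir / "RECORD"
    info_path = metadata_dir / "RECORD.json"  # hypothetical file name
    # The manifest files cannot list their own hashes, so they are skipped
    record_text = build_record(env_dir / base_path, frozenset({record_path, info_path}))
    record_path.write_bytes(record_text.encode("utf-8"))
    # "base_path" is a hypothetical key: the folder the RECORD entries are relative to
    info_path.write_text(json.dumps({"base_path": base_path}) + "\n", encoding="utf-8")
    return record_path
```

A base runtime environment might call `write_manifest(env_dir, ".")`, while a layered environment would pass the relative path to its site-packages folder instead, keeping each RECORD entry short.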

@ncoghlan ncoghlan added the Category: Enhancement New feature or request label Oct 1, 2024
@ncoghlan ncoghlan transferred this issue from another repository Oct 18, 2024
@ncoghlan
Collaborator Author

Note that even if #28 means that the original RECORD files remain mostly intact, there are still additional files in the published archives that those files don't capture (like the injected postinstall.py script and sitecustomize.py module).

However, keeping the original RECORD files would mean that the archive level RECORD could just store the hashes for those files, rather than repeating all the individual file hashes for the distribution package contents.
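
One possible reading of that approach, sketched under assumptions: the archive level manifest hashes each retained `*.dist-info/RECORD` file plus any injected extras, and leaves the per-file hashes to the original RECORD files. The `archive_level_entries` helper is a hypothetical name, and the sketch assumes every extra file lives under the same base folder as the distribution metadata.

```python
import base64
import hashlib
from pathlib import Path


def record_hash(data: bytes) -> str:
    """Format a SHA-256 digest as wheel RECORD files do (urlsafe base64, no padding)."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def archive_level_entries(site_packages: Path, extra_files: list) -> list:
    """Return (relative path, hash, size) rows for RECORD files and extras only."""
    # Hash the per-package RECORD files rather than every file they already list
    targets = sorted(site_packages.glob("*.dist-info/RECORD")) + sorted(extra_files)
    entries = []
    for path in targets:
        data = path.read_bytes()
        rel = path.relative_to(site_packages).as_posix()
        entries.append((rel, record_hash(data), len(data)))
    return entries
```

Verifying an archive would then be a two-step check: confirm each original RECORD file against the archive level entry, then confirm the package contents against those RECORD files.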
