✨ Add --detect-licenses flag #45

Draft: 2 commits to be merged into base branch master

ddelange (Owner) commented Feb 8, 2021

This draft will most likely not be finished, as the licensing topic is a rabbit hole that is practically impossible to get right due to the lack of strictness in the ecosystem.

As pipgrip only has access to wheels, many licenses will not be present (see code comments for examples), which would call for a source distribution fallback (not trivial).

Assuming authors distribute their packages correctly, legal files should be present in wheels (ref https://wheel.readthedocs.io/en/stable/user_guide.html#including-license-files-in-the-generated-wheel-file found at https://jwodder.github.io/kbits/posts/pypkg-mistakes/#top-level-readme-or-license-file-in-wheel). Sadly, this is not the case in practice: even pip's vendored licenses aren't reproduced in the pip wheel.
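
For illustration, here is a minimal sketch of how one might check whether a wheel ships any legal files at all. It is not the code in this PR: the function name `find_legal_files` and the filename pattern are assumptions, and real projects name these files in many other ways.

```python
import io
import re
import zipfile

# Assumed filename pattern for legal files; real projects vary widely.
LEGAL_FILE_RE = re.compile(r"^(LICEN[CS]E|COPYING|NOTICE|AUTHORS)", re.IGNORECASE)


def find_legal_files(wheel):
    """Return paths of likely legal files inside a wheel's .dist-info directory.

    `wheel` is a path or a binary file-like object (a wheel is a zip archive).
    """
    found = []
    with zipfile.ZipFile(wheel) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Only look inside `<name>-<version>.dist-info/`, including
            # subfolders such as `licenses/`.
            if len(parts) < 2 or not parts[0].endswith(".dist-info"):
                continue
            if LEGAL_FILE_RE.match(parts[-1]):
                found.append(name)
    return sorted(found)


# Build a tiny fake wheel in memory to try the function out.
demo_wheel = io.BytesIO()
with zipfile.ZipFile(demo_wheel, "w") as zf:
    zf.writestr("demo/__init__.py", "")
    zf.writestr("demo-1.0.dist-info/METADATA", "Name: demo\n")
    zf.writestr("demo-1.0.dist-info/LICENSE.txt", "MIT")
demo_wheel.seek(0)
print(find_legal_files(demo_wheel))
```

An empty result only means nothing matching the assumed pattern was found in the wheel; it says nothing about which license applies.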

@ddelange ddelange force-pushed the recursive-licenses branch 2 times, most recently from 14c238f to 5b60da8 Compare February 9, 2021 11:04
jdvala commented Jun 19, 2021

@ddelange I suppose the licence information would be available somewhere, right? On the repo level, I mean. If setup.py contains repo information, we can use APIs to query the licence from there. Just saying.

ddelange (Owner, Author) commented Jun 24, 2021

Hi @jdvala 💥

Indeed, the licence can often be found in the (metadata of the) repo. The repo (hosted source) can be any type of VCS hosting, or even a plain HTML sitemap or so.

Elaborating on the inline comment: technically speaking, if the licence is missing from the wheel (the distribution that pipgrip installs and that the user executes), then for most licenses that counts as a failure to reproduce the license. This violation aside, there is technically no guarantee that a license picked up from another distribution (e.g. the hosted source, or an sdist downloaded from PyPI) corresponds to the distribution on your system. Usually, a licence is valid only in full text, delivered alongside the actual distribution or embedded in each file. There are also other legal files like AUTHORS, which might be required to build a complete and valid 'licence info package' (so more than just 'pipgrip': 'BSD-3') for a distribution you want to run.

Some existing tools I've seen provide a 'confidence level' for their licence labels, and mostly can't back that up with the licence full text for that specific version. I guess under the 'something is better than nothing' philosophy, and given the lack of licensing standardisation in the Python ecosystem, falling back to e.g. hosted source, PyPI warehouse metadata, or source distributions (sdist) is the best alternative currently.
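
To make the fallback idea concrete, a rough sketch of such a chain could look like the following. Everything here is hypothetical and not part of pipgrip: the source names, the `LicenseGuess` type, the `resolve_license` function and the two confidence labels are assumptions for illustration only.

```python
from typing import NamedTuple, Optional

# Hypothetical source names, ordered from most to least trustworthy.
SOURCES = ("wheel", "sdist", "pypi_metadata", "hosted_source")


class LicenseGuess(NamedTuple):
    label: str          # e.g. "BSD-3-Clause"
    source: str         # which source supplied the label
    confidence: str     # "high" only when found in the installed wheel
    has_fulltext: bool  # whether the full licence text came with the label


def resolve_license(found: dict) -> Optional[LicenseGuess]:
    """Pick the first source in SOURCES that yielded a licence label.

    `found` maps a source name to a dict like
    {"label": "MIT", "fulltext": "..."}; missing sources are skipped.
    """
    for source in SOURCES:
        entry = found.get(source)
        if not entry or not entry.get("label"):
            continue
        # Anything not taken from the wheel itself may not correspond to
        # the distribution that actually ends up on the user's system.
        confidence = "high" if source == "wheel" else "low"
        return LicenseGuess(
            entry["label"], source, confidence, bool(entry.get("fulltext"))
        )
    return None


print(resolve_license({"pypi_metadata": {"label": "MIT"}}))
```

Tracking `has_fulltext` separately from the label reflects the point above: a label on its own is not the same as a reproduced licence.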

ddelange (Owner, Author) commented Jul 2, 2024

To get an overview of license information present in wheel metadata:

pip install -r requirements.txt -qq --no-deps --ignore-installed --disable-pip-version-check --dry-run --report - | jq -c '.install[].metadata | [.name, .license, (try .classifier | map(select(contains("License"))))]'
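
The same overview can be sketched in Python for cases where jq is not available. This assumes the report was written to a file (e.g. `--report report.json` instead of `--report -`) and relies on the same `install[].metadata` keys the jq filter uses; the function name `license_overview` is made up for this example.

```python
import json


def license_overview(report: dict) -> list:
    """Return [name, license, license classifiers] for each entry in a pip report."""
    rows = []
    for item in report.get("install", []):
        meta = item.get("metadata", {})
        # `classifier` may be absent; treat that as an empty list.
        classifiers = [c for c in meta.get("classifier") or [] if "License" in c]
        rows.append([meta.get("name"), meta.get("license"), classifiers])
    return rows


# Hand-written sample mimicking the relevant part of a pip report.
sample_report = {
    "install": [
        {
            "metadata": {
                "name": "example-package",
                "license": "MIT",
                "classifier": [
                    "License :: OSI Approved :: MIT License",
                    "Programming Language :: Python :: 3",
                ],
            }
        }
    ]
}
for row in license_overview(sample_report):
    print(json.dumps(row))
```

With a real report, `json.load()` on the report file would supply the `report` argument.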
