Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to use version ranges for package-configurations #9918

Open
MarcelBochtler opened this issue Feb 11, 2025 · 8 comments
Open

Provide a way to use version ranges for package-configurations #9918

MarcelBochtler opened this issue Feb 11, 2025 · 8 comments
Labels
configuration About configuration topics

Comments

@MarcelBochtler
Copy link
Member

MarcelBochtler commented Feb 11, 2025

When scanning ORT as a package with ORT, one finding always shows up:

"LicenseRef-scancode-agpl-generic-additional-terms" to SpdxConstants.NOASSERTION,

This line is being detected as AGPL-3.0-or-later by scancode.

As this is not a license declaration, I am removing the finding with a package-configuration.

package_configurations:
- id: "Maven:org.ossreviewtoolkit:model:51.0.0"
  source_artifact_url: "https://repo.maven.apache.org/maven2/org/ossreviewtoolkit/model/51.0.0/model-51.0.0-sources.jar"
  license_finding_curations:
  - path: "config/ScannerConfiguration.kt"
    start_lines: "64"
    line_count: 1
    detected_license: "AGPL-3.0-or-later"
    concluded_license: NONE
    reason: CODE
    comment: "..."

This works well, and I like that I can remove this single finding in exactly this version.

However, I have to do that weekly, because I want to update ORT as a dependency as soon as possible, and every update requires me to write a new package configuration.

The only workaround to remove this finding for good is to overwrite every detected license for Maven:org.ossreviewtoolkit:model:, by just setting a concluded license with a version range.
Using a concluded license has multiple unwanted side effects, which is why I don't want to use them.

What do you think about implementing a way to achieve the effect of a version range also for package-configurations?

@MarcelBochtler MarcelBochtler added the to triage Issues that need triaging label Feb 11, 2025
@fviernau
Copy link
Member

For what reason are you trying to eliminate the license? (e.g use cases)

@MarcelBochtler
Copy link
Member Author

MarcelBochtler commented Feb 11, 2025

For what reason are you trying to eliminate the license? (e.g use cases)

Our ruleset disallows any usage of packages that are licensed as AGPL-3.0-or-later.

But does this really matter?
The finding is false-positive, as the file is not licensed as AGPL-3.0-or-later, so IMO I should be able to remove the finding.

@tsteenbe
Copy link
Member

This has been an annoying issue since package configurations were introduces - remember at HERE we have tens of configurations for python's setup tools. Don't get me wrong this limitations was knowingly introduced. Would be in favor to introducing configurations that work across version but would only support it if can ensure we can keep the same level of trust in ORT. As configurations overwrite scan findings and I would like be opinionated and protect users against themselves in this case by:

  • Supporting version ranges use in package configurations id only if user specifies narrow matchers (path + start_lines + line_count) so there is higher chance it only matches what it suppose to match.
  • Prior to implementation of a solution to this issue the presence of package configurations is made clear in any reports

@MarcelBochtler
Copy link
Member Author

MarcelBochtler commented Feb 11, 2025

As configurations overwrite scan findings and I would like be opinionated and protect users against themselves in this case by:

* Supporting version ranges use in package configurations id only if user specifies narrow matchers (path + start_lines + line_count) so there is higher chance it only matches what it suppose to match.

* Prior to implementation of a solution to this issue the presence of package configurations is made clear in any reports

I'm not sure if this should be required.

For curations, this is not required, and at the moment I can just add the following to my .ort.yml (or a global curation), and be done with it (when using the LicenseView.CONCLUDED_OR_DECLARED_AND_DETECTED):

curations:
  packages:
    - id: "Maven:org.ossreviewtoolkit:model:"
      curations:
        concluded_license: Apache-2.0
        comment: "..." 

But as mentioned above this is bad, and quite dangerous to do.
Which is the whole reason why I want to have the possibility to have some kind of version ranges for package configurations.

@sschuberth
Copy link
Member

What do you think about implementing a way to achieve the effect of a version range also for package-configurations?

I wanted to have this for a long time as well, also for consistency with package curations, and I believe we've talked about the wish multiple times already. I do not see a reason to not implement it finally.

@sschuberth sschuberth added configuration About configuration topics and removed to triage Issues that need triaging labels Feb 12, 2025
@fviernau
Copy link
Member

fviernau commented Feb 13, 2025

I still believe, we should not implement this and instead require the user to create a separate dedicated package configuration for each package version. And instead spent the effort in creating a helper which auto-generates a package configuration based on an already existing one. This keeps file organization simpler, package configuration simpler, slef contained, easier to review and veryify. You can automatically remove non-matching entries and so forth. Export a list of bugs for false detection fixes.

@MarcelBochtler
Copy link
Member Author

MarcelBochtler commented Feb 14, 2025

I still believe, we should not implement this and instead require the user to create a separate dedicated package configuration for each package version.

My main issue with this approach is, that this is quite a bad user experience when using ORT via CI/CI (either via ORT server, or via a GH action directly).
We are using renovate bot to update versions, but we cannot merge the PR without touching the package configurations, making it a pain to use ORT.

Remember: There is no issue here, we are trying to fix a false-positive finding.

And instead spent the effort in creating a helper which auto-generates a package configuration based on an already existing one.

The issue is not that it is hard to create a package-configuration, the issue is, that a new package-configuration it is required at all in this case.

This keeps file organization simpler, package configuration simpler, slef contained, easier to review and veryify. You can automatically remove non-matching entries and so forth. Export a list of bugs for false detection fixes.

I don't agree with every of these points, but I think that these don't justify a worse user experience.

Edit: Additionally, we already have this functionality for curations, which is already used by users as a bad fallback for this missing feature for package-configurations.

@tsteenbe
Copy link
Member

@MarcelBochtler Don't get me wrong I want to support version ranges in package configurations but it has to be implemented in the right way else it may erode user trust in ORT.

The use of a concluded license curation in a ORT scan is currently easy to spot but the use of package configuration with license finding curations is not. It's not about whether something is already possible but whether ORT is transparent about the fact license findings have been altered. When license finding curations in package configurations were implemented we deliberately made the choice to cap it what was possible as we didn't have a good way to communicate their existence throughtout an entire ORT toolchain like we have for curations.

My point is about trust in ORT results - we should make clear to anyone reading an ORT report that scanner findings / facts have been altered so in case things go wrong its easy for stakeholders to trace back when and where it happened so they can hopefully ensure it doesn't happen again. Being able to trace down things is not only useful for us ORT devs but it's a essential if you are being sued.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
configuration About configuration topics
Projects
None yet
Development

No branches or pull requests

4 participants