Add clarification on what a list of licenses means #349
Comments
I think there is a case to be made to even deprecate the list of licenses entirely. This started as a discussion on Slack |
Supporting only SPDX license expressions is not an option. There are over 2500 open source licenses and SPDX only supports about 500 or so of them. The SPDX project also does not support any commercial licenses, so support for license names, along with attaching the full text of the license, is a requirement for any commercial BOM use case. Maven, IMO, has the most ambiguity of any modern build system. For a license list, this is what the Maven POM XSD states:
An interesting fact is that the last sentence does not appear in their documentation. We could strengthen the documentation to read AND or OR - whichever we choose. We could also add a conjunction field to the spec so that the BOM author can specify. |
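To make the "conjunction field" idea concrete, here is a minimal sketch of how a consumer might read a license list once the BOM author can state the operator. The `conjunction` argument is hypothetical and does not exist in any published CycloneDX schema; `describe_license_list` is an illustrative name. Only the `license.id` / `license.name` entry shape follows the current JSON format.

```python
from typing import Literal


def describe_license_list(licenses: list[dict], conjunction: Literal["AND", "OR"]) -> str:
    """Join the entries of a CycloneDX-style license list with an explicit operator.

    `conjunction` is a hypothetical field, assumed here for illustration only.
    """
    if conjunction not in ("AND", "OR"):
        raise ValueError("conjunction must be 'AND' or 'OR'")
    labels = []
    for entry in licenses:
        lic = entry["license"]
        # Prefer the SPDX id; fall back to the free-form name (e.g. commercial licenses).
        labels.append(lic.get("id") or lic["name"])
    return f" {conjunction} ".join(labels)


licenses = [
    {"license": {"id": "MIT"}},
    {"license": {"name": "Acme Commercial License"}},
]
print(describe_license_list(licenses, "OR"))
```

The result is a human-readable summary, not necessarily a valid SPDX expression, because named licenses are not SPDX identifiers.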
I always saw this as an "OR", as in "Here are some licenses that suit our project. Choose the one that applies in your area." Collection of examples:
|
I talked to someone who thought it is AND. I don't think either reading is more correct than the other, which makes it important to document what is meant. |
This is one case I have often seen. On the other side of the spectrum there are collections of code that were bundled together. To stay on pypi:
I'd like to use a SPDX expression in the SBOM, but we are required to include the full license text for each license. Which is why I am using the |
The change in the definition of allowed licenses from v1.4 to v1.5 raises a big problem for me (and, I assume, for everyone who processes Debian packages and similar sources). Debian packages may have content from different authors, and each contribution to a package may come with its own license or license expression. v1.4 is fine for that. But the restriction to either a list of single license items or a (compound?) SPDX expression is not compatible with the real world. Moreover, SPDX expressions are not only "compound expressions" (as many people might think); there are also "simple expressions" (LicenseRef-*) that identify a license. My impression is that v1.5 has a significant design flaw. https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/ A Debian package could be split by its stanzas, so that in theory I would have "file" type components instead of "libraries". But this would require enormous effort and would be neither practical nor helpful. And it does not resolve the use of "simple expression" SPDX expressions. |
@Joerki , could you give a practical example for something that is not possible with today's design? |
Hi @jkowalleck ,
With the separation between a list and a single expression, I see the following issues: with the expression, I don't see how to include a license text for a particular item of the expression. Example: https://metadata.ftp-master.debian.org/changelogs//main/o/openssl/openssl_3.0.15-1~deb12u1_copyright My conclusion: CycloneDX limits the use of SPDX expressions to cases where the creator has to draw a conclusion for a multi-licensed component in which they can choose between licenses (X OR Y) that have a known, standardized text that can be taken 1:1 from its original definition. |
I guess we can address all of your concerns if you have a look at #454. |
so the software itself is Apache-2, and some of its components are licensed under different licenses? or what does this file actually mean? |
Hi @jkowalleck , if you read Debian's copyright specification (link is given in my comment above) you can see what the stanzas (contents of a package) contain and how important the ordering of the stanzas is. Dear @stevespringett, you said in your comment:
This is wrong (at this time), and it makes me desperate when I refer to the SPDX documentation that explains what an "SPDX expression" is and the information in the links I gave above to the relevant SPDX documentation is not taken into account. So, please read and understand the SPDX documentation (at least the annex about SPDX expressions) and see that SPDX covers all kinds of licenses! SBOMs that contain proper licensing information are becoming highly relevant, because there are legal requirements set by the European Community CRA, and proper license attribution is part of them. The German Federal Office for Information Security is working on that topic: The referenced SBOM document (in English) also clearly describes how licensing shall be handled. They refer to the SPDX documentation as well. At this time it is a recommendation, but companies may well try to implement SBOM creation based on the BSI document. The software ecosystems of the world have established many different approaches to licensing their components, and the SBOM specifications need to accommodate them and prove their suitability. |
We already have software tools, software packages from many different ecosystems, and specifications in the world, such as the Syft tool, the Debian copyright format, and many other ecosystems with their own metadata. My experience with different software ecosystems is that when scanners emit lists, "AND" is the applicable reading for the items. In a multi-licensed component we also need to distinguish between software that has different contributors (as in Linux distributions, on a source file/directory basis, see Debian) and packages that come as a bundle (with the primary work of the authors plus software that comes, e.g., in binary form from other sources). |
I think we're getting off topic, however @Joerki I'm fully aware of the capabilities of SPDX license expressions. The reality is that SAM and ITAM systems that virtually every enterprise relies on, do not have hard requirements on commercial license identifiers being prefixed with I'm not aware of any commercial or open source license use case that CycloneDX does not support. At the present time, I believe it has the most comprehensive license support available. Please create a separate ticket for each issue if there are gaps you're seeing. However, this GitHub issue specifically was about the use of AND / OR when it comes to a list of licenses. With the limitations of SPDX expressions in mind, it may make sense to provide an option for the user to specify AND or OR thus creating their own verbose version of an expression. This would have the benefit of being able to be compatible with SPDX expressions as well as the SAM and ITAM systems that enterprises already use. |
👍 from me. This would probably be the most backwards-compatible solution, spec wise. From a legal perspective, we don't have a decision on "AND versus OR" in the current specs. And I would consider adding such a breaking change. From a tool-builder's perspective, I do not want to code a process that makes irrefutable/unassailable automated license conclusions (that's what well-educated lawyers are there for). The reason is that there are ecosystems that have not (yet) decided the "AND versus OR" question for multi-license packages. They require the user to read the README or other documents/evidence and come to a conclusion. (CycloneDX has all it needs to ship/document these things, too.) But giving the option to make a decision is great. What do others think? Schema-implementation wise, I already envision this to be challenging, since an "AND versus OR" thing would need a "concluded/declared" property, too. But that might be a problem for future me. :) |
Hi @stevespringett , @jkowalleck
Yes, I have the entire supply chain in mind, this is what I find in my company with very diverse ecosystems, from device firmware to cloud applications, software that is created by us or by suppliers.
An example: machine-readable Debian copyright files list the contributors of components together with the licenses (names and texts) and source code file sets. With V1.5 and V1.6 I don't see a way to always map this information to an SBOM without loss. We already have issues that discuss some limitations: #582, #454. Another ticket is #554, which for me targets the availability of license information in the context of an expression, like we have for license id/name items. It focuses on the license text, but ultimately it makes sense to have text, url, licensing and properties for expression item(s) like we already have for id/name items. At this time acknowledgement is the only one we have for both name/id and expression items.
The SPDX SBOM specification (I take v3 as reference) has other means to express sets of simple/composite license expressions and their relation to each other: the ConjunctiveLicenseSet and DisjunctiveLicenseSet classes. My experience with 3rd party scanners (e.g. "syft" with "syft-json" format) is that they generate SBOMs with licenses as a name/id list that can be interpreted as conjunctive. Disjunctive licenses have an explicit " OR " in an expression item. This is what I see in "real life".
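As a rough illustration of the conjunctive/disjunctive distinction, the sketch below models the two kinds of sets as plain Python dataclasses and renders them as an expression string. The `Conjunctive` and `Disjunctive` classes and the `render` helper are made up for this example and are only loosely inspired by the SPDX v3 class names; they are not an SPDX library API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conjunctive:
    """All members apply at once (AND). Illustrative only."""
    members: tuple


@dataclass(frozen=True)
class Disjunctive:
    """The recipient may choose one member (OR). Illustrative only."""
    members: tuple


def render(node, nested: bool = False) -> str:
    """Render a license id or a nested set as an expression string."""
    if isinstance(node, str):
        return node
    op = " AND " if isinstance(node, Conjunctive) else " OR "
    text = op.join(render(member, nested=True) for member in node.members)
    # Parenthesize nested sets so the grouping stays explicit.
    return f"({text})" if nested else text


# A scanner-style id list read as conjunctive, with one explicit choice inside it.
tree = Conjunctive(("MIT", Disjunctive(("GPL-2.0-only", "Apache-2.0"))))
print(render(tree))
```

With explicit set types like these, the reader never has to guess which operator a bare list implies.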
A colleague and I were a bit shocked when we read this in the additional (non-JSON) documentation, because especially legal information needs to be precise and clear.
It is a significant fallacy that license conclusion is put into the hands of lawyers. I attended several demos of SCA systems, including FOSSA, together with people from their company. They explained to us how to deal with multi-licensed components, and not a word was said about a lawyer in that context. I also had a short conversation with Philippe Ombredanne about multi-licensing and its difficulties. I do not expect that I can rely solely on software that fully automates the proper recognition and conclusion of licenses, but I want to have the chance to document the (manually) identified concluded license information in a target CycloneDX SBOM, based on information from different kinds of sources (SBOMs, copyright files etc.), without loss. |
Component.licenses has this text "EITHER (list of SPDX licenses and/or named licenses) OR (tuple of one SPDX License Expression)"
It is not made clear what a list of licenses means.
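There are at least two options: the component is licensed under all of the listed licenses (AND), or under any one of them (OR).
This ambiguity can be avoided by using SPDX license expressions, but if we get an SBOM with just a list, we need to make a decision without any further information.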
To be safe I would probably interpret it as AND in that case.
At least a comment should be added that this is undefined.
I would probably even go as far as saying that only a single license is allowed and if there are more an expression needs to be used.
I understand that almost all changes except a clarifying comment would be backwards breaking changes.
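To show where the ambiguity sits, here is a minimal sketch that sorts a `licenses` array into the shapes described above. The function name `classify_licenses` and its return labels are assumptions for illustration; only the `license` / `expression` entry keys come from the CycloneDX JSON format.

```python
def classify_licenses(licenses: list[dict]) -> str:
    """Classify a CycloneDX-style `licenses` array (illustrative labels only)."""
    if not licenses:
        return "none"
    if len(licenses) == 1 and "expression" in licenses[0]:
        # A single SPDX expression states AND/OR explicitly.
        return "expression"
    if all("license" in entry for entry in licenses):
        # One entry is unambiguous; two or more leave AND vs OR undefined.
        return "single" if len(licenses) == 1 else "ambiguous-list"
    # Mixed entries or several expressions do not match the EITHER/OR rule.
    return "invalid"


print(classify_licenses([{"expression": "MIT OR Apache-2.0"}]))
print(classify_licenses([{"license": {"id": "MIT"}}, {"license": {"id": "Apache-2.0"}}]))
```

A consumer that hits the "ambiguous-list" case has to pick a policy, for example treating the list as AND to be safe.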