Clarification: reuse spdx and FileCopyrightText values containing XML entities #1126

Open
andreashaerter opened this issue Jan 6, 2025 · 1 comment


Hi,

Small questions about FileCopyrightText: <text>...</text> and values containing < and >:

1) Do they need encoding? If so, there is a bug.

Example: a REUSE.toml that includes SPDX-FileCopyrightText = "foundata GmbH <https://foundata.com>". (The repository adheres to the latest REUSE specification.)

In the generated SPDX SBOM file, the values inside the <text> tags are neither escaped nor entity-encoded:

$ reuse --version
reuse, version 5.0.2
[...]

$ reuse spdx -o reuse.spdx

$ cat reuse.spdx 
SPDXVersion: SPDX-2.1
DataLicense: CC0-1.0
SPDXID: SPDXRef-DOCUMENT
DocumentName: roundcube-plugin-identity-from-directory
DocumentNamespace: http://spdx.org/spdxdocs/spdx-v2.1-c4eac0dc-c74e-4557-a539-d7f54f333264
[...]
FileName: ./.gitattributes
SPDXID: SPDXRef-ee335379b1f46b4b39b73b6e6ba14c29
FileChecksum: SHA1: 18a9c4961e2e823f58f95a87995fd5cc544c8f0c
LicenseConcluded: NOASSERTION
LicenseInfoInFile: GPL-3.0-or-later
FileCopyrightText: <text>foundata GmbH <https://foundata.com></text>

I am not familiar with the details of the SPDX 2.1 file format, so I can only guess that <text> values need XML entity encoding, i.e. replacing <https://foundata.com></text> with &lt;https://foundata.com&gt;</text>
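To illustrate the encoding I have in mind, here is a minimal Python sketch. The helper name `encode_text_value` is hypothetical and not part of reuse; it only shows what entity-encoding the value would look like, under the unconfirmed assumption that SPDX 2.1 tag-value output expects it at all:

```python
from xml.sax.saxutils import escape


def encode_text_value(value: str) -> str:
    """Hypothetical helper: entity-encode &, < and > and wrap in <text> tags.

    Assumption: the consumer of the tag-value file expects XML-style
    entities inside <text>...</text>. That is exactly the open question here.
    """
    return f"<text>{escape(value)}</text>"


line = "FileCopyrightText: " + encode_text_value(
    "foundata GmbH <https://foundata.com>"
)
print(line)
# FileCopyrightText: <text>foundata GmbH &lt;https://foundata.com&gt;</text>
```

If no encoding is required by the format, this step would be unnecessary and could even alter the recorded copyright text.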

Counterpoint: as far as I can tell, copyrightText in SPDX 3.0.1 has the Range xs-string, which does not need encoding. Is this correct?

2) If no encoding is needed: At least strip </text>?

Even if no entity encoding is needed, I can still put </text> into the values (e.g. SPDX-PackageSupplier = "foundata GmbH</text> <https://foundata.com>").

This results in FileCopyrightText: <text>foundata GmbH</text> <https://foundata.com></text> (as expected), which leads to validation errors, at least with pyspdxtools_parser v0.7.1 (also somewhat expected):

$ pyspdxtools_parser --file reuse.spdx 
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 30
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 37
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 44
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 51
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 58
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 65
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 72
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 79
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 86
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 93
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 100
FileCopyrightText must be one of NOASSERTION, NONE, free form text or single line of text,line: 107
Errors while parsing:  True
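A minimal sketch of the stripping idea from question 2, assuming that removing a literal </text> from the value is acceptable. The function names are hypothetical and this is not how reuse currently behaves:

```python
CLOSING_TAG = "</text>"


def strip_closing_tag(value: str) -> str:
    """Hypothetical helper: remove every literal </text> from a value.

    Loops until none is left, so that fragments such as "</te</text>xt>"
    cannot recombine into a new closing tag after a single pass.
    """
    while CLOSING_TAG in value:
        value = value.replace(CLOSING_TAG, "")
    return value


def wrap_text_value(value: str) -> str:
    """Wrap the sanitized value so only one closing tag ends the block."""
    return f"<text>{strip_closing_tag(value)}</text>"


print(wrap_text_value("foundata GmbH</text> <https://foundata.com>"))
# <text>foundata GmbH <https://foundata.com></text>
```

Whether silently dropping the tag is preferable to rejecting such a value with an error is a design choice I would leave to the maintainers.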

3) What is the recommendation?

We are used to wrapping plain URLs in angle brackets (<URI>), as has been recommended for plain-text emails and Markdown for decades. However, it seems this might be a bit problematic for SPDX-PackageSupplier. What do you recommend? Should this be mentioned in the docs?

@andreashaerter andreashaerter changed the title Clarfifcation: reuse spdx and FileCopyrightText values containing XML entities Clarification: reuse spdx and FileCopyrightText values containing XML entities Jan 6, 2025

andreashaerter commented Jan 6, 2025

This might be related to #947 and might be fixed by #394? :)
