support inline GPG signatures for repomd.xml #1

Open
ktdreyer opened this issue Mar 22, 2021 · 49 comments

Comments

@ktdreyer

DNF can optionally verify the GPG signature in a repomd.xml.asc file.

Unfortunately in many RPM distribution systems, that repomd.xml.asc file can be slightly out-of-sync from the repomd.xml file (example), either due to operator error, or the simple fact that most mirror updates are not fully atomic.

Debian faced a similar problem with their Release files' detached signatures, so they developed a new InRelease file format where the contents are inline-signed. http://www.chiark.greenend.org.uk/~cjwatson/blog/no-more-hash-sum-mismatch-errors.html

It would be great for DNF to support a new inline-signed format for repomd.xml. Maybe name it signed-repomd.xml.

@Conan-Kudo
Member

I could see a repomd+gpgasc.xml being defined for an inline GPG-signed repomd.xml file.

@DemiMarie

DemiMarie commented Aug 9, 2021

Proposed format:

<?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature="base64URL-encoded signature"?><meta>...</meta>

This stuffs the signature in an XML processing instruction. The literal byte sequence <?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature=" MUST be the start of the file. The signature MUST be followed by the literal bytes "?>< followed by an ASCII letter. Therefore, an inline-signed XML file cannot include a DTD.

Base64 padding is not allowed. Unused bits in the last byte of the base64 MUST be zero. The signature MUST be v4 or later and of type 0, and MUST be encoded as a single minimum-length, old-format packet. Verifiers MUST check that the signature meets these criteria, that all of its MPIs are well-formed according to the OpenPGP standard, and that if there are any unhashed subpackets at all, there is exactly one unhashed subpacket of the form 0x9 0x10 <8-byte Key ID>. There MUST be exactly one 0x10 subpacket in the signature. All other subpackets MUST be in the hashed section. The signature MUST include an 0x33 fingerprint subpacket.

The signature algorithm MUST be at least 128 bits in strength. SHA-1, MD5, and RIPEMD160 are thus expressly forbidden and MUST be rejected.

The signature is taken over the entire file, except only the signature bytes themselves. Therefore, it includes the lead sequence <?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature=" and the subsequent "?>. No canonicalization is performed. A verifier for PGP-signed XML as described in this comment MUST verify the signature before attempting to parse the document as XML.
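
A minimal sketch of the framing rules above, in Python. It only splits the raw bytes and decodes the signature; checking the OpenPGP packet itself (version, subpackets, MPIs) and verifying it against a key are left out. The function name `split_inline_signed` is hypothetical and not part of any existing library.

```python
import base64
import re

PRELUDE = b'<?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature="'
B64URL = re.compile(rb"[A-Za-z0-9_-]+")


def split_inline_signed(data: bytes) -> tuple[bytes, bytes]:
    """Return (signature_packet, signed_bytes) without parsing any XML."""
    if not data.startswith(PRELUDE):
        raise ValueError("file does not start with the required prelude")
    end = data.find(b'"', len(PRELUDE))
    if end == -1:
        raise ValueError("unterminated signature attribute")
    sig_b64 = data[len(PRELUDE):end]
    rest = data[end:]
    # The signature must be followed by '"?><' and then an ASCII letter.
    if not re.match(rb'"\?><[A-Za-z]', rest):
        raise ValueError("signature is not followed by the required bytes")
    if not B64URL.fullmatch(sig_b64):
        raise ValueError("signature is not unpadded base64url")
    if len(sig_b64) % 4 == 1:
        raise ValueError("impossible base64 length")
    # Restore padding only for decoding, then re-encode: a mismatch means
    # the unused trailing bits were not zero.
    padded = sig_b64 + b"=" * (-len(sig_b64) % 4)
    sig = base64.urlsafe_b64decode(padded)
    if base64.urlsafe_b64encode(sig).rstrip(b"=") != sig_b64:
        raise ValueError("unused base64 bits are not zero")
    # Signed data is the whole file minus the signature bytes themselves.
    return sig, PRELUDE + rest
```

Only after the returned signature verifies over the returned bytes would the document be handed to an XML parser.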

@jcpunk
Contributor

jcpunk commented Aug 9, 2021

Could we perhaps get the keyid in the stanza? With this method it appears I could put multiple signatures (which sounds good to me) and having a clear way to see which keys signed a given repomd without the need to parse the signature would be handy....

@Conan-Kudo
Member

This stuffs the signature in an XML processing instruction. The literal byte sequence <?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature=" MUST be the start of the file. The signature MUST be followed by the literal bytes "?>< followed by an ASCII letter. Therefore, an inline-signed XML file cannot include a DTD.

We probably want to avoid having more XML documents that can't be validated. Something that @dmach and I are working on is figuring out how to fully support validating the XML documents with DTD and schema files. I don't know if we want to preclude that from this entirely. If the document is structured so that we sign everything below the DTD line, then it would probably work. We need some way to validate the correctness of the document before pulling values out of the document, or otherwise we would be permitting malformed input into something that is verifying signatures.

The signature algorithm MUST be at least 128 bits in strength. SHA-1, MD5, and RIPEMD160 are thus expressly forbidden and MUST be rejected.

We have to be careful here. Specifying specific algorithms to reject will get us into thorny messes later, as we did with MD4->MD5, MD5->SHA1, and now SHA1->SHA2.

Could we perhaps get the keyid in the stanza? With this method it appears I could put multiple signatures (which sounds good to me) and having a clear way to see which keys signed a given repomd without the need to parse the signature would be handy....

Are you asking for this to have a way to fetch the key to verify the signature without having the files locally? I'm not sure that's a use-case we want to support.

@jcpunk
Contributor

jcpunk commented Aug 9, 2021

Are you asking for this to have a way to fetch the key to verify the signature without having the files locally? I'm not sure that's a use-case we want to support.

Perhaps something like keyid="3782CBB60147010B330523DD26FBCC7836BF353A" or keyid="36BF353A"?

@DemiMarie

This stuffs the signature in an XML processing instruction. The literal byte sequence <?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature=" MUST be the start of the file. The signature MUST be followed by the literal bytes "?>< followed by an ASCII letter. Therefore, an inline-signed XML file cannot include a DTD.

We probably want to avoid having more XML documents that can't be validated. Something that @dmach and I are working on is figuring out how to fully support validating the XML documents with DTD and schema files. I don't know if we want to preclude that from this entirely. If the document is structured so that we sign everything below the DTD line, then it would probably work.

Allowing an unsigned DTD is an absolutely horrible idea. Inline DTDs are the majority of XML attack surface, which is why this specification explicitly forbids them. External DTDs and/or schemas are fine, but the signature checking MUST happen before any XML parsing, to prevent bugs in the XML parser from being exploitable by someone who does not have the secret part of a trusted key.

We need some way to validate the correctness of the document before pulling values out of the document, or otherwise we would be permitting malformed input into something that is verifying signatures.

This specification deliberately does not require that the data being signed is well-formed XML, and that is intentional. Libxml2 is a large library with substantial attack surface, so one of my goals when writing this specification is to ensure that a document is not passed to libxml2 until the signature is verified. That’s why the specification makes no reference to the XML standard. In fact, it could theoretically be used to sign data that is not XML, although I cannot see a practical reason to do so.

The signature algorithm MUST be at least 128 bits in strength. SHA-1, MD5, and RIPEMD160 are thus expressly forbidden and MUST be rejected.

We have to be careful here. Specifying specific algorithms to reject will get us into thorny messes later, as we did with MD4->MD5, MD5->SHA1, and now SHA1->SHA2.

The purpose of this restriction is to impose a mandatory minimum strength on all implementations; I expect that the minimum may well be raised in the future. SHA-1, MD5, and RIPEMD160 are all rejected because they are too weak, not because they are listed explicitly. I included them as examples only.
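
A sketch of how such a minimum could be expressed as a threshold rather than a deny-list, covering only the hash part of the signature algorithm. The hash algorithm IDs are those assigned in RFC 4880; estimating collision resistance as half the digest length is a simplifying assumption, and `MIN_STRENGTH_BITS` and `hash_is_acceptable` are hypothetical names.

```python
# OpenPGP hash algorithm ID -> (name, digest size in bits), per RFC 4880.
HASH_ALGORITHMS = {
    1: ("MD5", 128),
    2: ("SHA-1", 160),
    3: ("RIPEMD-160", 160),
    8: ("SHA-256", 256),
    9: ("SHA-384", 384),
    10: ("SHA-512", 512),
    11: ("SHA-224", 224),
}

# Assumed policy knob; raising it later needs no change to the check itself.
MIN_STRENGTH_BITS = 128


def hash_is_acceptable(algo_id: int, minimum: int = MIN_STRENGTH_BITS) -> bool:
    """Accept a hash only if its nominal collision resistance meets the minimum."""
    entry = HASH_ALGORITHMS.get(algo_id)
    if entry is None:
        return False  # unknown algorithms are rejected
    _name, digest_bits = entry
    # Generic collision attacks cost about 2^(n/2) for an n-bit digest.
    return digest_bits // 2 >= minimum
```

Under this rule SHA-1, MD5, and RIPEMD-160 fall below the threshold without being named, and so does SHA-224.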

Could we perhaps get the keyid in the stanza? With this method it appears I could put multiple signatures (which sounds good to me) and having a clear way to see which keys signed a given repomd without the need to parse the signature would be handy....

Are you asking for this to have a way to fetch the key to verify the signature without having the files locally? I'm not sure that's a use-case we want to support.

Neil is correct here; this is not a use-case that should be supported.

@Conan-Kudo
Member

This stuffs the signature in an XML processing instruction. The literal byte sequence <?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature=" MUST be the start of the file. The signature MUST be followed by the literal bytes "?>< followed by an ASCII letter. Therefore, an inline-signed XML file cannot include a DTD.

We probably want to avoid having more XML documents that can't be validated. Something that @dmach and I are working on is figuring out how to fully support validating the XML documents with DTD and schema files. I don't know if we want to preclude that from this entirely. If the document is structured so that we sign everything below the DTD line, then it would probably work.

Allowing an unsigned DTD is an absolutely horrible idea. Inline DTDs are the majority of XML attack surface, which is why this specification explicitly forbids them. External DTDs and/or schemas are fine, but the signature checking MUST happen before any XML parsing, to prevent bugs in the XML parser from being exploitable by someone who does not have the secret part of a trusted key.

Hell no to inline DTDs. They'd be externally referenced ones. We can guarantee what the starting structure looks like to avoid XML parsing before signature parsing, but I want the signature to be part of the schema itself too.

We need some way to validate the correctness of the document before pulling values out of the document, or otherwise we would be permitting malformed input into something that is verifying signatures.

This specification deliberately does not require that the data being signed is well-formed XML, and that is intentional. Libxml2 is a large library with substantial attack surface, so one of my goals when writing this specification is to ensure that a document is not passed to libxml2 until the signature is verified. That’s why the specification makes no reference to the XML standard. In fact, it could theoretically be used to sign data that is not XML, although I cannot see a practical reason to do so.

However, we want it to be well-formed XML.

The signature algorithm MUST be at least 128 bits in strength. SHA-1, MD5, and RIPEMD160 are thus expressly forbidden and MUST be rejected.

We have to be careful here. Specifying specific algorithms to reject will get us into thorny messes later, as we did with MD4->MD5, MD5->SHA1, and now SHA1->SHA2.

The purpose of this restriction is to impose a mandatory minimum strength on all implementations; I expect that the minimum may well be raised in the future. SHA-1, MD5, and RIPEMD160 are all rejected because they are too weak, not because they are listed explicitly. I included them as examples only.

That is not something that we specify at this level, though. We must never make it so that a consumer can't read something that already exists in the wild by default. We can certainly start off with strong algorithms up front, but keep in mind, we can literally never reject algorithms that change from being considered strong to being considered weak by default.

@DemiMarie

This stuffs the signature in an XML processing instruction. The literal byte sequence <?xml version="1.0" encoding="UTF-8"?><?pgp version="1" signature=" MUST be the start of the file. The signature MUST be followed by the literal bytes "?>< followed by an ASCII letter. Therefore, an inline-signed XML file cannot include a DTD.

We probably want to avoid having more XML documents that can't be validated. Something that @dmach and I are working on is figuring out how to fully support validating the XML documents with DTD and schema files. I don't know if we want to preclude that from this entirely. If the document is structured so that we sign everything below the DTD line, then it would probably work.

Allowing an unsigned DTD is an absolutely horrible idea. Inline DTDs are the majority of XML attack surface, which is why this specification explicitly forbids them. External DTDs and/or schemas are fine, but the signature checking MUST happen before any XML parsing, to prevent bugs in the XML parser from being exploitable by someone who does not have the secret part of a trusted key.

Hell no to inline DTDs. They'd be externally referenced ones. We can guarantee what the starting structure looks like to avoid XML parsing before signature parsing, but I want the signature to be part of the schema itself too.

Is this backwards compatible? I chose a processing instruction because I presumed that existing code would ignore it. I would be fine with the prelude being <?xml version="1.0" encoding="UTF-8"?><metadata signature-version="0x0000001" signature=", for example.

We need some way to validate the correctness of the document before pulling values out of the document, or otherwise we would be permitting malformed input into something that is verifying signatures.

This specification deliberately does not require that the data being signed is well-formed XML, and that is intentional. Libxml2 is a large library with substantial attack surface, so one of my goals when writing this specification is to ensure that a document is not passed to libxml2 until the signature is verified. That’s why the specification makes no reference to the XML standard. In fact, it could theoretically be used to sign data that is not XML, although I cannot see a practical reason to do so.

However, we want it to be well-formed XML.

From my perspective, that is a separate concern.

The signature algorithm MUST be at least 128 bits in strength. SHA-1, MD5, and RIPEMD160 are thus expressly forbidden and MUST be rejected.

We have to be careful here. Specifying specific algorithms to reject will get us into thorny messes later, as we did with MD4->MD5, MD5->SHA1, and now SHA1->SHA2.

The purpose of this restriction is to impose a mandatory minimum strength on all implementations; I expect that the minimum may well be raised in the future. SHA-1, MD5, and RIPEMD160 are all rejected because they are too weak, not because they are listed explicitly. I included them as examples only.

That is not something that we specify at this level, though. We must never make it so that a consumer can't read something that already exists in the wild by default. We can certainly start off with strong algorithms up front, but keep in mind, we can literally never reject algorithms that change from being considered strong to being considered weak by default.

We need to be able to parse existing data forever, but we cannot pretend that a signature has any meaning after the underlying cryptography is broken. If we really cannot ever disable an algorithm even after it is broken, then my response will be to require SPHINCS, probably combined with Ed25519 and CRYSTALS-DILITHIUM for good measure. That means > 41KiB signatures, and I don’t think you want that.

@Conan-Kudo
Member

Conan-Kudo commented Aug 9, 2021

If we really cannot ever disable an algorithm even after it is broken, then my response will be to require SPHINCS, probably combined with Ed25519 and CRYSTALS-DILITHIUM for good measure. That means > 41KiB signatures, and I don’t think you want that.

You can disable it if you want to, but we cannot do that by default. That's the difference. And no, I'm not crazy enough to accept something like SPHINCS.

@DemiMarie

If we really cannot ever disable an algorithm even after it is broken, then my response will be to require SPHINCS, probably combined with Ed25519 and CRYSTALS-DILITHIUM for good measure. That means > 41KiB signatures, and I don’t think you want that.

You can disable it if you want to, but we cannot do that by default. That's the difference. And no, I'm not crazy enough to accept something like SPHINCS.

Does that mean that RPM will always have broken crypto by default, or merely that it is subject to system-wide crypto policies?

@Conan-Kudo
Member

Conan-Kudo commented Aug 10, 2021

The latter, generally. Though right now, RPM currently defaults to ignoring system-wide crypto policies specifically to allow old packages to work. My understanding is that this will change very soon as @pmatilai is working on deprecating and disabling legacy stuff by default now.

@DemiMarie

The latter, generally. Though right now, RPM currently defaults to ignoring system-wide crypto policies specifically to allow old packages to work. My understanding is that this will change very soon as @pmatilai is working on deprecating and disabling legacy stuff by default now.

Following system-wide crypto policies is good enough for me.

@pmatilai
Member

Though right now, RPM currently defaults to ignoring system-wide crypto policies specifically to allow old packages to work. My understanding is that this will change very soon as @pmatilai is working on deprecating and disabling legacy stuff by default now.

I have no idea what that is supposed to mean.

Rpm doesn't default to ignoring, it simply doesn't know anything about any system-wide crypto policies and it's not configurable either. If an enforcing system-wide policy such as FIPS denies, say, SHA1 from being used, rpm will limp on the best it can. This is why the enforcing policy in rpm is the strange mess it is.

@ktdreyer
Author

I think it would be simplest to have inline-signed GPG message headers around the XML, like
http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease

Then the same tools would be able to verify both files.

@ktdreyer
Author

And it's simple to generate signatures with standard tools

@Conan-Kudo
Member

@ktdreyer Yeah, that could work, we just need to teach everything about a repomd+gpgasc.xml file that is set up that way.

@DemiMarie

@ktdreyer Yeah, that could work, we just need to teach everything about a repomd+gpgasc.xml file that is set up that way.

APT has had lots of security problems with inline signatures in the past, to the point that they wound up writing their own parser for them. I do not recommend this approach, but if we implement it, we should be as strict as possible:

  • CR line endings are forbidden; only LFs are allowed. Any CR in the XML must be XML-entity escaped
  • Lines that begin with - are not allowed, except for the -----BEGIN and -----END lines and dash-escaped lines (which start with "- ", a dash followed by a space). Dash-escaped lines are only allowed if the XML can validly have a line that starts with -, which I am not sure it can.
  • Lines before -----BEGIN PGP SIGNED MESSAGE----- or after -----END PGP SIGNATURE----- are not allowed.
  • A single Hash: header is required, and it must match the hash algorithm in the signature. Other headers are forbidden. Alternatively, no headers are allowed.
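
The rules above can be sketched as a byte-level pre-check that runs before any OpenPGP or XML code sees the file. This is only an illustration under assumptions: the function name `check_strict_cleartext` is hypothetical, the variant with exactly one Hash: header is chosen, and comparing that header against the hash algorithm inside the signature is left to the caller.

```python
BEGIN_MSG = b"-----BEGIN PGP SIGNED MESSAGE-----"
BEGIN_SIG = b"-----BEGIN PGP SIGNATURE-----"
END_SIG = b"-----END PGP SIGNATURE-----"


def check_strict_cleartext(data: bytes) -> str:
    """Apply the strict framing rules and return the Hash: header value."""
    if b"\r" in data:
        raise ValueError("CR bytes are forbidden")
    if not data.endswith(b"\n"):
        raise ValueError("file must end with a single LF")
    lines = data[:-1].split(b"\n")
    # Nothing may come before the first marker or after the last one.
    if lines[0] != BEGIN_MSG or lines[-1] != END_SIG:
        raise ValueError("unexpected bytes before or after the signed message")
    # Exactly one header, Hash:, followed by the empty separator line.
    if len(lines) < 3 or not lines[1].startswith(b"Hash: ") or lines[2] != b"":
        raise ValueError("exactly one Hash: header is required")
    hash_name = lines[1][len(b"Hash: "):]
    try:
        sig_start = lines.index(BEGIN_SIG, 3)
    except ValueError:
        raise ValueError("missing signature block") from None
    # In the signed text, a leading '-' is only legal as the "- " escape.
    for line in lines[3:sig_start]:
        if line.startswith(b"-") and not line.startswith(b"- "):
            raise ValueError("line starts with an unescaped dash")
    # Inside the armor, no line may start with '-' at all.
    for line in lines[sig_start + 1:-1]:
        if line.startswith(b"-"):
            raise ValueError("unexpected marker inside the signature block")
    return hash_name.decode("ascii")
```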

@DemiMarie

On further thought, I also dislike inline signatures because they require converting LF to CRLF during verification.

@DemiMarie

Another approach, which makes generating the signatures nearly as easy, is to stuff the base64-encoded (not armored!) signature in a comment between the XML declaration and the start of the file.

@ktdreyer
Author

ktdreyer commented Mar 2, 2022

CentOS' publishing process broke again this week: https://pagure.io/centos-infra/issue/550 . Users have to disable repo_gpgcheck to continue.

This experience shows that a MitM attacker simply needs to delete the detached repomd.xml.asc file from mirrors or block access via HTTP, etc. Users that are accustomed to expect 404 errors are trained to disable repo_gpgcheck.

Please take the CentOS team's hard-won experience here as feedback that inline-signed repomd files are important and necessary.

@DemiMarie

CentOS' publishing process broke again this week: https://pagure.io/centos-infra/issue/550 . Users have to disable repo_gpgcheck to continue.

This experience shows that a MitM attacker simply needs to delete the detached repomd.xml.asc file from mirrors or block access via HTTP, etc. Users that are accustomed to expect 404 errors are trained to disable repo_gpgcheck.

Please take the CentOS team's hard-won experience here as feedback that inline-signed repomd files are important and necessary.

Indeed they are. I should be able to create a format specification shortly. Where should I post it for review? I would strongly prefer to avoid the classic cleartext signature format for numerous reasons, not least of which is that it is not compatible with existing parsers as it is not well-formed XML.

@Conan-Kudo
Member

It actually doesn't matter what the format is. The problem with CentOS is that GPG signing the repositories is done by hand by @bstinsonmhk. Red Hat's tooling simply does not support signing repositories, which has been a perennial complaint by many folks over the years.

@DemiMarie

Do they use robosignatory?

@Conan-Kudo
Member

I am not sure, but I don't think so. @bstinsonmhk would know.

@DemiMarie

Is their tooling free software?

@Conan-Kudo
Member

I assume so, I just don't know what it is.

@tyll

tyll commented Mar 2, 2022

This experience shows that a MitM attacker simply needs to delete the detached repomd.xml.asc file from mirrors or block access via HTTP, etc. Users that are accustomed to expect 404 errors are trained to disable repo_gpgcheck.

As long as they only delete the detached signature, no harm is done. If they start changing the repository contents, how would an inline signature stop them? They could simply remove it, too, so that users disable the verification.

@ktdreyer
Author

ktdreyer commented Mar 3, 2022

As long as they only delete the detached signature, no harm is done. If they start changing the repository contents, how would an inline signature stop them? They could simply remove it, too, so that users disable the verification.

Right, that's what I mean. It's like how users default to setenforce 0.

@Conan-Kudo
Member

Note that even with an inline signature format, we'd still have to offer the legacy form too, just like Debian does for APT repositories.

@tyll

tyll commented Mar 7, 2022

As long as they only delete the detached signature, no harm is done. If they start changing the repository contents, how would an inline signature stop them? They could simply remove it, too, so that users disable the verification.

Right, that's what I mean. It's like how users default to setenforce 0.

I don't follow. How are inline signatures helping here?

@ktdreyer
Author

What I mean is that users are implicitly trained to simply disable repo_gpgcheck at the first hint of problems. "That blinking red light? Oh that happens all the time; just smash it with the hammer and it'll be fine".

@stewartsmith
Contributor

Would a simpler approach to this problem be something like "to fetch the signature of a fetched repomd.xml, request repomd.asc-A-B", where A is the SHA256 hash of the repomd.xml file, and B is the GPG key ID?

The upload has to be somewhat ordered anyway, as you need all the repo metadata to hit the mirror before repomd.xml, and thus ordering the signature to hit before repomd.xml should fall into the same category. It also means that any replacement of repomd.xml is atomic, as the signature is always going to be in the same place for the same content.

It doesn't solve the "users are conditioned to just disable repo_gpgcheck at the drop of a hat" problem, but it does keep all the file formats very simple. If someone can MiTM the connection or alter content on the mirror, as long as a non-signed version is supported, I don't think it's possible to not have (some) users default to this disable-the-check behavior.
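
A small sketch of how a client could derive that file name after fetching repomd.xml. The hex encoding of the hash, the upper-cased key ID, and the function name `signature_name` are assumptions for illustration; the proposal above only fixes the general repomd.asc-A-B shape.

```python
import hashlib


def signature_name(repomd: bytes, key_id: str) -> str:
    """Build the content-addressed signature file name for a fetched repomd.xml."""
    # A: SHA256 of the exact bytes that were downloaded (hex is an assumption).
    digest = hashlib.sha256(repomd).hexdigest()
    # B: the GPG key ID the client is configured to trust.
    return f"repomd.asc-{digest}-{key_id.upper()}"
```

Because the name depends only on the content and the key, a signature uploaded before repomd.xml is never overwritten by the signature for a different revision.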

@Conan-Kudo
Member

We don't know what the GPG key ID is until we fetch the asc file, so that doesn't quite work. And I think the goal is to fetch the repomd.asc file before fetching the repomd.xml file.

@stewartsmith
Contributor

Hrrrm.... there is the gpgkey configuration option for repos (including metadata), so we should have an idea of what ones we'd validate before fetching.

@ppisar

ppisar commented Jan 8, 2024

repomd.asc-A-B where A is the SHA256 hash of the repomd.xml file, and B is the GPG key ID

The A field should carry a type of the hash in addition to the hash value. That helps with changing the algorithm.

@bmwiedemann

bmwiedemann commented May 31, 2024

The A field should carry a type of the hash in addition to the hash value. That helps with changing the algorithm.

https://github.com/multiformats/multihash can do that - just add a 1220 in front of the hash to denote a 32-byte (256 bit) sha2-256 hash.
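
As a sketch of that prefixing in Python: 0x12 is the multihash code for sha2-256 and 0x20 is the digest length in bytes, which together give the 1220 prefix. The helper name `multihash_sha256_hex` is hypothetical.

```python
import hashlib


def multihash_sha256_hex(data: bytes) -> str:
    """Return the hex multihash of data using sha2-256."""
    digest = hashlib.sha256(data).digest()
    # Function code 0x12 (sha2-256), then the digest length, then the digest.
    return (bytes([0x12, len(digest)]) + digest).hex()
```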

The combined xml+sig would have the advantage that only a single HTTP-request is needed, reducing latency.
However, we need to ensure that no unsigned parts can leak into the XML-processing. One way to do that could be cat repomd.xml | base64 | gpg --clearsign (or use basenc --base16 for easier decoding)

http://www.zq1.de/~bernhard/linux/repodata/repomd.xml+sig.txt has an example.

The alternative is to use a versioned detached sig using either the hash or revision value of the repomd.xml to ensure consistency of caches.

@bmwiedemann

I think we need both: versioned metadata (except for the main repomd.xml content) to fix caches, and retry logic to handle the case of a repo update in the middle of an action.
Another approach to the second problem would be to modify the process of publishing a repo. It would have 3 stages:

  1. push new files
  2. replace files atomically
  3. delete old files after a while.
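
The three stages can be sketched for a single local directory as follows. This is a simplified illustration, assuming uniquely named metadata files and a POSIX filesystem where a rename within one directory is atomic; mirrors and CDNs add propagation delays that it does not model. The names `publish` and `prune` are hypothetical.

```python
import os
from pathlib import Path


def publish(repo_dir: Path, new_files: dict[str, bytes], repomd: bytes) -> None:
    """Stages 1 and 2: push new files, then switch repomd.xml atomically."""
    repodata = repo_dir / "repodata"
    repodata.mkdir(parents=True, exist_ok=True)
    # Stage 1: new metadata files get new names, so the old repomd.xml
    # keeps pointing at files that still exist.
    for name, content in new_files.items():
        (repodata / name).write_bytes(content)
    # Stage 2: write under a temporary name, then rename over the old file.
    tmp = repodata / "repomd.xml.tmp"
    tmp.write_bytes(repomd)
    os.replace(tmp, repodata / "repomd.xml")


def prune(repo_dir: Path, keep: set[str]) -> list[str]:
    """Stage 3: after a grace period, delete files that are no longer referenced."""
    removed = []
    for path in sorted((repo_dir / "repodata").iterdir()):
        if path.name != "repomd.xml" and path.name not in keep:
            path.unlink()
            removed.append(path.name)
    return removed
```

Clients that fetched the previous repomd.xml just before the switch can still download everything it references until `prune` runs.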

@stewartsmith
Contributor

As a first step, documenting the existing repository metadata format isn't a bad idea.

I've fallen down that rabbit hole recently, and have started with updateinfo, see rpm-software-management/dnf5#1523

When writing the schema for updateinfo, it became apparent that no two Linux distributions use the same tooling to produce it. I know of a few implementations of createrepo-like systems that produce the rest of the repository metadata, so I'm guessing I'll find more surprises/quirks along the way.

It would be great if we don't end up with even more quirks spread around the distros :)

@Conan-Kudo Conan-Kudo transferred this issue from rpm-software-management/librepo Jul 30, 2024
@Conan-Kudo
Member

I moved my rpm-metadata repo to the rpm-software-management github organization and transferred this issue to this repository.

@DemiMarie

If we need to use OpenPGP, we could use something like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?metadata-signature -----BEGIN PGP SIGNATURE-----

base64 data, exactly 72 octets per line
-----END PGP SIGNATURE-----?>

where

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?metadata-signature -----BEGIN PGP SIGNATURE-----

and

-----END PGP SIGNATURE-----?>

are checked to be exact binary literal strings, before checking the signature and before parsing the XML.
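
A minimal sketch of that literal-bytes check in Python. It only separates the armored signature from the rest of the file using byte comparisons; which bytes the signature covers is not spelled out above, so the sketch leaves that decision and the actual verification to the caller. The function name `split_pi_signed` is hypothetical.

```python
import re

HEADER = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    b"<?metadata-signature -----BEGIN PGP SIGNATURE-----\n"
)
FOOTER = b"-----END PGP SIGNATURE-----?>"
# Armor body: base64 characters, '=' and LF only, so it cannot contain '?>'.
ARMOR_BODY = re.compile(rb"[A-Za-z0-9+/=\n]+")


def split_pi_signed(data: bytes) -> tuple[bytes, bytes]:
    """Return (armored_signature, remaining_bytes) without parsing any XML."""
    if not data.startswith(HEADER):
        raise ValueError("file does not start with the required literal bytes")
    end = data.find(FOOTER, len(HEADER))
    if end == -1:
        raise ValueError("missing signature trailer")
    body = data[len(HEADER):end]
    if not ARMOR_BODY.fullmatch(body) or not body.endswith(b"\n"):
        raise ValueError("malformed signature body")
    # Rebuild a plain armored block that an OpenPGP implementation can read.
    armored = (
        b"-----BEGIN PGP SIGNATURE-----\n" + body + b"-----END PGP SIGNATURE-----\n"
    )
    return armored, data[end + len(FOOTER):]
```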

@jcpunk
Contributor

jcpunk commented Jul 30, 2024

It is a bit crazy, but has anyone looked at https://en.wikipedia.org/wiki/XML_Signature ?

@DemiMarie

It is a bit crazy, but has anyone looked at https://en.wikipedia.org/wiki/XML_Signature ?

NACK to anything that uses XML Signature. It has absolutely massive attack surface and is incredibly difficult to implement correctly. The complexity of XML Signature has resulted in many vulnerabilities.

If OpenPGP must be used, I strongly recommend wrapping the signature in a comment or processing instruction. If OpenPGP can be replaced, use OpenSSH signatures instead.

@ppisar

ppisar commented Jul 31, 2024

I strongly recommend wrapping the signature in a comment or processing instruction.

NACK. That depends on physical XML representation (whitespaces, end-of-lines, encoding, entities). Many XML parsers (e.g. libxml2) do not provide reliable access to that level.

@ppisar

ppisar commented Jul 31, 2024

Please use standardized formats. Either in-line PGP signatures, or CMS/PKCS#7.

However, I'm not persuaded it's worth it: If repomd.xml and the detached signature are out of sync, then files referred from repomd.xml, e.g. primary.xml, can also be out of sync, as well as RPM packages linked from them. At any rate, the repository is in an inconsistent state and a consumer should avoid it and use a different mirror.

@ktdreyer
Author

ktdreyer commented Aug 1, 2024

then files referred from repomd.xml, e.g. primary.xml, can also be out of sync

In practice, it's far more likely that the detached GPG signature file is out of sync compared to all the other files. Ordinarily the signing process is completely distinct from the createrepo process (ideally the key material is highly secure and isolated). See my note above about https://pagure.io/centos-infra/issue/550 as just one example of this happening. It's happened multiple times over the years with CentOS. See the CentOS devel list archives for other examples.

@bmwiedemann

In openSUSE, the signing happens before the repo is pushed to the main download/rsync server. Inconsistency happens mostly due to CDN/caching proxies when expiry/purging does not happen in a timely manner.

@ppisar

ppisar commented Aug 1, 2024

In practice, it's far more likely that the detached GPG signature file is out of sync compared to all the other files.

Likely in the CentOS case. Not in general.

Ordinarily the signing process is completely distinct from the createrepo process (ideally the key material is highly secure and isolated). See my note above about https://pagure.io/centos-infra/issue/550 as just one example of this happening. It's happened multiple times over the years with CentOS. See the CentOS devel list archives for other examples.

If CentOS publishes a broken repository, it's a bug in the CentOS release process that needs to be fixed there.

@Conan-Kudo
Member

In openSUSE, the signing happens before the repo is pushed to the main download/rsync server. Inconsistency happens mostly due to CDN/caching proxies when expiry/purging does not happen in a timely manner.

Well, no, this isn't the real reason why. The real reason inconsistency "rarely" happens is because Zypper fetches metalinks by requesting them through a header when doing a normal baseurl request (which only Zypper supports and there's a request to support it in librepo: rpm-software-management/librepo#251).

This has a couple of favorable consequences:

  1. We always get the repomd.xml and repomd.xml.asc from the main server, which avoids the issue of files being out of date or out of sync
  2. We can have mirror selection for every file retrieved for everything else at their own mirror sync rate

I know that this is the reason why because without this feature, the openSUSE mirrors are incredibly inconsistent and broken. Mock grew some special handling for this: rpm-software-management/mock@6aee925

@DemiMarie

I strongly recommend wrapping the signature in a comment or processing instruction.

NACK. That depends on physical XML representation (whitespaces, end-of-lines, encoding, entities). Many XML parsers (e.g. libxml2) do not provide reliable access to that level.

My goal is to not use an XML parser before validating the signature. Instead, ignore the fact that the document is XML, and pretend it is in a strange binary format instead. The attack surface of an XML parser is too high to use it on untrusted input.

Trying to sign an XML infoset is a terrible idea. Verify the raw bytes, then parse as XML. The only purpose of the comment or processing instruction is for backwards compatibility. If that is not needed because the inline-signed file is at a separate URL, then the signature can just come before the XML and be stripped before verification.
