Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy: Allow copying images with signatures #2590

Closed
wants to merge 1 commit into from

Conversation

ckyrouac
Copy link

@ckyrouac ckyrouac commented Oct 3, 2024

This error path was taken even when the destination supports signatures. The call to sourceSignatures() will return an error when the destination does not support signatures, so an error should not be thrown when len(sigs) > 0.


I'm not totally sure of the implications of removing this error path, but I tested this with skopeo and I am able to copy an image with a signature after removing this if statement.

This would fix containers/bootc#812

This error path was taken even when the destination supports
signatures. The call to `sourceSignatures()` will return an error
when the destination does not support signatures, so an error should not
be thrown when len(sigs) > 0.

Signed-off-by: Chris Kyrouac <[email protected]>
Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cc: @Honny1


@ckyrouac I’d recommend starting with the hypothesis that the error text reported in containers/bootc#812 is correct about the immediate cause of the error. Then describe inputs and desired outputs (WRT location, representation, compression, signature presence, signature author).

@ckyrouac
Copy link
Author

ckyrouac commented Oct 3, 2024

@mtrmac yes, I am certain the error reported in the bootc issue is correct. Behind the covers, the bootc code is simply shelling out to podman image push. The image source is user supplied and the destination is constructed by the bootc code. An example command the bootc code constructs is

podman image push "registry.redhat.io/ubi9/podman:latest" "containers-storage:[overlay@/run/bootc/storage+/proc/self/fd/3]registry.redhat.io/ubi9/podman:latest"

A similar version of this using skopeo that fails with the same error:

skopeo copy "containers-storage:[overlay@/var/lib/containers+/run/user/1000/containers]registry.redhat.io/ubi9/podman:latest" "containers-storage:[overlay@/var/lib/containers+/run/user/1000/containers]localhost/ubi9/podman:latest"

The goal is essentially a 1:1 copy of the image. For the bootc use case, this is used to "bind" an image to a bootc disk image, i.e. copy the container image from the host that is creating the bootc disk image into the bootc disk image.


That aside, if we ignore the bootc specific stuff and just look at the code in this repo, this code checks if the destination supports signatures. If the destination does not support signatures, it throws an error, if it does then the number of signatures is returned. Then immediately after, this code checks if there are any signatures and eventually throws an error. This seems like a bug, or am I missing something?

@cgwalters
Copy link
Contributor

cgwalters commented Oct 4, 2024

At a high level, it's pretty rare in general AFAIK to copy between c/storage instances. It'd be less rare if we had a nice flow for cross-root-unpriv copies - but I personally struggle to think of a use case where it'd commonly happen other than what we're doing in bootc (copying from the ambient storage into a new target disk image).

My guess as to why this is happening is because c/storage pull is inherently lossy in that we cannot guarantee we can reproduce the exact input manifest (which is what's signed) because of compression. I'll note here that ostree which was inspired by git doesn't checksum compressed output basically for this reason...maybe we could even try to get into OCI+signatures a manifest variant which allows creating an alternative signature that covers just the diffids or so... I now feel it was a big mistake on my part to not participate in the original OCI standardization process, maybe we could have done something more than just slightly tweaked and rebranded the docker schemas.

Anyways, in this instance, I don't think we need to verify the signature in theory - we pulled it, so we can trust it. So the right fix may just be to --remove-signatures in the short term.

As long as we're still running through the image policy for subsequent fetches (normally from a registry) I think that's the important bits covered.

Copy link
Contributor

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking as needs-work based on:

  • We're still debating the architecture here; it's possible that this makes sense to do, but it is security sensitive and needs some careful analysis
  • If we did decide to do this, we should have a test case at least

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 4, 2024

Then immediately after, this code checks if there are any signatures and eventually throws an error. This seems like a bug, or am I missing something?

It’s not “eventually”; it’s “eventually”, if we determine that we need to edit the manifest. And then we need to think about why are we modifying a manifest of a signed image.


My guess as to why this is happening is because c/storage pull is inherently lossy in that we cannot guarantee we can reproduce the exact input manifest (which is what's signed) because of compression.

Yes, that’s the reason-behind-the reason here.

The immediate cannotModifyManifestReason checks are correct: the signature is over the manifest which refers to the compressed representation, and changing the manifest invalidates the signature. Now, it is defined to not be an error for an image to be accompanied by a signature which does not match the manifest, but it is also not useful at all. So, this error path is making that explicit to the caller: Either arrange the copy so that the manifest can be preserved (and then the signature can be usefully copied), or use --remove-signatures (and be aware that the signatures will be dropped).


And now, why a storage-to-storage copy triggers a manifest update: because the on-registry representation is compressed, and the in-c/storage represents is uncompressed. We can’t recreate the compressed representation from the uncompressed one, so a copy from c/storage (podman image push) is, conceptually, always editing the manifest to either produce a manifest with uncompressed data, or to produce a manifest with a possibly-different, newly-compressed layer representation.

Except … if we were doing this precisely correctly in the above sense, every pull would also require --remove-signatures, because on a pull we decompress the layers and thus we should update the manifest. For better or worse, the current implementation is such that this decompression is invisible to the generic copy code, and we almost always (#1768) store the original manifest into c/storage, along with the original (~still matching) signatures — and we compensate this when reading from c/storage, by triggering a manifest update at that later time.

And that’s what’s failing here: the signed image is pulled, along with the signatures and the original manifest; but the subsequent storage-to-storage copy triggers the delayed manifest update and signature invalidation.


I don’t know what’s a good way to handle this. The “pulls don’t edit a manifest” special case might not be strictly necessary, historically https://github.com/mtrmac/image/tree/original-manifest proposed explicitly storing the original manifest + an updated one in c/storage, instead. But even in that case, we would not use “the original manifest” for any further copies from c/storage, because we can’t recreate the compressed representation.

Generally turning off the canModifyManifest check, as this PR does, would, I think, only lead to harder-to-diagnose signature validation failures down the line.

Somehow special-casing the c/storage-to-c/storage copy to always preserve the original manifest and all signatures is would be an extra special case in something related to image integrity, and it seems to me that it would effectively “inherit trust” between the two c/storage repos — an image which does not match the manifest+signatures in the source repo would also be stored not matching manifest+signatures in the destination; and the c/image code (due to layer reuse / image deduplication) strongly relies on the c/storage repo being trusted from the point of the invoking user. Generally, in skopeo copy a:… b:…, the copy is expected to fail if the image in a is inconsistent, and to only write a consistent image to b (in particular, when using a digest reference or having a signature policy at the source, and copying to a locally-trusted destination, this is a way to be guaranteed to get a consistent image to work with). Having a skopeo copy containers-storage:… containers-storage:…, instead, assume the source is fully trusted, would be totally unexpected.

I suppose it would be possible to copy the image using podman, and then manually invoke c/storage to also copy the related manifest+signatures, in a that special situation situation where the source is known to be trusted. That’s probably excessive; I there might not be that many good uses for the signatures.


Perhaps much more relevant than the wall of text above: Even with --remove-signatures, the c/storage-to-c/storage copy is going to edit the manifest to refer to uncompressed layers, and thus the digest will change, and digest references to the image will break in the destination.

@cgwalters
Copy link
Contributor

relies on the c/storage repo being trusted from the point of the invoking user.

So here's my strongly held opinion: JSON is OK, most of the OCI specs are OK. sigstore/cosign is OK. The thing that's not ok is tar. But there's a fix for the woes of tar: composefs. With the design sketched out in containers/composefs#294 (still needs to be proven) we don't need to trust tar (the diffids or the compressed version) - we only need to trust the composefs digest in the manifest - which operates BTW on uncompressed data (as it should) so there's no "compression digest" problem. Even better again as I was saying in the "local copy" case the enormous advantage of fsverity is that trust is rooted in the Linux kernel - so there's no "caller needs to trust what it's reading" if we have that - it's just dramatically better than parsing and re-checksumming a tar stream.

But anyways...

Even with --remove-signatures, the c/storage-to-c/storage copy is going to edit the manifest to refer to uncompressed layers,

Ah. Hmmm...it looks like our logically bound image tests today are just referencing images by tag, so I guess that's why we're not seeing it there, though we're also not enforcing signatures for at least one of those images.

Do you consider this a bug? It...feels like a bug. When wouldn't we want to copy the "original manifest" alongside the unpacked manifest?

Having a skopeo copy containers-storage:… containers-storage:…, instead, assume the source is fully trusted, would be totally unexpected.

Yeah, OK in ostree we made the decision that ostree pull-local (the equivalent of what's happpening here) does assume the input is trusted. But you can specify --untrusted to force ostree to re-checksum all the inputs, etc. I guess we should add podman image push --trust-source and the generalized version in skopeo copy "containers-storage:[overlay@path+runroot:trusted=true]registry.redhat.io/ubi9/podman:latest" ... (note trusted=true)?

@ckyrouac
Copy link
Author

ckyrouac commented Oct 4, 2024

@mtrmac thanks for the detailed response! I'll close this out. After I wrap my head around this more, I'll try to open a follow up PR to add some docs/comments to clear this up for anyone else reading the code.

@ckyrouac ckyrouac closed this Oct 4, 2024
@mtrmac
Copy link
Collaborator

mtrmac commented Oct 4, 2024

we only need to trust the composefs digest in the manifest - which operates BTW on uncompressed data (as it should)

OTOH it’s much less risky to sign the top-level representation; signing a derived value can dramatically expand the amount of code that needs to be hardened and used before we get a “content” digest to verify, incl. decompression at least, and maybe the composefs metadata builder.


Even with --remove-signatures, the c/storage-to-c/storage copy is going to edit the manifest to refer to uncompressed layers,

Ah. Hmmm...it looks like our logically bound image tests today are just referencing images by tag, so I guess that's why we're not seeing it there, though we're also not enforcing signatures for at least one of those images.

Do you consider this a bug? It...feels like a bug.

I think that’s a semantics question — does c/storage store “consistent images” (if so, its manifest should all refer to uncompressed data) or “a faithful representation of registry contents” (incl. original manifests). Neither of the two extremes are actually useful in practice… (we don’t store the compressed blobs, so “faithful” is not happening; “consistent images” would break digest references using on-registry digests, and make signatures unverifiable). so here we are.

It’s would be elegant to do it like containerd, and to store the original compressed representation on local disk as well. But the cost seems unreasonable to me, especially for compute nodes that never do builds from that c/storage.

When wouldn't we want to copy the "original manifest" alongside the unpacked manifest?

Without ever seeing the compressed versions and decompressing them ourselves (or, usually, having a trusted record of having done that before in BlobInfoCache), we have no way to tell whether the “original manifest” matches the layers; an attacker could attach a nice trusted signed “original” manifest to malicious contents. What does it mean to copy that along?


Yeah, OK in ostree we made the decision that ostree pull-local (the equivalent of what's happpening here) does assume the input is trusted. But you can specify --untrusted to force ostree to re-checksum all the inputs, etc. I guess we should add podman image push --trust-source and the generalized version in skopeo copy "containers-storage:[overlay@path+runroot:trusted=true]registry.redhat.io/ubi9/podman:latest" ... (note trusted=true)?

It’s not just the source — as we work with the decompressed layers, we would automatically update the manifest anyway. If we were to add an option, it would be “ignore all layer changes and use the original config+manifest in the destination” — and that would be in c/image/copy.Options, not in the transport. Probably that option would be heavily gated (do not allow compression changes? only allow when writing to c/storage? or, in the extreme, only allow a very specific copy.Options structure with no other options set at all?) but it seems … viable? I wouldn’t rule it out now — but that’s a guess, there’s some chance the implementation would be so bad that it would not be maintainable; or maybe there’s some implementation snag to the idea.

@cgwalters
Copy link
Contributor

Probably that option would be heavily gated (do not allow compression changes? only allow when writing to c/storage?

Yes, I think this is only about writing to c/storage. A lot of good discussion here, I filed #2599 to track this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to bind a signed image at install time
3 participants