Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support zstd compression in image commit #5452

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

aaronlehmann
Copy link
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Without this change, specifying Compression: imagebuildah.Zstd in imagebuildah's `BuildOptions fails, so it is not possible to push cache to a registry with zstd compression.

https://github.com/opencontainers/image-spec/blob/main/media-types.md defines a media type for zstd-compressed layers, so I don't think the comment about lack of a defined media type is still applicable.

How to verify it

Specify Compression: imagebuildah.Zstd when building an image using imagebuildah.BuildDockerfiles,.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

n/a

Does this PR introduce a user-facing change?

None

Without this change, specifying `Compression: imagebuildah.Zstd` in
`imagebuildah`'s `BuildOptions fails, so it is not possible to push
cache to a registry with zstd compression.

https://github.com/opencontainers/image-spec/blob/main/media-types.md
defines a media type for zstd-compressed layers, so I don't think the
comment about lack of a defined media type is still applicable.

Signed-off-by: Aaron Lehmann <[email protected]>
@openshift-ci openshift-ci bot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 1, 2024
Copy link
Contributor

openshift-ci bot commented Apr 1, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aaronlehmann
Once this PR has been reviewed and has the lgtm label, please assign lsm5 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@aaronlehmann
Copy link
Contributor Author

aaronlehmann commented Apr 1, 2024

If a test is required for this change, I'd appreciate suggestions about how to best add test coverage. It doesn't seem possible to test via the CLI because the Compression parameter isn't wired up as a CLI flag.

	if c.Flag("compress").Changed {
		logrus.Debugf("--compress option specified but is ignored")
	}

	compression := define.Gzip
	if iopts.DisableCompression {
		compression = define.Uncompressed
	}

@rhatdan
Copy link
Member

rhatdan commented Apr 2, 2024

@giuseppe @mtrmac @nalind PTAL

@rhatdan
Copy link
Member

rhatdan commented Apr 2, 2024

@flouthoc PTAL

// how to decompress them, we can't try to compress layers with zstd.
return "", "", errors.New("media type for zstd-compressed layers is not defined")
omediaType = v1.MediaTypeImageLayerZstd
dmediaType = "application/vnd.docker.image.rootfs.diff.tar.zstd"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by comment: I'm uncomfortable with hardcoded magic strings. Shouldn't this be defined as a constant in c/image?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if this should happen, it would be in c/image/v5/manifest.

(Buildkit does seem to use this constant, but it is not a part of the public API.)

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must refuse to create v2s2 images with zstd compression.

AFAICS that means that the whole of NewImageSource needs to be restructured around respecting i.preferredManifestType, instead of always building data in both formats.

There may be various ways to do it:

  • Add some kind of “manifest type” abstraction, and call it everywhere the code currently computes both values
  • Continue building both formats all the time, but somehow represent “this format failed”, and check that at the very end (might be less invasive, but also rather more convoluted)
  • Maybe some other design.

@@ -155,9 +155,9 @@ func computeLayerMIMEType(what string, layerCompression archive.Compression) (om
// how to decompress them, we can't try to compress layers with xz.
return "", "", errors.New("media type for xz-compressed layers is not defined")
case archive.Zstd:
// Until the image specs define a media type for zstd-compressed layers, even if we know
// how to decompress them, we can't try to compress layers with zstd.
Copy link
Contributor

@mtrmac mtrmac Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is accurate: Docker images can’t represent Zstd images layers.

We can’t very well just silently ignore that problem and create something that doesn’t work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to defer to you and refuse to create schema2 images with zstd compression. But I'm curious to understand why this is a problem - BuildKit seems happy to create them using this media type, and they seem to work fine with Docker and kaniko (but not buildah...).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One place is our own https://github.com/containers/image/blob/1c648b405e0763472db0bcc189136ddb6a5b61c6/manifest/docker_schema2.go#L173 . In retrospect, being that strict might have been a mistake, but either way that code refusing unexpected MIME types has been there for many years.

(IIRC, Docker is not checking MIME types at all when pulling; that’s the other extreme option.)

So we would need to think about a migration mechanism, and giving users enough tools to manage it. For a format feature which is, AFAICS, undocumented. Meanwhile, “if you want to use Zstd, migrate to OCI” is a simple story that works. for all implementations.

@aaronlehmann
Copy link
Contributor Author

We must refuse to create v2s2 images with zstd compression.

AFAICS that means that the whole of NewImageSource needs to be restructured around respecting i.preferredManifestType, instead of always building data in both formats.

To keep the change relatively contained, what would you think about returning an empty dmediatype from computeLayerMIMEType, and making NewImageSource return an error if the media type corresponding to manifestType is empty. Something like:

		// Figure out if we need to change the media type, in case we've changed the compression.
		omediaType, dmediaType, err = computeLayerMIMEType(what, i.compression)
		if err != nil {
			return nil, err
		}
		// The layer media type must be defined for the type of manifest we are producing.
		if (omediaType == "" && manifestType == v1.MediaTypeImageManifest) ||
			(dmediaType == "" && manifestType == manifest.DockerV2Schema2MediaType) {
			return nil, fmt.Errorf("manifest type %q does not support selected compression format", manifestType)
		}

Refactoring with a manifest type abstraction seems like a worthy idea as well. I can't see a reason why this function is constructing both in parallel (not sure if I'm missing something). I'm not sure I'm the best person to do this refactor though, as I'm brand new to the codebase.

@mtrmac
Copy link
Contributor

mtrmac commented Apr 2, 2024

I’ll let actual Buildah maintainers to decide how this should be organized.

@aaronlehmann
Copy link
Contributor Author

@rhatdan: How would you like to proceed here?

@aaronlehmann
Copy link
Contributor Author

Hoping for some direction on where to go with this PR. I'm happy to make the changes if the maintainers have thoughts on what makes most sense.

@mtrmac
Copy link
Contributor

mtrmac commented Apr 11, 2024

My preference would be

Add some kind of “manifest type” abstraction, and call it everywhere the code currently computes both values

but that’s the option that requires more work, and I’m not authoritative to commit Buildah to that direction, nor can I spend very much time helping with that work.

@aaronlehmann
Copy link
Contributor Author

@rhatdan: Still looking for guidance on this one...

@rhatdan
Copy link
Member

rhatdan commented May 3, 2024

@giuseppe @nalind PTAL

@aaronlehmann
Copy link
Contributor Author

@giuseppe @nalind: Any thoughts on this one? I'm happy to restructure it as necessary, but I'm looking for direction on this.

@aaronlehmann
Copy link
Contributor Author

@giuseppe @nalind @rhatdan: I'd really like to get this upstreamed; can you please let me know how you'd like me to approach the change?

@giuseppe giuseppe added the No New Tests Allow PR to proceed without adding regression tests label Jun 21, 2024
@giuseppe
Copy link
Member

could you please define "application/vnd.docker.image.rootfs.diff.tar.zstd" in containers/image/v5/manifest as @mtrmac pointed out?

@mtrmac
Copy link
Contributor

mtrmac commented Jun 21, 2024

could you please define "application/vnd.docker.image.rootfs.diff.tar.zstd" in containers/image/v5/manifest as @mtrmac pointed out?

@giuseppe To this day we are explicitly rejecting unknown MIME types for v2s2 layers. (I could see an argument that that has been a mistake, but here we are anyway.) Any v2s2 image with zstd are not going to be accepted by any currently existing version of Podman.

Why should we start creating such images now, instead of using the OCI format exclusively?

@giuseppe
Copy link
Member

@mtrmac ok thanks, that makes sense.

@aaronlehmann can you use the OCI format?

@aaronlehmann
Copy link
Contributor Author

@giuseppe: Using the OCI format is completely fine for me; I don't care about the Docker format. However, the issue is that NewImageSource constructs OCI and Docker manifests simultaneously, and it's not clear what to do for the Docker format if we're using zstd compression. Do I need to refactor the code to make it possible to produce either type of manifest separately?

@aaronlehmann
Copy link
Contributor Author

@giuseppe: Any thoughts?

@giuseppe
Copy link
Member

@mtrmac do you have any suggestions for @aaronlehmann ?

@mtrmac
Copy link
Contributor

mtrmac commented Jun 28, 2024

… so we have come full circle :)

My preference is #5452 (comment) , to split the manifest creation code into per-format implementations, and to only use one instead of both.

@rhatdan
Copy link
Member

rhatdan commented Sep 13, 2024

@aaronlehmann still working on this?

@aaronlehmann
Copy link
Contributor Author

@rhatdan: Happy to come back to this if I can get some clarity over the direction to go for the solution. Would something like #5452 (comment) be merged, or do we need to completely refactor with a "manifest type" abstraction as described in #5452 (comment)?

@rhatdan
Copy link
Member

rhatdan commented Sep 16, 2024

If miloslav prefers 5452, I would go along with him. He is far smarter then me.

aaronlehmann added a commit to aaronlehmann/buildah that referenced this pull request Sep 17, 2024
Currently, NewImageSource creates a Docker schema2 manifest and an OCI
manifest at the same time. This precludes functionality that isn't
supported by both manifest types, for example zstd compression.
Refactoring this to create only the desired manifest type solves this
and also cleans up the code by separating manifest-type-specific code
into distinct implementations of a "manifest builder".

See discussion in containers#5452.
aaronlehmann added a commit to aaronlehmann/buildah that referenced this pull request Sep 17, 2024
Currently, NewImageSource creates a Docker schema2 manifest and an OCI
manifest at the same time. This precludes functionality that isn't
supported by both manifest types, for example zstd compression.
Refactoring this to create only the desired manifest type solves this
and also cleans up the code by separating manifest-type-specific code
into distinct implementations of a "manifest builder".

See discussion in containers#5452.
@aaronlehmann
Copy link
Contributor Author

Thanks. I've created #5743 to abstract the manifest handling instead of trying to build both simultaneously in NewImageSource. Once this is reviewed/merged, I can update this PR to add zstd compression support.

aaronlehmann added a commit to aaronlehmann/buildah that referenced this pull request Sep 17, 2024
Currently, NewImageSource creates a Docker schema2 manifest and an OCI
manifest at the same time. This precludes functionality that isn't
supported by both manifest types, for example zstd compression.
Refactoring this to create only the desired manifest type solves this
and also cleans up the code by separating manifest-type-specific code
into distinct implementations of a "manifest builder".

See discussion in containers#5452.

Signed-off-by: Aaron Lehmann <[email protected]>
aaronlehmann added a commit to aaronlehmann/buildah that referenced this pull request Sep 17, 2024
Currently, NewImageSource creates a Docker schema2 manifest and an OCI
manifest at the same time. This precludes functionality that isn't
supported by both manifest types, for example zstd compression.
Refactoring this to create only the desired manifest type solves this
and also cleans up the code by separating manifest-type-specific code
into distinct implementations of a "manifest builder".

See discussion in containers#5452.

Signed-off-by: Aaron Lehmann <[email protected]>
@aaronlehmann
Copy link
Contributor Author

@rhatdan: This is waiting for review of #5743 now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. No New Tests Allow PR to proceed without adding regression tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants