Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support oci zstd compression #53

Merged
merged 6 commits into from
Jan 26, 2025

Conversation

steven-halaka
Copy link

You fork of dive is much appreciated. I've been using it but had an issue reading my oci layers compressed with zstd (type=image,force-compression=true,compression-level=3,compression=zstd). Perhaps this could be of use...

Using zstd library from buildx, attempt to uncompress zstd layers.

$ go install github.com/joschi/dive@latest
go: downloading github.com/joschi/dive v0.13.1
go: github.com/joschi/[email protected] requires go >= 1.23; switching to go1.23.4
go: downloading github.com/docker/cli v27.3.1+incompatible
go: downloading github.com/docker/docker v27.3.1+incompatible

$ dive kms:local_dev
Image Source: docker://kms:local_dev
Extracting image from docker-engine... (this can take a while for large images)
cannot fetch image
could not find 'blobs/sha256/ced9e945870abc8212a94699209e5029f80466a940ca866892032f45dafecec1' in parsed layers
(read-zstd-oci-layers)]$ go install
(read-zstd-oci-layers)]$ dive kms:local_dev
Image Source: docker://kms:local_dev
Extracting image from docker-engine... (this can take a while for large images)
Analyzing image...
Building cache...

@joschi
Copy link
Owner

joschi commented Jan 16, 2025

@steven-halaka Thanks a lot for your contribution!

There still seems to be some regression with normal Docker images such as busybox:latest.

@steven-halaka
Copy link
Author

@steven-halaka Thanks a lot for your contribution!

There still seems to be some regression with normal Docker images such as busybox:latest.

Apologies, I rushed to share as soon as my use case passed.

I'll see if I can get this cleaned up with tests passing.

@steven-halaka
Copy link
Author

It seems the regression was due to the fact that both zstd and gzip io.Reader implementations don't fully validate until Read is called. I updated the brute-force detection. Alternatively, dive could perform mimetype detection using github.com/gabriel-vasile/mimetype, but I didn't want to add new dependencies to your project:

mimetype := mimetype.Detect(buffer)
switch mimetype.String() {
case "application/gzip":
	unwrappedReader, err = gzip.NewReader(unwrappedReader)
	if err != nil {
		// Not a gzipped entry
		unwrappedReader = io.MultiReader(bytes.NewReader(buffer[:n]), tarReader)
	}
case "application/zstd":
	unwrappedReader, err = zstd.NewReader(unwrappedReader)
	if err != nil {
		// Not a zstd entry
		unwrappedReader = io.MultiReader(bytes.NewReader(buffer[:n]), tarReader)
	}
}
// fallthrough to tar/json readers

It also seems that using the containerd-snapshotter was a large component of my original issue. Without it, docker engine seems to always save/export plain tar files. A workflow job using https://github.com/depot/use-containerd-snapshotter-action might be a useful regression test addition.

I also added a bit of detection for OCI images without a manifest.json (it's manifest is different). That file is not actually part of the OCI spec, but a compatibility created by docker. Instead, dive attempts to find the config blob and uses the already collected layers to generate a compatible manifest.

$ docker buildx build -f Dockerfile.test --tag test:latest --output type=oci,dest=oci.tar,force-compression=true,compression=zstd .
[+] Building 0.4s (6/6) FINISHED                                                                                                                           docker:default
 => [internal] load build definition from Dockerfile.test                                                                                                            0.0s
 => => transferring dockerfile: 96B                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/busybox:latest                                                                                                    0.0s
 => [internal] load .dockerignore                                                                                                                                    0.0s
 => => transferring context: 134B                                                                                                                                    0.0s
 => FROM docker.io/library/busybox:latest@sha256:a5d0ce49aa801d475da48f8cb163c354ab95cab073cd3c138bd458fc8257fbf1                                                    0.1s
 => => resolve docker.io/library/busybox:latest@sha256:a5d0ce49aa801d475da48f8cb163c354ab95cab073cd3c138bd458fc8257fbf1                                              0.1s
 => CACHED [stage-0 1/1] COPY --from=busybox:latest /bin/cat /cat                                                                                                    0.0s
 => exporting to oci image format                                                                                                                                    0.1s
 => => exporting layers                                                                                                                                              0.0s
 => => exporting manifest sha256:d594fecd67c2506a62ebf0999e3b14a8a317da6dbb2cdea5f63d1cb0a9989bf1                                                                    0.0s
 => => exporting config sha256:4debdac37bf1c5483665f339316c21e9ac2bb601a44a1325a6a934d1ef03134a                                                                      0.0s
 => => sending tarball                                                                                                                                               0.0s

$ tar tvf oci.tar 
drwxr-xr-x 0/0               0 1969-12-31 19:00 blobs/
drwxr-xr-x 0/0               0 1969-12-31 19:00 blobs/sha256/
-r--r--r-- 0/0          678691 1969-12-31 19:00 blobs/sha256/3179d4230eb9b1561aa22f96125e3af6fb4cf4a91ef4532ba903db06a0ab3952
-r--r--r-- 0/0             445 1969-12-31 19:00 blobs/sha256/4debdac37bf1c5483665f339316c21e9ac2bb601a44a1325a6a934d1ef03134a
-r--r--r-- 0/0             479 1969-12-31 19:00 blobs/sha256/d594fecd67c2506a62ebf0999e3b14a8a317da6dbb2cdea5f63d1cb0a9989bf1
-rw-r--r-- 0/0             467 1969-12-31 19:00 index.json
-r--r--r-- 0/0              30 1969-12-31 19:00 oci-layout

$ tar Oxvf oci.tar blobs/sha256/3179d4230eb9b1561aa22f96125e3af6fb4cf4a91ef4532ba903db06a0ab3952 | file -
blobs/sha256/3179d4230eb9b1561aa22f96125e3af6fb4cf4a91ef4532ba903db06a0ab3952
/dev/stdin: Zstandard compressed data (v0.8+), Dictionary ID: None

Finally, here's a small suite of testing I ran locally with and without containerd-snapshotter, but I'm not sure how it best fits into the validation workflow jobs. Perhaps some new .data/ artifacts to use during unit tests? Also, make generate-test-data target using .data/Dockerfile.* doesn't seem to work.

cat >Dockerfile.test <<EOF
FROM scratch
COPY --from=busybox:latest /bin/cat /cat
EOF

for exporter in docker image; do
    for alg in uncompressed gzip estargz zstd; do
        docker image rm -f dive-validation:latest
        docker buildx build -f Dockerfile.test --tag dive-validation:latest --output type=${exporter},force-compression=true,compression=${alg} .
        go run main.go --ci dive-validation:latest
    done
done

for alg in uncompressed gzip estargz zstd; do
    docker image rm -f dive-validation:latest
    docker buildx build -f Dockerfile.test --tag dive-validation:latest --output type=oci,dest=img.tar,force-compression=true,compression=${alg} .
    go run main.go --ci --source docker-archive img.tar
done

Sorry for the wall of text, but thank you for your work on this repository.

@joschi joschi added the enhancement New feature or request label Jan 26, 2025
Copy link
Owner

@joschi joschi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steven-halaka Thanks a lot for contributing this PR and going the extra mile to get it working! ❤️

I've added some acceptance tests based on your suggestions to ensure we're not running in any regressions in the future.

@joschi joschi merged commit ed867fc into joschi:main Jan 26, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants