[FEATURE] Support Zstandard #519
Comments
+1, this would let us get the benefits of both zstandard and lazy loading. Otherwise we have to give up one feature for the other.
Is there any fundamental limitation blocking zstd support? It's reasonably widespread nowadays and I assume the target audience of SOCI would very much welcome it (to get even shorter container boot times).
The limitation is that zstandard needs a lot more state. Gzip uses a 32 KiB window, so in order to resume from the middle of the file, you only need the previous 32 KiB of uncompressed data. zstandard uses variable sized windows: the RFC (https://datatracker.ietf.org/doc/html/rfc8878#name-window-descriptor) recommends that decoders support windows of at least 8 MB, and the format allows windows up to 3.75 TB. Facebook's implementation supports up to 2GB windows https://engineering.fb.com/2018/12/19/core-infra/zstandard/.

Right now we divide images into 4 MiB spans that can be independently decompressed. With 32 KiB/span, the compression state is < 1% of the image size. If we did the same for zstd and it used 8MB windows, the index would be 2x the compressed image size 🙃. So we probably can't build a general purpose index for zstandard files that's smaller than the compressed file.

I think we need to do analysis on container images in the wild to see what sort of window sizes are used - maybe they're small enough that SOCI could still be useful. Or maybe there's some way to compress the compression state to make this all work.

Alternatively, there is the zstd seekable format: https://github.com/facebook/zstd/blob/v1.5.6/contrib/seekable_format/zstd_seekable_compression_format.md. It works pretty much the same way as stargz (https://github.com/containerd/stargz-snapshotter/blob/main/docs/estargz.md), which means it requires either conversion or build time support.

Overall, zstandard is harder to index than gzip, and indexing it efficiently isn't possible in all cases. We're interested in hearing more use cases to help get this work prioritized. If you can share any public zstd images that you're interested in, that would be helpful when we do feasibility analysis.
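To make the numbers above concrete, here is a minimal Python sketch of the overhead arithmetic, plus a toy demonstration of resuming a DEFLATE stream from only the previous 32 KiB of uncompressed data. It is an illustration under stated assumptions, not SOCI's actual code: `checkpoint_overhead` is a hypothetical helper, "8MB" is read as 8 MiB, and `Z_SYNC_FLUSH` is used only to force a byte-aligned resume point (real gzip files generally need a bit offset as well).

```python
import zlib

KIB = 1024
MIB = 1024 * KIB


def checkpoint_overhead(window_bytes: int, span_bytes: int = 4 * MIB) -> float:
    """Worst-case state stored per span, as a fraction of the span size.

    Hypothetical helper for illustration; not part of SOCI.
    """
    return window_bytes / span_bytes


gzip_overhead = checkpoint_overhead(32 * KIB)  # 32 KiB window per 4 MiB span
zstd_overhead = checkpoint_overhead(8 * MIB)   # assumes "8MB" means 8 MiB

# Toy resume demo on a raw DEFLATE stream (wbits=-15, no gzip header).
part1 = b"".join(str(i).encode() for i in range(20000))
part2 = part1[5000:60000] + b"tail"  # overlaps part1, so back-references are likely

comp = zlib.compressobj(wbits=-15)
# Z_SYNC_FLUSH ends the current block on a byte boundary without
# resetting the compressor's history, so chunk2 may refer back into part1.
chunk1 = comp.compress(part1) + comp.flush(zlib.Z_SYNC_FLUSH)
chunk2 = comp.compress(part2) + comp.flush(zlib.Z_FINISH)

# The only state kept for the resume point: the last 32 KiB of output.
window = part1[-32 * KIB:]
resumed = zlib.decompressobj(wbits=-15, zdict=window)
restored = resumed.decompress(chunk2)

print(f"gzip state per span: {gzip_overhead:.2%}")
print(f"zstd state per span: {zstd_overhead:.0%}")
print("resumed output matches:", restored == part2)
```

The second half decompresses `chunk2` without ever reading `chunk1`, which is the property a per-span index relies on. For zstd the equivalent saved state could be as large as the window itself, which is where the 2x figure comes from.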
Thanks! Your comment gives a lot of clarity about the trade-offs involved here. I'll stick to the zstd-variant of eStargz for the time being and see how far that gets us.
Description
Zstandard is an alternative compression format to gzip that gets faster compression/decompression speeds for the same compression ratio. The containerd community is in the early phases of adoption, but we already have runtime support in containerd and build time support through buildkit.
Customers are reporting both smaller images and faster launches.
Describe the solution you'd like
We should consider implementing ZInfo for zstandard so that customers can get both the speedup benefit from zstandard and lazy loading.
Describe any alternative solutions/features you've considered
No response
Any additional context or information about the feature request
The OCI Image-spec added zstd as a layer media type suffix in 2019: opencontainers/image-spec#788
Containerd has supported running zstd images since 2020: containerd/containerd#4809
Blog posts about the speedups:
https://aws.amazon.com/blogs/containers/reducing-aws-fargate-startup-times-with-zstd-compressed-container-images/