Support remotely-mountable layers for speeding up image distribution #644

ktock · 2020-06-09T08:24:47Z

Recently I read through the codes and came up with the strategy for enabling remote-snapshotter-like functionality in libpod. This is a discussion thread for the design of the low-level part of this functionality.

Creating layers without pulling the content means that Store.ApplyDiff API won't be called for this layer. So during the call for Store.CreateLayer API, we need to check the existence of the targeting layer content in the graphdriver. If the content exists we need to commit it immediately like done in ApplyDiff.

Creating layers without ApplyDiff

Corresponding to the commit (smaller one): d174476

To achieve this, storage.Store will need an additional calling convention with graphdrivers for asking if the targeting layer contents can be provided by graphdrivers (i.e. remotely-mountable). In this patch, I extended the semantics of Driver.Create API. As mentioned in another PR(...), additional information for that layer is passed from ImageDestination to Store.CreateLayer API using storage.LayerOptions.Labels option. Then Store.CreateLayer passes these labels to Driver.Create API using driver.CreateOpts.StorageOpt, for asking if the targeting layer is remotely-mountable.

The graphdrivers (e.g. CVMFS-graphdriver or stargz-graphdriver, etc) can search the remote mount points using passed StorageOpts. If the layer is remotely mountable, graphdrivers tell so to the store. In this patch, I introduced a typed error ErrTargetLayerAlreadyExist for this. When the store gets ErrTargetLayerAlreadyExist, it immediately registers this layer in the same way as done in Store.ApplyDiff API.

Working example of the graphdriver which supports remotely-mountable layers

Corresponding to the commit (bigger one): b139d87

As a working example, I implemented a stargz-snapshotter-based graphdriver and it's also included in this patch. This graphdriver leverages stargz snapshotter for mounting stargz layers and managing the raw contents and mountpoints. You can refer to the snapshotter implementation on https://github.com/ktock/stargz-snapshotter/tree/graphdriver.

TODOs

discussing the design of "creating layers without ApplyDiff" towards an agreement.
discussing the graphdriver implementation
adding more tests

rhatdan · 2020-06-09T14:33:15Z

@giuseppe @mtrmac @nalind @vrothberg PTAL

ktock · 2020-06-17T02:25:10Z

Based on the discussion in containers/podman#4739, I rethought the design of the remote snapshot functionality to leverage additional layer stores. What I've done here is to add a "layer discovery" functionality for the store.

Layer discovery for additional layer store

Corresponding to the commit (smaller one): 7e66198

For some filesystems including stargz-based one, recognizing all available layers and storing the exhaustive list of *store.Layer in the additional store in advance is difficult. We need something like "layer discovery" functionality here, which allows clients (e.g. *storageImageSource.TryReusingBlob) to tell the store which layers they want, with some additional information (e.g. layer digest, diffID, image reference, etc). This allows the store to discover the specified layers from remote stores and to add the corresponding *store.Layer information to the list in the additional layer store. Then the later calls to the store APIs can recognize these layers.

I've made use of store.CreateLayer API for achieving this. Clients (e.g. *storageImageSource.TryReusingBlob) can call this API with targetting layer's information (e.g. layer digest, diffID, image reference, etc) as labels. These labels will be passed to the underlying graphdriver through StorageOpts. Then this graphdriver discovers the specified layer from the remote store. If that layer exists, the graphdriver adds the corresponding layer information (*store.Layer) to the additional layer store then returns ErrTargetLayerAlreadyExist. Clients can use this layer in the later API calls and of course, can skip the pull of this layer.

Working example of the graphdriver which supports remotely-mountable layers

Corresponding to the commit (bigger one): f8dea83

As a working example, I implemented a stargz-snapshotter-based graphdriver and it's also included in this patch. This graphdriver leverages stargz snapshotter for mounting stargz layers and managing the raw contents and mountpoints. You can refer to the snapshotter implementation on https://github.com/ktock/stargz-snapshotter/tree/graphdriver.

rhatdan · 2020-06-24T12:42:51Z

@ktock Could you fix the lint issues. Most of the engineering team has been tied up in working on Podman 2.0 we hope to get back to this soon.

ktock · 2020-06-25T00:56:49Z

Thanks for your time. It seems to be Go version's issue. Do we have plan to move to a newer version of go(1.13)?

could not import github.com/containerd/containerd/errdefs (/go/src/github.com/containers/storage/vendor/github.com/containerd/containerd/errdefs/errors.go:54:16: Is not declared by package errors

rhatdan · 2020-06-25T10:08:10Z

Yes, it is time we moved to golang 1.13

Signed-off-by: Kohei Tokunaga <[email protected]>

ktock · 2020-06-29T12:50:05Z

The completions of following tests haven't been reported to this PR page but tests (except for devicemapper ones) seem to have passed 🤔

https://cirrus-ci.com/build/5193639385104384

- TEST_DRIVER:fuse-overlay VM_IMAGE:${FEDORA_CACHE_IMAGE_NAME} image:fedora-32-libpod-6508632441356288
- TEST_DRIVER:fuse-overlay VM_IMAGE:${PRIOR_FEDORA_CACHE_IMAGE_NAME} image:fedora-31-libpod-6508632441356288

I think we can at first focus on the layer discovery part (commit 56035c0), then can proceed to the graphdriver implementation part (commit f98ab81).
I included these two parts in this single PR for the ease of the discussion based on the working example but we can consider to separate the PR and have a dedicated one for the second part after the first one reached an agreement.

giuseppe · 2020-06-30T10:49:38Z

drivers/containerd/containerd.go

@@ -0,0 +1,993 @@
+// +build linux


is there any code coming from containerd and we need to maintain the copyright header?

No.
And before I waste your time...

I don't intend to get graphdriver part (f98ab81) merged into libpod immediately. At this moment, it's just an example implementation and it's incomplete at some points,

f98ab81 relies on stargz-snapshotter but it will require users to run additional daemons (=containerd+stargz-snapshotter) and it might not fit to the podman's design phirosophy?

we might need a daemonless implementation of this patch.

stargz-snapshotter is untested in the rootless world. (so the patch works only with root user)

some implementation details are imcomplete yet (e.g. the way to treat id mappings, lockfiles, ...)

The patch f98ab81 intends to make sure that "layer discovery" functionality (56035c0) is needed to make remote-snapshotter-like functionality come true in podman.
I think the deeper review (or design discussion) of the graphdriver implementation can be after the "layer discovery" part (56035c0) is done (But it's kinda chicken and egg problem...).

Please tell me if I should separate the PR.

rhatdan · 2020-08-03T13:03:20Z

@giuseppe @ktock Any movement on this PR?

giuseppe · 2020-08-04T10:11:01Z

@ktock I wonder if this would be easier to implement adding a mechanism similar to additionalStores.

Instead of expecting the store it is just the path to the exploded layers, as:

additional_remote_paths = ["/foo/bar/"] and expect the layer to be accessible at /foo/bar/$ID.

Or better, we could have it configurable and define a few substitution, e.g. additional_remote_paths = ["/foo/bar/%L"], additional_remote_paths = ["/foo/bar/%I/%L"] wher %I and %L are the checksums for the image and the layer.

Do you see any problem with this mechanism and CRFS?

ktock · 2020-08-25T16:11:39Z

@giuseppe Querying layers to registries requires at least the image reference name (e.g. docker.io/library/ubuntu) and the layer digest. So /foo/bar/%I/%L seems fine for querying a layer on the registry but %I should be the image reference name (or its base64 encoded string, etc) rather than checksum or hash of the image.

(Maybe in following-up patches => ) Furthermore, we'll need a verification functionality for each lazily-fetched chunk (stargz-snapshotter is starting to support it). The OCI digest of a layer is required for querying it on the registry but can't be used for verifying each lazily-fetched file data so we'll need to pass additional information (e.g. the OCI digest of stargz TOC contained in the manifest as a layer annotation) to the underlying filesystem, e.g. /foo/bar/%I/%L/%T where %I = (base64 encoded) image ref, %L = layer digest and %T = the digest of stargz TOC. Or we should have a more flexible way of making the filepath format configurable.

siscia · 2020-10-08T09:49:49Z

Why base64 encoded? To avoid slashes in the name?

In our experience the complex part is not really having the layers accessible, but to create all the accessory file system structure next to it, like the layer.json file and the whole directory structure necessary.

If we could get along with just providing the layers in the filesystem it would be great!

Info gather by our GSoC student: https://medium.com/@mohit2501tyagi/lets-walk-through-podman-37636cf223c5

ktock · 2020-12-26T16:00:30Z

Closing in favor of containers/podman#8837, #795 and containers/image#1109 .

ktock marked this pull request as draft June 9, 2020 08:24

This was referenced Jun 9, 2020

Support remotely-mountable layers for speeding up image distribution containers/image#956

Closed

Remote snapshotter in podman containers/podman#4739

Closed

ktock force-pushed the remote branch 4 times, most recently from 0f967c2 to f8dea83 Compare June 17, 2020 02:12

ktock added 2 commits June 29, 2020 18:55

Enable to pass additional labels to the graphdrivers from create API

56035c0

Signed-off-by: Kohei Tokunaga <[email protected]>

Add containerd-based graphdriver leveraging additional stores

f98ab81

Signed-off-by: Kohei Tokunaga <[email protected]>

ktock force-pushed the remote branch from f8dea83 to f98ab81 Compare June 29, 2020 11:40

giuseppe self-requested a review June 30, 2020 10:48

giuseppe reviewed Jun 30, 2020

View reviewed changes

siscia mentioned this pull request Aug 13, 2020

Add support for CernVM-FS graph driver #383

Closed

AkihiroSuda mentioned this pull request Oct 7, 2020

Support remote snapshotter for faster container startup #498

Closed

ktock mentioned this pull request Dec 9, 2020

pkg/compression: new zstd variant zstd:chunked containers/image#1084

Merged

ktock closed this Dec 26, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support remotely-mountable layers for speeding up image distribution #644

Support remotely-mountable layers for speeding up image distribution #644

ktock commented Jun 9, 2020 •

edited

Loading

rhatdan commented Jun 9, 2020

ktock commented Jun 17, 2020

rhatdan commented Jun 24, 2020

ktock commented Jun 25, 2020

rhatdan commented Jun 25, 2020

ktock commented Jun 29, 2020

giuseppe Jun 30, 2020

ktock Jun 30, 2020

rhatdan commented Aug 3, 2020

giuseppe commented Aug 4, 2020 •

edited

Loading

ktock commented Aug 25, 2020

siscia commented Oct 8, 2020

ktock commented Dec 26, 2020

Support remotely-mountable layers for speeding up image distribution #644

Support remotely-mountable layers for speeding up image distribution #644

Conversation

ktock commented Jun 9, 2020 • edited Loading

The following design description is stale. See #644 (comment) for the latest design of this patch.

Creating layers without ApplyDiff

Working example of the graphdriver which supports remotely-mountable layers

TODOs

rhatdan commented Jun 9, 2020

ktock commented Jun 17, 2020

Layer discovery for additional layer store

Working example of the graphdriver which supports remotely-mountable layers

rhatdan commented Jun 24, 2020

ktock commented Jun 25, 2020

rhatdan commented Jun 25, 2020

ktock commented Jun 29, 2020

giuseppe Jun 30, 2020

Choose a reason for hiding this comment

ktock Jun 30, 2020

Choose a reason for hiding this comment

rhatdan commented Aug 3, 2020

giuseppe commented Aug 4, 2020 • edited Loading

ktock commented Aug 25, 2020

siscia commented Oct 8, 2020

ktock commented Dec 26, 2020

ktock commented Jun 9, 2020 •

edited

Loading

giuseppe commented Aug 4, 2020 •

edited

Loading