Support remotely-mountable layers for speeding up image distribution #956
Related:
The following design description is stale. See #956 (comment) for the latest design of this patch.
Recently I read through the code and came up with a strategy for enabling remote-snapshotter-like functionality in libpod.
The basic low-level idea is that the graphdriver manages "remotely-mountable" layers as well as the contents pulled by libpod. When it comes to higher-level components including `containers/image`, some of the key factors should be:

The following is the big picture of the design. I don't think it's perfect and we may need further discussion, so please tell me if I'm missing anything. It is also based on the stargz requirements, so I would like to get opinions from the CVMFS people as well.
For lower-level part of the design, please refer to containers/storage#644.
Propagating some additional layer information
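A minimal sketch of this propagation step, which `copy.Image` performs before calling `TryReusingBlob`. `BlobInfo` here is a local stand-in for the containers/image type, modeling only two fields. The annotation key `imageRefAnnotation` and the helper `withImageReference` are hypothetical names chosen for illustration; the patch may use different ones.

```go
package main

// BlobInfo is a local stand-in for the containers/image type of the same
// name. Only the fields needed for this sketch are modeled.
type BlobInfo struct {
	Digest      string
	Annotations map[string]string
}

// imageRefAnnotation is a hypothetical annotation key, not one defined
// by the patch.
const imageRefAnnotation = "example.org/image-reference"

// withImageReference returns a copy of info whose Annotations also carry
// the image reference. The caller's map is copied, not mutated, so the
// original BlobInfo can still be used elsewhere unchanged.
func withImageReference(info BlobInfo, imageRef string) BlobInfo {
	annotations := make(map[string]string, len(info.Annotations)+1)
	for k, v := range info.Annotations {
		annotations[k] = v
	}
	annotations[imageRefAnnotation] = imageRef
	info.Annotations = annotations
	return info
}
```

The enriched value is what would then be handed to `TryReusingBlob`, so the destination sees both the layer digest and the image reference.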
Remotely mounting layers, including with stargz, requires some information in addition to the layer digest, such as the image reference. My current design enables this using `BlobInfo.Annotations`, which is passed from the `copy.Image` API to the `ImageDestination`. `copy.Image` appends some image-related information (e.g. the image reference) to the target layer's `BlobInfo.Annotations` and passes it to the `ImageDestination.TryReusingBlob` API to ask whether the layer download can be skipped.

Checking if the layer download is skippable by talking with `storage.Store`
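A rough sketch of this check, under stated assumptions. `Store` and `LayerOptions` are simplified local stand-ins (the real `Store.CreateLayer` in containers/storage takes more parameters and returns a layer), `canSkipDownload`, `fakeStore`, `errNoRemoteLayer`, and the label key `imageRefLabel` are hypothetical, and treating every result other than `ErrTargetLayerAlreadyExist` as "download normally" is an assumption of this sketch rather than something the patch specifies.

```go
package main

import "errors"

// ErrTargetLayerAlreadyExist mirrors the typed error named in the
// description; the message text is made up.
var ErrTargetLayerAlreadyExist = errors.New("target layer already exists")

// errNoRemoteLayer is a hypothetical error used only by fakeStore.
var errNoRemoteLayer = errors.New("no remotely-mountable layer found")

// imageRefLabel is a hypothetical label key.
const imageRefLabel = "example.org/image-reference"

// LayerOptions models only the new optional Labels field.
type LayerOptions struct {
	Labels map[string]string
}

// Store is a simplified stand-in for storage.Store.
type Store interface {
	CreateLayer(id, parent string, opts LayerOptions) error
}

// canSkipDownload asks the store for (id, parent) = (layer digest, "")
// and reports whether the store signalled that the layer already exists.
func canSkipDownload(s Store, layerDigest string, annotations map[string]string) bool {
	err := s.CreateLayer(layerDigest, "", LayerOptions{Labels: annotations})
	return errors.Is(err, ErrTargetLayerAlreadyExist)
}

// fakeStore is an in-memory store for illustration: it "finds" a layer
// only if the digest is known and an image reference label is present.
type fakeStore struct {
	remote map[string]bool
}

func (f fakeStore) CreateLayer(id, parent string, opts LayerOptions) error {
	if parent == "" && f.remote[id] && opts.Labels[imageRefLabel] != "" {
		return ErrTargetLayerAlreadyExist
	}
	return errNoRemoteLayer
}
```

Using a typed error keeps the `CreateLayer` call shape unchanged: callers that do not know about remotely-mountable layers simply see an error, while `storageImageDestination` can match on it with `errors.Is`.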
We are now focusing on remote mountpoint management based on the graphdriver, so the `ImageDestination` implementation we need to focus on here is `storageImageDestination`, which is based on `containers/storage`. `storageImageDestination.TryReusingBlob` checks whether the target layer is stored in the backing store. This patch extends this check to remotely-mountable layers using the `Store.CreateLayer` API. I added a new optional field, `storage.LayerOptions.Labels`. When checking for the layer's existence, `storageImageDestination` calls the `Store.CreateLayer` API to ask whether the target layer (`(id, parent) = (target layer digest, "")`) exists, with a `Labels` option that contains the information passed from `copy.Image` through `BlobInfo.Annotations`. The store implementation can use these labels to search for the target layer. If it exists, the store tells `storageImageDestination` to skip downloading this layer. In my current implementation, I introduced a typed error, `ErrTargetLayerAlreadyExist`, for this.

Committing the layer chain without diffing by talking with `storage.Store`
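A minimal sketch of this commit path with its fallback, under stated assumptions. The `Store` interface is a simplified local stand-in (the real `Diff` and `ApplyDiff` in containers/storage work with streams and extra options), and `commitLayer`, `fakeStore`, and `errOverlayFailed` are hypothetical names. Treating any result other than `ErrTargetLayerAlreadyExist` as a failed overlay is also an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTargetLayerAlreadyExist mirrors the typed error named in the
// description; the message text is made up.
var ErrTargetLayerAlreadyExist = errors.New("target layer already exists")

// errOverlayFailed is a hypothetical error used only by fakeStore.
var errOverlayFailed = errors.New("cannot overlay remotely mounted layers")

// LayerOptions models only the new optional Labels field.
type LayerOptions struct {
	Labels map[string]string
}

// Store is a simplified stand-in for storage.Store.
type Store interface {
	CreateLayer(id, parent string, opts LayerOptions) error
	Diff(from, to string) ([]byte, error)
	ApplyDiff(to string, diff []byte) error
}

// commitLayer first asks the store to overlay the remotely mounted layer
// as (id, parent) = (chain ID, parent layer). It returns true when the
// diff was skipped. Otherwise it falls back to diffing the layer stored
// under (layer digest, "") and applying the result to the chain ID.
func commitLayer(s Store, chainID, parentID, layerDigest string, labels map[string]string) (bool, error) {
	err := s.CreateLayer(chainID, parentID, LayerOptions{Labels: labels})
	if errors.Is(err, ErrTargetLayerAlreadyExist) {
		return true, nil
	}
	diff, err := s.Diff("", layerDigest)
	if err != nil {
		return false, fmt.Errorf("diffing layer %s: %w", layerDigest, err)
	}
	if err := s.ApplyDiff(chainID, diff); err != nil {
		return false, fmt.Errorf("applying diff to %s: %w", chainID, err)
	}
	return false, nil
}

// fakeStore is an in-memory store for illustration only.
type fakeStore struct {
	canOverlay bool
	diffCalls  int
	applied    map[string]string
}

func (f *fakeStore) CreateLayer(id, parent string, opts LayerOptions) error {
	if f.canOverlay {
		return ErrTargetLayerAlreadyExist
	}
	return errOverlayFailed
}

func (f *fakeStore) Diff(from, to string) ([]byte, error) {
	f.diffCalls++
	return []byte("contents of " + to), nil
}

func (f *fakeStore) ApplyDiff(to string, diff []byte) error {
	if f.applied == nil {
		f.applied = map[string]string{}
	}
	f.applied[to] = string(diff)
	return nil
}
```

With `canOverlay` set, `commitLayer` returns without ever calling `Diff`, which is the property that matters for remotely-mountable layers; with it unset, the same call goes through the ordinary diff-and-apply steps.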
Calling the `Store.Diff` API is another thing we want to avoid for remotely-mountable layers because it may cause the whole blob to be downloaded to the store.

When committing the layer chain in the `storageImageDestination.Commit` API, it generally gets the target layer contents by calling the `Store.Diff` API on `(id, parent) = (target layer digest, "")`, which we want to avoid for remotely-mountable layers. This patch enables `storageImageDestination` to omit the diffing by leveraging the same semantics of the `Store.CreateLayer` API as mentioned above. `storageImageDestination` calls `Store.CreateLayer` with the arguments `(id, parent) = (chain id, the parent layer)` and the annotations (labels) used during `TryReusingBlob`. It then expects the backing store to internally overlay the remotely mounted layers. If that succeeds, the store tells `storageImageDestination` to skip diffing this layer. Even if the overlaying fails, we can fall back to the normal steps (diffing and applying), because `TryReusingBlob` has already made sure that the chain `(id, parent) = (target layer digest, "")` exists in the backing store.

TODOs