Merge pull request #96 from ktock/ovv-doc
Reorganize overview doc
ktock authored May 13, 2020
2 parents 90088b6 + cd49742 commit fff955f
Showing 3 changed files with 135 additions and 54 deletions.
29 changes: 5 additions & 24 deletions README.md
@@ -20,6 +20,8 @@ Stargz Snapshotter is a **non-core** sub-project of containerd.
To use stargz snapshotter on Kubernetes nodes, you need to add the following configuration to containerd and run the stargz snapshotter daemon on each node. We assume that you are using a containerd build that includes at least [commit `d8506bf`](https://github.com/containerd/containerd/commit/d8506bfd7b407dcb346149bcec3ed3c19244e3f1) as a CRI runtime.

```toml
version = 2

# Plug stargz snapshotter into containerd
# Containerd recognizes stargz snapshotter through the specified socket address.
# The address below is the default one that stargz snapshotter listens on.
@@ -72,6 +74,9 @@ $ curl 127.0.0.1:8080
Hello World!
```

Stargz snapshotter also supports further configuration including private registry authentication, mirror registries, etc.
For more details, refer to the [overview doc](./docs/overview.md).

## Creating stargz images and further optimization

For lazy pulling images, you need to prepare stargz images first. You can use [CRFS-official `stargzify`](https://github.com/google/crfs/tree/master/stargz/stargzify) command or our `ctr-remote` command which has further optimization functionality. You can also try our pre-converted images listed in [this doc](./docs/pre-converted-images.md). For more details about stargz and the optimization, refer to [this doc](./docs/stargz-estargz.md)
@@ -107,30 +112,6 @@ root@8dab301bd68d:/# ls
```console
bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
```

## Authentication

We support the following methods for private repository authentication.
- Using `DOCKER_CONFIG` or `~/.docker/config.json`
- Using Kubernetes secrets (type = `kubernetes.io/dockerconfigjson`)

Following example enables stargz snapshotter to access to private registries using `docker login` command. Stargz snapshotter gets credentials from `DOCKER_CONFIG`(or `~/.docker/config.json`).

```console
# docker login
(Enter username and password)
# ctr-remote image rpull --user <username>:<password> docker.io/<your-repository>/ubuntu:18.04
```

The following configuration enables stargz snapshotter to access private registries using Kubernetes secrets (type = `kubernetes.io/dockerconfigjson`) in the cluster using kubeconfig files. You can specify the path of the kubeconfig file to use with the `kubeconfig_path` option. The specified file does not have to exist when the snapshotter starts. In this case, the snapshotter polls the file until it is actually provided. This is useful for some environments (e.g. single node cluster with containerized apiserver) where stargz snapshotter needs to start before everything else, including booting containerd/kubelet/apiserver and configuring users/roles. If no `kubeconfig_path` is specified, the snapshotter searches for kubeconfig files in `KUBECONFIG` or `~/.kube/config`.

```toml
[kubeconfig_keychain]
enable_keychain = true
kubeconfig_path = "/etc/kubernetes/snapshotter/config.conf"
```

We don't share credentials with containerd so credentials specified by ctr's `--user` option in the above example is just for containerd's side. If you have no right to access to the repository with credentials specified to stargz snapshotter, pull operations fall back to the normal one(i.e. overlayfs).

## Project details

Stargz Snapshotter is a containerd **non-core** sub-project, licensed under the [Apache 2.0 license](./LICENSE).
Binary file modified docs/images/overview01.png
160 changes: 130 additions & 30 deletions docs/overview.md
@@ -1,59 +1,159 @@
# Containerd Stargz Snapshotter Overview

__Before read through this overview document, we recommend you to try Demo in [README](README.md) to make sure this snapshotter's functionality.__
__Before going through this overview document, we recommend reading the [README](README.md).__

Pulling image is one of the time-consuming steps in the container startup process.
In containerd community, we have had a lot of discussions to address this issue as follows,
In the containerd community, there have been many discussions on addressing this issue, including the following:

- [#3731 Support remote snapshotter to speed up image pulling](https://github.com/containerd/containerd/issues/3731)
- [#2968 Support `Prepare` for existing snapshots in Snapshotter interface](https://github.com/containerd/containerd/issues/2968)
- [#2943 remote filesystem snapshotter](https://github.com/containerd/containerd/issues/2943)

The solution for the fast image distribution is called *Remote Snapshotter* in containerd community.
This creates container's rootfs layers by directly mounting from remote stores, which is much faster than downloading and unpacking the whole image contents.
The solution for fast image distribution is called the *Remote Snapshotter* plugin.
It prepares the container's rootfs layers by mounting them directly from remote stores instead of downloading and unpacking the entire image contents.
The actual image contents can be fetched *lazily*, so runtimes can start containers before the entire image contents are locally available.
We call these remotely mounted layers *remote snapshots*.

*Stargz Snapshotter* is a remote snapshotter plugin implementation which supports standards-compatible remote snapshot functionality.
The image format that achieves it is _stargz_ by [CRFS](https://github.com/google/crfs).
Stargz format is backwards-compatible to container standards so you can push stargz-formatted images to container registries and run them using container runtimes including Docker.
When you run a container image and it is formatted as stargz image, Stargz Snapshotter automatically prepares container's rootfs layers as remote snapshots.
As an image converter command, you can use CRFS-official `stargzify` or our `ctr-remote` which has additional optimization functionality.
It leverages the [*stargz* image format by Google](https://github.com/google/crfs), which enables lazy distribution while remaining backwards-compatible with container standards.
When you run a container image that is formatted as stargz, stargz snapshotter prepares the container's rootfs layers as remote snapshots by mounting the layers from [OCI](https://github.com/opencontainers/distribution-spec)/[Docker](https://docs.docker.com/registry/spec/api/) standard registries onto the node, instead of pulling the entire image contents.

This document gives you a high-level overview of Stargz Snapshotter.
This document gives you a high-level overview of stargz snapshotter.

![overview](/docs/images/overview01.png)

## Stargz Snapshotter Proxy Plugin
## Stargz Snapshotter proxy plugin

Stargz Snapshotter is implemented as a proxy plugin of containerd.
The daemon binary is named `containerd-stargz-grpc`.
Because it runs as a standalone daemon process, you can package all dependencies of Stargz Snapshotters and filesystems into one container and deploy it on each node.
For more information of containerization, see docker-compose file in this repo.
Stargz snapshotter is implemented as a [proxy plugin](https://github.com/containerd/containerd/blob/04985039cede6aafbb7dfb3206c9c4d04e2f924d/PLUGINS.md#proxy-plugins) daemon (`containerd-stargz-grpc`) for containerd.
When containerd starts a container, it requests the rootfs snapshots from the stargz snapshotter daemon through a Unix socket.
The snapshotter remotely mounts the requested stargz layers from registries onto the node and provides these mount points to containerd as remote snapshots.

## CRFS Stargz Image Format
Containerd recognizes this plugin through a Unix socket specified in the configuration file (e.g. `/etc/containerd/config.toml`).
Stargz snapshotter can also be used through Kubernetes CRI by specifying the snapshotter name in the CRI plugin configuration.
We assume that you are using a containerd build that includes at least [commit `d8506bf`](https://github.com/containerd/containerd/commit/d8506bfd7b407dcb346149bcec3ed3c19244e3f1).

Stargz Snapshotter supports stargz image format.
This format is backwards-compatible to container standards so you can manage formatted images in same ways as standard container images e.g. pushing to and pulling from container registries, running it with Docker, etc.
When you run a stargz-formatted image, Stargz Snapshotter prepares container's rootfs layers as remote snapshots and actual file contents are fetched in chunk granularity on each access to each file.
You can also use `~/.docker/config.json`-based authentication for your private registries.
```toml
version = 2

Because file contents are fetched via NW on each access, read performance would be one of the major concerns.
To mitigate it, Stargz Snapshotter provides additional workload-oriented optimization.
When you convert an image to stargz format using `ctr-remote`, you can specify some options which describe your workload (i.e. entrypoint commands, environment variables, etc.).
For example, we can optimize `ubuntu:18.04` image for execution of `ls` command on `bash` as following,
# Plug stargz snapshotter into containerd
# Containerd recognizes stargz snapshotter through the specified socket address.
# The address below is the default one that stargz snapshotter listens on.
[proxy_plugins]
[proxy_plugins.stargz]
type = "snapshot"
address = "/run/containerd-stargz-grpc/containerd-stargz-grpc.sock"

# Use stargz snapshotter through CRI
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "stargz"
```
# ctr-remote image optimize --plain-http --entrypoint='[ "/bin/bash", "-c" ]' --args='[ "ls" ]' \
ubuntu:18.04 http://registry2:5000/ubuntu:18.04

This repo contains [a Dockerfile as a KinD node image](/Dockerfile) which includes the above configuration.

## State directory

Stargz snapshotter mounts stargz layers from registries onto the node using FUSE.
Metadata for all files in the image is preserved on the filesystem, and file contents are fetched from registries on demand.

At the root of the filesystem, there is a *state directory* (`/.stargz-snapshotter`) for monitoring the status of the filesystem.
This directory is hidden from `getdents(2)`, so it does not appear in the output of `ls -a /`.
Instead, you can access the directory directly by specifying its path (`/.stargz-snapshotter`).

The state directory contains a JSON-formatted metadata file for each layer.
In the following example, metadata JSON files for 7 overlaid layers are visible.
Each metadata JSON file contains the following fields:

- `digest` contains the layer digest. This is the same value as the one in the image's manifest.
- `size` is the size of the layer in bytes.
- `fetchedSize` and `fetchedPercent` indicate how many bytes have been fetched for this layer. Stargz snapshotter aggressively downloads the layer in the background, so these values gradually increase. When `fetchedPercent` reaches `100`, the layer has been fully downloaded to the node and no further remote access is needed for reading files.

Note that the state directory layout and the metadata JSON structure are subject to change.

```console
# ctr-remote run --rm -t --snapshotter=stargz docker.io/stargz/golang:1.12.9-esgz test /bin/bash
root@1d43741b8d29:/go# ls -a /
. bin dev go lib media opt root sbin sys usr
.. boot etc home lib64 mnt proc run srv tmp var
root@1d43741b8d29:/go# ls /.stargz-snapshotter/*
/.stargz-snapshotter/sha256:2b1fc65cafe05b65acc9e9f186df4dd81ae74c58ef73d89ecfc15e7286b3e960.json
/.stargz-snapshotter/sha256:42d56485c1f672e394a02855048774621731c8fd44a54dc816a421a3a52b8482.json
/.stargz-snapshotter/sha256:6a5826d877de5c93fb4a9e1d0369cfdef6d43df2610562501ebf42e4bcb2ef73.json
/.stargz-snapshotter/sha256:a4d35801573274df19d9c2ae2aed80eba96d5aa69a38c464e1f01f9abf81e34e.json
/.stargz-snapshotter/sha256:ab13100112faac6e04d2da2281db3df942efc8cef2532ab2cac688c6232944d8.json
/.stargz-snapshotter/sha256:e8cc31024eb09fe216ad906392aec139038330c6d29dfd3fe5c81c4b2dd21430.json
/.stargz-snapshotter/sha256:f077511be7d385c17ba88980379c5cd0aab7068844dffa7a1cefbf68cc3daea3.json
root@1d43741b8d29:/go# cat /.stargz-snapshotter/*
{"digest":"sha256:2b1fc65cafe05b65acc9e9f186df4dd81ae74c58ef73d89ecfc15e7286b3e960","size":131339690,"fetchedSize":7939690,"fetchedPercent":6.045156646859757}
{"digest":"sha256:42d56485c1f672e394a02855048774621731c8fd44a54dc816a421a3a52b8482","size":10047608,"fetchedSize":2047608,"fetchedPercent":20.379059374131632}
{"digest":"sha256:6a5826d877de5c93fb4a9e1d0369cfdef6d43df2610562501ebf42e4bcb2ef73","size":54352828,"fetchedSize":2302828,"fetchedPercent":4.236813584014432}
{"digest":"sha256:a4d35801573274df19d9c2ae2aed80eba96d5aa69a38c464e1f01f9abf81e34e","size":70359295,"fetchedSize":2259295,"fetchedPercent":3.211082487395588}
{"digest":"sha256:ab13100112faac6e04d2da2281db3df942efc8cef2532ab2cac688c6232944d8","size":7890588,"fetchedSize":2140588,"fetchedPercent":27.12837116828302}
{"digest":"sha256:e8cc31024eb09fe216ad906392aec139038330c6d29dfd3fe5c81c4b2dd21430","size":52934435,"fetchedSize":2634435,"fetchedPercent":4.976788738748227}
{"digest":"sha256:f077511be7d385c17ba88980379c5cd0aab7068844dffa7a1cefbf68cc3daea3","size":580,"fetchedSize":580,"fetchedPercent":100}
```
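
The status records shown above lend themselves to simple progress reporting. The following Python sketch is only an illustration and is not part of stargz snapshotter: `read_state_dir` and `summarize` are hypothetical helper names, and the field names are taken from the example output above, whose layout is subject to change as noted.

```python
import json
import os


def read_state_dir(path="/.stargz-snapshotter"):
    """Return the raw contents of every *.json file in the state directory.

    The directory does not show up when listing `/`, but listing the
    directory itself by its explicit path works, as in the console example.
    """
    records = []
    for name in sorted(os.listdir(path)):
        if name.endswith(".json"):
            with open(os.path.join(path, name)) as f:
                records.append(f.read())
    return records


def summarize(records):
    """Aggregate per-layer status records into overall fetch progress."""
    layers = [json.loads(r) for r in records if r.strip()]
    total = sum(layer["size"] for layer in layers)
    fetched = sum(layer["fetchedSize"] for layer in layers)
    return {
        "layers": len(layers),
        "size": total,
        "fetchedSize": fetched,
        "fetchedPercent": 100.0 * fetched / total if total else 0.0,
        # Layers that no longer need remote access for reads.
        "complete": [l["digest"] for l in layers if l["fetchedPercent"] >= 100],
    }


# Two records copied from the example output above.
SAMPLE = [
    '{"digest":"sha256:42d56485c1f672e394a02855048774621731c8fd44a54dc816a421a3a52b8482","size":10047608,"fetchedSize":2047608,"fetchedPercent":20.379059374131632}',
    '{"digest":"sha256:f077511be7d385c17ba88980379c5cd0aab7068844dffa7a1cefbf68cc3daea3","size":580,"fetchedSize":580,"fetchedPercent":100}',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Inside a container, `summarize(read_state_dir())` would report progress across all layers, assuming the state directory keeps the layout shown in the example.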

## Registry-related configuration

You can customize how stargz snapshotter accesses registries through a config file.
The config file must be written in TOML and can be passed to stargz snapshotter with the `--config` option.

### Authentication

Stargz snapshotter doesn't share private registry credentials with containerd.
Instead, it supports the following authentication methods:

- Using `$DOCKER_CONFIG` or `~/.docker/config.json`
- Using Kubernetes secrets (type = `kubernetes.io/dockerconfigjson`)

By default, this snapshotter tries to get credentials from `$DOCKER_CONFIG` or `~/.docker/config.json`.
The following example enables stargz snapshotter to access private registries using the `docker login` command.
Because stargz snapshotter doesn't share credentials with containerd, the credentials specified by `ctr-remote`'s `--user` option in the example are used only by containerd.

```console
# docker login
(Enter username and password)
# ctr-remote image rpull --user <username>:<password> docker.io/<your-repository>/ubuntu:18.04
```
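
To make the Docker config lookup more concrete, here is a minimal Python sketch of how credentials written by `docker login` could be located and decoded. It is an assumption-laden illustration, not the snapshotter's implementation: `docker_config_path` and `basic_credentials` are hypothetical names, the sketch treats `DOCKER_CONFIG` as the directory that holds `config.json` (the Docker CLI convention), and it only handles inline `auth` entries, not credential helpers. The registry name and credentials in the example are made up.

```python
import base64
import json
import os


def docker_config_path(env=None):
    """Return the config.json path, preferring $DOCKER_CONFIG over ~/.docker."""
    env = os.environ if env is None else env
    config_dir = env.get("DOCKER_CONFIG") or os.path.join(
        os.path.expanduser("~"), ".docker"
    )
    return os.path.join(config_dir, "config.json")


def basic_credentials(config, registry):
    """Decode the base64 `auth` entry for a registry into (username, password).

    Returns None when the registry has no inline `auth` entry.
    """
    entry = config.get("auths", {}).get(registry)
    if not entry or "auth" not in entry:
        return None
    decoded = base64.b64decode(entry["auth"]).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Hypothetical config contents, shaped like the file `docker login` writes.
EXAMPLE_CONFIG = {
    "auths": {
        "exampleregistry.io": {
            "auth": base64.b64encode(b"alice:s3cret").decode("ascii"),
        }
    }
}

if __name__ == "__main__":
    print(docker_config_path())
    print(basic_credentials(EXAMPLE_CONFIG, "exampleregistry.io"))
    # A real lookup would load the file instead:
    #   with open(docker_config_path()) as f:
    #       config = json.load(f)
```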

The following configuration enables stargz snapshotter to access private registries using Kubernetes secrets (type = `kubernetes.io/dockerconfigjson`) in the cluster using kubeconfig files.
You can specify the path of the kubeconfig file using the `kubeconfig_path` option.
The specified file does not have to exist when the snapshotter starts.
In this case, the snapshotter polls the file until it is actually provided.
This is useful for some environments (e.g. single node cluster with containerized apiserver) where stargz snapshotter needs to start before everything else, including booting containerd/kubelet/apiserver and configuring users/roles.
If no `kubeconfig_path` is specified, the snapshotter searches for kubeconfig files in `$KUBECONFIG` or `~/.kube/config`.

```toml
# Use Kubernetes secrets accessible by the kubeconfig `/etc/kubernetes/snapshotter/config.conf`.
[kubeconfig_keychain]
enable_keychain = true
kubeconfig_path = "/etc/kubernetes/snapshotter/config.conf"
```

The config file can be passed to stargz snapshotter using `containerd-stargz-grpc`'s `--config` option.
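
The kubeconfig lookup and polling behavior described in this section can be sketched as follows. This Python sketch is illustrative only: `kubeconfig_candidates` and `wait_for_file` are hypothetical names, the polling interval is arbitrary, and the precedence (explicit `kubeconfig_path`, then `$KUBECONFIG`, then `~/.kube/config`) is an assumption based on the description above rather than on the snapshotter's source.

```python
import os
import time


def kubeconfig_candidates(kubeconfig_path=None, env=None):
    """Return kubeconfig paths to try, in the assumed order of precedence."""
    env = os.environ if env is None else env
    if kubeconfig_path:
        return [kubeconfig_path]
    # $KUBECONFIG may hold several paths joined by the OS path separator.
    from_env = [p for p in env.get("KUBECONFIG", "").split(os.pathsep) if p]
    if from_env:
        return from_env
    return [os.path.join(os.path.expanduser("~"), ".kube", "config")]


def wait_for_file(path, interval=1.0, timeout=None):
    """Poll until `path` exists.

    Returns True once the file appears, or False if `timeout` seconds pass
    first. With timeout=None the loop polls indefinitely.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not os.path.exists(path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


if __name__ == "__main__":
    path = kubeconfig_candidates("/etc/kubernetes/snapshotter/config.conf")[0]
    print(path, "available:", wait_for_file(path, interval=0.1, timeout=0.5))
```

Polling instead of failing at startup is what lets the daemon start before the API server has written the kubeconfig.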

### Registry mirrors and insecure connection

You can also configure mirrored registries and insecure connections.
The hostname used as a mirror host can be specified using the `host` option.
If the optional `insecure` field is `true`, the snapshotter connects to the registry using plain HTTP instead of HTTPS.

```toml
# Use `mirrorhost.io` as a mirrored host of `exampleregistry.io` and
# use plain HTTP for connecting to the mirror host.
[[resolver.host."exampleregistry.io".mirrors]]
host = "mirrorhost.io"
insecure = true

# Use plain HTTP for connecting to `exampleregistry.io`.
[[resolver.host."exampleregistry.io".mirrors]]
host = "exampleregistry.io"
insecure = true
```
```
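
One way to read the mirror configuration is as an ordered list of endpoints to try for a given registry host. The Python sketch below illustrates that reading under stated assumptions: `candidate_endpoints` is a hypothetical helper, the dictionary mirrors the shape a TOML parser would produce for the `resolver.host` table, and both the "try mirrors in listed order" rule and the HTTPS fallback to the original host are assumptions, not confirmed snapshotter behavior.

```python
def candidate_endpoints(resolver_hosts, host):
    """Return base URLs to try for `host`, mirrors first.

    A mirror with a truthy `insecure` field is reached over plain HTTP,
    every other endpoint over HTTPS.
    """
    endpoints = []
    listed_hosts = set()
    for mirror in resolver_hosts.get(host, {}).get("mirrors", []):
        scheme = "http" if mirror.get("insecure", False) else "https"
        endpoints.append(f"{scheme}://{mirror['host']}")
        listed_hosts.add(mirror["host"])
    # Assumption: fall back to the original host over HTTPS unless the
    # configuration already lists it explicitly as one of the mirrors.
    if host not in listed_hosts:
        endpoints.append(f"https://{host}")
    return endpoints


# Parsed form of the configuration above (the `resolver.host` table).
RESOLVER_HOSTS = {
    "exampleregistry.io": {
        "mirrors": [
            {"host": "mirrorhost.io", "insecure": True},
            {"host": "exampleregistry.io", "insecure": True},
        ]
    }
}

if __name__ == "__main__":
    for url in candidate_endpoints(RESOLVER_HOSTS, "exampleregistry.io"):
        print(url)
```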

Then it runs the workload in an isolated environment, monitors all file events, and marks the accessed files (which are also likely to be accessed in your production environment).
When Stargz Snapshotter prepares the rootfs, it prefetches and caches the marked files.
This prefetch finishes quickly in most cases because the total size of the marked files is commonly much smaller than the entire image (Docker Slim is one of the well-known optimization tools which leverages this property).
Eventually, you will read most contents from the cache during runtime, which leads to much better read performance.
The config file can be passed to stargz snapshotter using `containerd-stargz-grpc`'s `--config` option.

## Make your remote snapshotter

It is easy for you to implement your remote snapshotter using [our general snapshotter package](/snapshot) without considering the protocol between that and containerd.
It isn't difficult for you to implement your remote snapshotter using [our general snapshotter package](/snapshot) without considering the protocol between that and containerd.
You can configure the remote snapshotter with your `FileSystem` structure which you want to use as a backend filesystem.
[Our snapshotter command](/cmd/containerd-stargz-grpc/main.go) is a good example for the integration.
