-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable retry and recover for tarfs #538
base: main
Are you sure you want to change the base?
Conversation
c001e9e
to
d276eea
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #538 +/- ##
==========================================
+ Coverage 33.63% 33.97% +0.33%
==========================================
Files 65 68 +3
Lines 8259 9352 +1093
==========================================
+ Hits 2778 3177 +399
- Misses 5166 5817 +651
- Partials 315 358 +43
|
if lock { | ||
st.mutex.Lock() | ||
} | ||
st.mutex.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why lock this by default, Can we add lock and release it after calling getSnapshotStatus
?
Otherwise, change getSnapshotStatus
to getSnapshotStatusWithLock
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ensure atomicity by acquiring the lock on snapshotStatus
with the Manager.mutex" locked. So renamed
getSnapshotStatusto
getSnapshotStatusWithLock`.
} | ||
|
||
// AttachLoopDevice attaches a specified backing file to a loop device | ||
func AttachLoopDevice(backingFile string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it enough if we use it by importing "github.com/containerd/containerd/mount"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some modification is needed, so coopied it here.
The field `blobTarFilePath` is only used once, and it's easy to build it on demand. So remove it from struct snapshotStatus. Signed-off-by: Jiang Liu <[email protected]>
`waitLayerReady()` alreayds ensures the layer is in ready state, so avoid redundant status check. Signed-off-by: Jiang Liu <[email protected]>
Refactor tarfs::blobProcess() for coming changes. Signed-off-by: Jiang Liu <[email protected]>
Tarfs downloads and converts layer content on background, there's no flexible way to report failure from background tasks to containerd. So retry downloading and converting tasks when preparing rw layer for container. If it still fails, failure will be reported to the rw layer preparation request. Signed-off-by: Jiang Liu <[email protected]>
Avoid redundant merge operation for images with only one layer. Signed-off-by: Jiang Liu <[email protected]>
When nydus snapshotter restarts, we need to recover information related to tarfs. Otherwise all tarfs instance will become unusable. Signed-off-by: Jiang Liu <[email protected]>
Import losetup.go from containerd, to replace go-losetup. Signed-off-by: Jiang Liu <[email protected]>
Replace go-losetup with the version from containerd, to support auto-clean unused loop devices. Signed-off-by: Jiang Liu <[email protected]>
On startup, we need to recover information for all tarfs related snapshots, and remount EROFS filesystems. Signed-off-by: Jiang Liu <[email protected]>
8835297
to
bfc11e7
Compare
ea0b54c
to
8f31e8e
Compare
62c85f1
to
cc2f9f6
Compare
Add unit test cases for tarfs. Signed-off-by: Jiang Liu <[email protected]>
Fix a data race condition WARNING: DATA RACE Write at 0x00c000178428 by goroutine 27: github.com/containerd/nydus-snapshotter/pkg/remote/remotes/docker.(*httpReadSeeker).Close() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/remote/remotes/docker/httpreadseeker.go:87 +0x57 github.com/containerd/nydus-snapshotter/pkg/tarfs.(*Manager).blobProcess.func2.1() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/tarfs/tarfs.go:339 +0x48 runtime.deferreturn() /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/panic.go:476 +0x32 github.com/containerd/nydus-snapshotter/pkg/tarfs.(*Manager).blobProcess.func3() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/tarfs/tarfs.go:394 +0x71 Previous read at 0x00c000178428 by goroutine 40: github.com/containerd/nydus-snapshotter/pkg/remote/remotes/docker.(*httpReadSeeker).Read() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/remote/remotes/docker/httpreadseeker.go:48 +0x68 bufio.(*Reader).Read() /opt/hostedtoolcache/go/1.20.1/x64/src/bufio/bufio.go:223 +0x2c3 github.com/containerd/containerd/archive/compression.(*bufferedReader).Read() /home/runner/go/pkg/mod/github.com/containerd/[email protected]/archive/compression/compression.go:113 +0xa4 io.copyBuffer() /opt/hostedtoolcache/go/1.20.1/x64/src/io/io.go:427 +0x28d io.Copy() /opt/hostedtoolcache/go/1.20.1/x64/src/io/io.go:386 +0x88 os.genericReadFrom() /opt/hostedtoolcache/go/1.20.1/x64/src/os/file.go:161 +0x34 os.(*File).ReadFrom() /opt/hostedtoolcache/go/1.20.1/x64/src/os/file.go:155 +0x324 io.copyBuffer() /opt/hostedtoolcache/go/1.20.1/x64/src/io/io.go:413 +0x1c5 io.Copy() /opt/hostedtoolcache/go/1.20.1/x64/src/io/io.go:386 +0x84 os/exec.(*Cmd).childStdin.func1() /opt/hostedtoolcache/go/1.20.1/x64/src/os/exec/exec.go:511 +0x45 os/exec.(*Cmd).Start.func2() /opt/hostedtoolcache/go/1.20.1/x64/src/os/exec/exec.go:717 +0x42 os/exec.(*Cmd).Start.func3() /opt/hostedtoolcache/go/1.20.1/x64/src/os/exec/exec.go:729 +0x47 Goroutine 27 (running) created at: github.com/containerd/nydus-snapshotter/pkg/tarfs.(*Manager).blobProcess() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/tarfs/tarfs.go:393 +0x9dd github.com/containerd/nydus-snapshotter/pkg/tarfs.(*Manager).PrepareLayer() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/tarfs/tarfs.go:465 +0x444 github.com/containerd/nydus-snapshotter/pkg/tarfs.TestPrepareLayer() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/tarfs/tarfs_test.go:33 +0x188 testing.tRunner() /opt/hostedtoolcache/go/1.20.1/x64/src/testing/testing.go:1576 +0x216 testing.(*T).Run.func1() /opt/hostedtoolcache/go/1.20.1/x64/src/testing/testing.go:1629 +0x47 Goroutine 40 (finished) created at: os/exec.(*Cmd).Start() /opt/hostedtoolcache/go/1.20.1/x64/src/os/exec/exec.go:716 +0xf8e github.com/containerd/containerd/archive/compression.cmdStream() /home/runner/go/pkg/mod/github.com/containerd/[email protected]/archive/compression/compression.go:284 +0x36f github.com/containerd/containerd/archive/compression.gzipDecompress() /home/runner/go/pkg/mod/github.com/containerd/[email protected]/archive/compression/compression.go:272 +0x152 github.com/containerd/containerd/archive/compression.DecompressStream() /home/runner/go/pkg/mod/github.com/containerd/[email protected]/archive/compression/compression.go:203 +0x3e4 github.com/containerd/nydus-snapshotter/pkg/tarfs.(*Manager).blobProcess.func2() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/tarfs/tarfs.go:341 +0x1b1 github.com/containerd/nydus-snapshotter/pkg/tarfs.(*Manager).blobProcess.func3() /home/runner/work/nydus-snapshotter/nydus-snapshotter/pkg/tarfs/tarfs.go:394 +0x71 ================== testing.go:1446: race detected during execution of test Signed-off-by: Jiang Liu <[email protected]>
cc2f9f6
to
e0124e7
Compare
- name: Setup Nydus | ||
run: | | ||
# Download nydus components | ||
NYDUS_VER=v$(curl --header 'authorization: Bearer ${{ secrets.GITHUB_TOKEN }}' -s "https://api.github.com/repos/dragonflyoss/nydus/releases/latest" | jq -r .tag_name | sed 's/^v//') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No auth is required when requesting the latest version.
// Download and convert layer content in background. | ||
// Will retry when the content is actually needed if the background process failed. | ||
go func() { | ||
_ = process(rc, remote) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should also give a warning log if an error happens.
Can we add some e2e test cases for tarfs in https://github.com/containerd/nydus-snapshotter/blob/main/integration/entrypoint.sh? |
Enhance tarfs implementation by: