Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure when doing large extract operation #172

Open
johnp789 opened this issue Nov 14, 2018 · 2 comments
Open

Failure when doing large extract operation #172

johnp789 opened this issue Nov 14, 2018 · 2 comments
Labels

Comments

@johnp789
Copy link

I'm hosting a 592 MB chunk store on a remote machine using Python 3.7.1's http.server. A casync extract on a Raspberry Pi 3 fails with a broken pipe error.

$ uname -a
Linux alarmpi 4.14.78-2-ARCH #1 SMP Sun Oct 28 17:05:07 UTC 2018 armv7l GNU/Linux
$ sudo strace -ff -o /tmp/trace casync extract -v --store=http://pedro:8000/default.castr /tmp/usr.caidx /mnt/home/john/usr
[...]
Acquiring http://pedro:8000/default.castr/05f6/05f6b0f31d8f3d0f209dfadceff29655e7ccde2edeadaa3201df660ed1262180.cacnk...
Acquiring http://pedro:8000/default.castr/c83a/c83af52286b19cb26e89af2fbfa677dc360214ab0e9d2feea13f11b4a7d13461.cacnk...
Extracted bin/mke2fs
Extracting bin/mkfifo
Extracted bin/mkfifo
Extracting bin/mkfs
Extracted bin/mkfs
Extracting bin/mkfs.bfs
Extracted bin/mkfs.bfs
Extracting bin/mkfs.cramfs
Acquiring http://pedro:8000/default.castr/c571/c571bf8f795d5e769e34866e73ce399d7c026e7cc605c7534233ae60ee2d6e79.cacnk...
Extracted bin/mkfs.cramfs
Extracting bin/mkfs.ext2
Extracted bin/mkfs.ext2
Extracting bin/mkfs.ext3
Failed to acquire http://pedro:8000/default.castr/c571/c571bf8f795d5e769e34866e73ce399d7c026e7cc605c7534233ae60ee2d6e79.cacnk
Failed to run synchronizer: Broken pipe

Some of the strace output:

rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
poll(NULL, 0, 49)                       = 0 (Timeout)
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) = 4
setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
fcntl64(4, F_GETFL)                     = 0x2 (flags O_RDWR)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
connect(4, {sa_family=AF_INET6, sin6_port=htons(8000), inet_pton(AF_INET6, "fe80::4ecc:6aff:fe0a:8fc4", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=if_nametoindex("eth0")}, 2
8) = -1 EINPROGRESS (Operation now in progress)
poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 0) = 1 ([{fd=4, revents=POLLOUT|POLLWRNORM|POLLERR|POLLHUP}])
getsockopt(4, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
close(4)                                = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
write(2, "Failed to acquire http://pedro:8"..., 126) = 126
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x76b3be00}, NULL, 8) = 0
munmap(0x76639000, 200704)              = 0
openat(AT_FDCWD, "/var/tmp/1784b44719c08e1c.carem", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW|O_NOATIME|O_CLOEXEC|O_DIRECTORY) = 4

After that fails, trying to run the same command again gets a bit farther, but then fails with a different error message.

[...]
Seeded bin/mkfs.cramfs
Seeding bin/mkfs.ext2
Seeded bin/mkfs.ext2
Seeded bin
Seeded ./
Acquiring http://pedro:8000/default.castr/b357/b357a4e26a891c027c64be4a85aad150e4fb17fa7aa04a89b62687c9a1a581d7.cacnk...
Extracting ./
Using feature flags: 32bit-uids user-names nsec-time symlinks device-nodes fifos sockets flag-hidden flag-system flag-archive flag-append flag-noatime flag-compr flag-nocow flag-dirsync flag-immutable flag-sync flag-nocomp flag-projinherit subvolume subvolume-ro xattrs acl fcaps quota-projid
Excluding files and directories with chattr(1) -d flag: yes
Excluding submounts: no
Excluding files and directories listed in .caexclude: yes
Digest algorithm: sha512-256
Extracting bin
Extracting bin/2to3
Extracted bin/2to3
Extracting bin/2to3-3.7
Extracted bin/2to3-3.7
Extracting bin/[
Extracted bin/[
Extracting bin/accessdb
Acquiring http://pedro:8000/default.castr/e7cb/e7cbec22f9dd7758b96874060e16438530da61c5dd883458c38fdab87fadb0b4.cacnk...
Acquiring http://pedro:8000/default.castr/b7d8/b7d8b7fc64fd6973e8edd14d60b93b41ff016bf4c001cfc414be14e7b523d29b.cacnk...
\Acquiring http://pedro:8000/default.castr/f6af/f6afe383133395f55d620217b07dd8a5624adaa9f730c1470703f10ba432ab6a.cacnk...
Acquiring http://pedro:8000/default.castr/4285/4285e0c353df2cb6a462d4446672529f0d6259a2164551971fabdaf2c8744320.cacnk...
Extracted bin/accessdb
Extracting bin/aclocal
Extracted bin/aclocal
[...]
Acquiring http://pedro:8000/default.castr/df65/df65131c27f7aee9bab782ec27867e456364e5fa2ce3b9b75bb12460ae1a19c9.cacnk...
Extracted bin/autom4te
Extracting bin/automake
Acquiring http://pedro:8000/default.castr/1a40/1a405edafad73d56f9d90d3a744b861a543fd391e0cfcf12aa38a39ef335519a.cacnk...
Extracted bin/automake
Extracting bin/automake-1.16
Extracted bin/automake-1.16
Failed to run synchronizer: Stale file handle

Some of that strace output:

unlinkat(9, "low-priority/23", 0)       = 0
symlinkat("high-priority/23", 9, "chunks/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45") = 0
symlinkat("ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45", 9, "high-priority/23") = 0
readlinkat(6, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45", 0x20d2788, 255) = -1 ENOENT (No such file or directory)
openat(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45", O_RDONLY|O_NOCTTY|O_LARGEFILE|O_NOFOLLOW|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45.cacnk", O_RDONLY|O_NOCTTY|O_LARGEFILE|O_NOFOLLOW|O_CLOEXEC) = -1 ENOENT (No such file or directory)
readlinkat(9, "chunks/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45", "high-priority/23", 255) = 16
fstatat64(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45", 0x7efa66a8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
fstatat64(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45.cacnk", 0x7efa66b0, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
getrandom("\x50\xee\xe8\x6c\x3a\x00\xb9\x9b", 8, GRND_NONBLOCK) = 8
mkdirat(9, "ec7d", 0777)                = 0
openat(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45.9bb9003a6ce8ee50.tmp", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_LARGEFILE|O_CLOEXEC, 0444) = 14
write(14, "(\265/\375\0P\325K\2\252\212NvU\360Lh\233\08&\264~\32\207\227\5O_2\327\350"..., 18819) = 18819
close(14)                               = 0
renameat2(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45.9bb9003a6ce8ee50.tmp", 9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45.
cacnk", RENAME_NOREPLACE) = 0
readlinkat(6, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45", 0x20d2788, 255) = -1 ENOENT (No such file or directory)
openat(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45", O_RDONLY|O_NOCTTY|O_LARGEFILE|O_NOFOLLOW|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(9, "ec7d/ec7d392665e3cf17de47f4d469946da30416d9a020fee11786c15f276bb88a45.cacnk", O_RDONLY|O_NOCTTY|O_LARGEFILE|O_NOFOLLOW|O_CLOEXEC) = 14
read(14, "(\265/\375\0P\325K\2\252\212NvU\360Lh\233\08&\264~\32\207\227\5O_2\327\350"..., 65536) = 18819
mmap2(NULL, 1314816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x763bf000
read(14, "", 65536)                     = 0
munmap(0x763bf000, 1314816)             = 0
close(14)                               = 0
write(12, "ned.\n\t      elsif (! defined $co"..., 32067) = 32067
write(2, "Extracted bin/automake-1.16\n", 28) = 28
openat(4, "bin/automake", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW|O_CLOEXEC) = 14
fstat64(14, {st_mode=S_IFREG|0755, st_size=257949, ...}) = 0
close(14)                               = 0
write(2, "Failed to run synchronizer: Stal"..., 46) = 46
close(3)                                = 0
close(11)                               = 0

Is there a limit I'm running up against here? If so, is there a workaround?

@johnp789
Copy link
Author

Some experiments extracting the same chunk store on an x86_64 machine with plenty of memory show that the extraction requires more memory than the Raspberry Pi 3 has. Extracting on the Pi might be running into some other problem, however, given that the error message is different.

$ sudo systemd-run -t -p MemoryLimit=500M casync --store=http://pedro:8000/default.castr extract http://pedro:8000/usr.caidx /tmp/usr_test
Running as unit: run-u371.service
Press ^] three times within 1s to disconnect TTY.
Failed to run synchronizer: Cannot allocate memory
$ sudo rm -rf usr_test/
$ sudo systemd-run -t -p MemoryLimit=1500M casync --store=http://pedro:8000/default.castr extract http://pedro:8000/usr.caidx /tmp/usr_test
Running as unit: run-u380.service
Press ^] three times within 1s to disconnect TTY.
Failed to run synchronizer: Cannot allocate memory
$ sudo rm -rf usr_test/
$ sudo systemd-run -t -p MemoryLimit=3000M casync --store=http://pedro:8000/default.castr extract http://pedro:8000/usr.caidx /tmp/usr_test
Running as unit: run-u392.service
Press ^] three times within 1s to disconnect TTY.

@poettering poettering added the bug label Feb 12, 2019
@srd424
Copy link
Contributor

srd424 commented Jun 14, 2020

Second error here might be same as #240 .. was it running on a reflink-capable filesystem?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants