-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1.8x slower compared to go-fuse #78
Comments
Check out distr1/distri#59 for a list of known performance improvements. In particular, I think torvalds/linux@5da784c will help with throughput, but I haven’t had the time/motivation to implement it yet. Is throughput important for your use-case? Would you be interested in contributing it? |
@stapelberg, thanks for feedback. Both throughput and latency are important for my case. The max-pages commit (torvalds/linux@5da784c) is indeed very related, but for now I was benchmarking both go-fuse and jacobsa/fuse without it - on Debian10 4.19 linux kernel where it is not yet present and not yet backported. In other words with max-pages both go-fuse and jacobsa will benefit from raising read buffer from 128K to e.g. 1M, likely preserving, at least qualitatively, performance comparision to each other. I looked through the list of other planned performance improvements, and none of the seem to be related to my simple benchmark. In other words, what I'm seeing, is that somehow jacobsa/fuse implements basic FUSE requests receive/reply and similar core FUSE protocol operations slower than go-fuse(*). Is that indeed the case? I, of course, am ready to contribute improvements, like I already did with go-fuse and the kernel where it was needed for my filesystem. However before contributing I need to better understand which library to build on. jacobsa/fuse API is somewhat cleaner, but (imho) the major speed difference neglects this argument. Thanks again for feedback, P.S. good idea with disti. qemu-runlinux might help you debug why bdi misestimates write throughput you are seeing in (*) I comment inline about every planned improvement and show that it was an equal comparision in between go-fuse and jacobsa/fuse:
Yes, go-fuse has it enabled (2), but the test benchmark is very simple program that reads from only single file. In other words FUSE_PARALLEL_DIROPS should be unrelated to this benchmark.
Judging from distr1/distri@22697ba2dab1 this seems to be more related to caching/invlidation directory entries. However, once again, the benchmark is only about one file and mainly issues many READ requests to that file.
Once again, unrelated to the benchmark.
Unrelated to this benchmark.
Yes, by default, whenever file on FUSE filesystem is opened, the kernel completely drops pagecache of it. It is possible to tell the kernel not to do so via returning FOPEN_KEEP_CACHE flag in response to OPEN request. Even with that, for filesystems that do not need to maintain any state on opened file handles, it is possible to save open round-trips by telling the kernel not to issue OPEN requests via
Should be unrelated to the benchmark, where only one file is used.
This should likely be a good thing to do, but neither go-fuse, nor jacobsa/fuse currently do it. |
Thanks for the thorough comparison.
I don’t know for sure, but one thing I noticed in passing is that there was a performance win in not spawning a goroutine per request in Lines 115 to 123 in 4ee1cf7
Thanks for the pointer! I already have a convenient way to start a distri image with a custom-built kernel and optionally kgdb, but that thread is about a separate, unrelated problem at work which I can’t reproduce in distri :) |
@stapelberg, thanks for feedback.
Yes, that might be very related in my experience. Go-fuse processes its requests this way, and as I already said this is exactly why I also tried to patch jacobsa/fuse to do it similarly https://lab.nexedi.com/kirr/jacobsa-fuse/commit/59a13571 but unfortunately without any change in observed performance.
You are welcome. |
Ok, I looked into FUSE protocol exchange that is happenning in between kernel and hanwen/jacobsa. First about
this turned out to be false due to in-memory filesystem now reporting Now let us see what is going on with test servers: jacobsa/fuse benchamarks at ~ 1.7 GB/s:
jacobsa/fuse fuse trace logfuse: Op 0x00000001 connection.go:408] <- init fuse: Op 0x00000001 connection.go:491] -> OK () fuse: Op 0x00000002 connection.go:408] <- GetInodeAttributes (inode 1) fuse: Op 0x00000002 connection.go:491] -> OK () fuse: Op 0x00000003 connection.go:408] <- LookUpInode (parent 1, name "hello") fuse: Op 0x00000003 connection.go:491] -> OK (inode 2) fuse: Op 0x00000004 connection.go:408] <- GetInodeAttributes (inode 2) fuse: Op 0x00000004 connection.go:491] -> OK () fuse: Op 0x00000005 connection.go:408] <- OpenFile (inode 2) fuse: Op 0x00000005 connection.go:491] -> OK () fuse: Op 0x00000006 connection.go:408] <- unknown (inode 2, opcode 40) fuse: Op 0x00000006 connection.go:493] -> Error: "function not implemented" fuse: Op 0x00000007 connection.go:408] <- ReadFile (inode 2, handle 0, offset 0, 131072 bytes) fuse: Op 0x00000007 connection.go:491] -> OK () fuse: Op 0x00000008 connection.go:408] <- ReadFile (inode 2, handle 0, offset 131072, 131072 bytes) fuse: Op 0x00000008 connection.go:491] -> OK () fuse: Op 0x00000009 connection.go:408] <- ReadFile (inode 2, handle 0, offset 262144, 131072 bytes) fuse: Op 0x00000009 connection.go:491] -> OK () fuse: Op 0x0000000a connection.go:408] <- ReadFile (inode 2, handle 0, offset 393216, 131072 bytes) fuse: Op 0x0000000a connection.go:491] -> OK () fuse: Op 0x0000000b connection.go:408] <- ReadFile (inode 2, handle 0, offset 524288, 131072 bytes) fuse: Op 0x0000000b connection.go:491] -> OK () fuse: Op 0x0000000c connection.go:408] <- ReadFile (inode 2, handle 0, offset 655360, 131072 bytes) fuse: Op 0x0000000c connection.go:491] -> OK () fuse: Op 0x0000000d connection.go:408] <- ReadFile (inode 2, handle 0, offset 786432, 131072 bytes) fuse: Op 0x0000000d connection.go:491] -> OK () fuse: Op 0x0000000e connection.go:408] <- ReadFile (inode 2, handle 0, offset 917504, 131072 bytes) fuse: Op 0x0000000e connection.go:491] -> OK () fuse: Op 0x0000000f connection.go:408] <- ReadFile (inode 2, handle 0, offset 1048576, 131072 bytes) fuse: Op 0x0000000f connection.go:491] -> OK () fuse: Op 0x00000010 connection.go:408] <- ReadFile (inode 2, handle 0, offset 1179648, 131072 bytes) fuse: Op 0x00000010 connection.go:491] -> OK () fuse: Op 0x00000011 connection.go:408] <- ReadFile (inode 2, handle 0, offset 1310720, 131072 bytes) fuse: Op 0x00000011 connection.go:491] -> OK () fuse: Op 0x00000012 connection.go:408] <- ReadFile (inode 2, handle 0, offset 1441792, 131072 bytes) fuse: Op 0x00000012 connection.go:491] -> OK () fuse: Op 0x00000013 connection.go:408] <- ReadFile (inode 2, handle 0, offset 1572864, 131072 bytes) fuse: Op 0x00000013 connection.go:491] -> OK () fuse: Op 0x00000014 connection.go:408] <- ReadFile (inode 2, handle 0, offset 1703936, 131072 bytes) fuse: Op 0x00000014 connection.go:491] -> OK () fuse: Op 0x00000015 connection.go:408] <- ReadFile (inode 2, handle 0, offset 1835008, 131072 bytes) fuse: Op 0x00000015 connection.go:491] -> OK () fuse: Op 0x00000016 connection.go:408] <- ReadFile (inode 2, handle 0, offset 1966080, 131072 bytes) ... fuse: Op 0x00000069 connection.go:408] <- ReadFile (inode 2, handle 0, offset 12845056, 131072 bytes) fuse: Op 0x00000069 connection.go:491] -> OK () fuse: Op 0x0000006a connection.go:408] <- ReadFile (inode 2, handle 0, offset 12976128, 131072 bytes) fuse: Op 0x0000006a connection.go:491] -> OK () fuse: Op 0x0000006b connection.go:408] <- GetInodeAttributes (inode 2) fuse: Op 0x0000006b connection.go:491] -> OK () fuse: Op 0x0000006c connection.go:408] <- FlushFile (inode 2) fuse: Op 0x0000006c connection.go:493] -> Error: "function not implemented" fuse: Op 0x0000006d connection.go:408] <- ReleaseFileHandle fuse: Op 0x0000006d connection.go:493] -> Error: "function not implemented" fuse: Op 0x0000006e connection.go:408] <- GetInodeAttributes (inode 1) fuse: Op 0x0000006e connection.go:491] -> OK () fuse: Op 0x0000006f connection.go:408] <- LookUpInode (parent 1, name "hello") fuse: Op 0x0000006f connection.go:491] -> OK (inode 2) fuse: Op 0x00000070 connection.go:408] <- GetInodeAttributes (inode 2) fuse: Op 0x00000070 connection.go:491] -> OK () fuse: Op 0x00000071 connection.go:408] <- OpenFile (inode 2) fuse: Op 0x00000071 connection.go:491] -> OK () fuse: Op 0x00000072 connection.go:408] <- ReadFile (inode 2, handle 0, offset 0, 131072 bytes) fuse: Op 0x00000072 connection.go:491] -> OK () fuse: Op 0x00000073 connection.go:408] <- ReadFile (inode 2, handle 0, offset 131072, 131072 bytes) fuse: Op 0x00000073 connection.go:491] -> OK () fuse: Op 0x00000074 connection.go:408] <- ReadFile (inode 2, handle 0, offset 262144, 131072 bytes) fuse: Op 0x00000074 connection.go:491] -> OK () fuse: Op 0x00000075 connection.go:408] <- ReadFile (inode 2, handle 0, offset 393216, 131072 bytes) fuse: Op 0x00000075 connection.go:491] -> OK () fuse: Op 0x00000076 connection.go:408] <- ReadFile (inode 2, handle 0, offset 524288, 131072 bytes) fuse: Op 0x00000076 connection.go:491] -> OK () fuse: Op 0x00000077 connection.go:408] <- ReadFile (inode 2, handle 0, offset 655360, 131072 bytes) fuse: Op 0x00000077 connection.go:491] -> OK () fuse: Op 0x00000078 connection.go:408] <- ReadFile (inode 2, handle 0, offset 786432, 131072 bytes) fuse: Op 0x00000078 connection.go:491] -> OK () ... The kernel issues 128K read requests to the file and open request once in a while after file was completely read. No unexpected requests here. hanwen/go-fuse benchmarks at 2.7 GB/s:
hanwen/go-fuse trace log2020/03/12 16:31:21 rx 1: INIT i0 {7.27 Ra 0x20000 ATOMIC_O_TRUNC,SPLICE_READ,IOCTL_DIR,WRITEBACK_CACHE,POSIX_ACL,ABORT_ERROR,DONT_MASK,SPLICE_WRITE,SPLICE_MOVE,FLOCK_LOCKS,READDIRPLUS,READDIRPLUS_AUTO,ASYNC_DIO,PARALLEL_DIROPS,ASYNC_READ,POSIX_LOCKS,EXPORT_SUPPORT,AUTO_INVAL_DATA,HANDLE_KILLPRIV,BIG_WRITES,NO_OPEN_SUPPORT} 2020/03/12 16:31:21 tx 1: OK, {7.27 Ra 0x20000 PARALLEL_DIROPS,ASYNC_READ,AUTO_INVAL_DATA,READDIRPLUS,BIG_WRITES,NO_OPEN_SUPPORT 0/0 Wr 0x10000 Tg 0x0} 2020/03/12 16:31:21 rx 2: LOOKUP i1 [".go-fuse-epoll-hack"] 20b 2020/03/12 16:31:21 tx 2: 2=no such file or directory, {i0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:21 rx 3: CREATE i1 {0100100 [WRONLY,CREAT,TRUNC,0x8000] (022)} [".go-fuse-epoll-hack"] 20b 2020/03/12 16:31:21 tx 3: OK, {i18446744073709551615 g0 {M0100644 SZ=0 L=1 0:0 B0*0 i0:18446744073709551615 A 0.000000 M 0.000000 C 0.000000} &{18446744073709551615 0 0}} 2020/03/12 16:31:21 rx 4: POLL i18446744073709551615 2020/03/12 16:31:21 tx 4: 38=function not implemented 2020/03/12 16:31:21 rx 5: FLUSH i18446744073709551615 {Fh 18446744073709551615} 2020/03/12 16:31:21 tx 5: 34=numerical result out of range 2020/03/12 16:31:21 rx 6: RELEASE i18446744073709551615 {Fh 18446744073709551615 WRONLY,0x8000 L0} 2020/03/12 16:31:21 tx 6: 34=numerical result out of range 2020/03/12 16:31:28 rx 7: LOOKUP i1 ["file.txt"] 9b 2020/03/12 16:31:28 tx 7: OK, {i2 g0 tE=0s tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 8: OPEN i2 {O_RDONLY,0x8000} 2020/03/12 16:31:28 tx 8: OK, {Fh 0 } 2020/03/12 16:31:28 rx 9: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 9: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 10: READ i2 {Fh 0 [0 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 10: OK, 131072b data "\x1f\x8b\b\x00\x81\x95C^"... 2020/03/12 16:31:28 rx 11: READ i2 {Fh 0 [131072 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 11: OK, 131072b data "\x8ex\xe5\xed\x9d\xed\xc1o"... 2020/03/12 16:31:28 rx 12: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 12: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 13: READ i2 {Fh 0 [262144 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 13: OK, 131072b data "0/\xe6\xc0\x8e\xe6Uy"... 2020/03/12 16:31:28 rx 14: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 14: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 15: READ i2 {Fh 0 [393216 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 15: OK, 131072b data "\xf5\xe3\xc2Έ\xcfx\x8d"... 2020/03/12 16:31:28 rx 16: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 16: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 17: READ i2 {Fh 0 [524288 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 17: OK, 131072b data "\"\x0fUQ\xe5D\xf7F"... 2020/03/12 16:31:28 rx 18: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 18: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 19: READ i2 {Fh 0 [655360 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 19: OK, 131072b data "-\b\x04E#\x8az\xc4"... 2020/03/12 16:31:28 rx 20: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 20: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 21: READ i2 {Fh 0 [786432 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 21: OK, 131072b data "\xff\xa3b\x12\xa2\xa4i3"... 2020/03/12 16:31:28 rx 22: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 22: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 23: READ i2 {Fh 0 [917504 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 23: OK, 131072b data "\x03_\x8f\xf4\xdcM\xed\xd9"... ... 2020/03/12 16:31:28 rx 204: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 204: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 205: READ i2 {Fh 0 [12845056 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 205: OK, 131072b data "\xd3\x05\xbd2\xf7\x19\xf0("... 2020/03/12 16:31:28 rx 206: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 206: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 207: READ i2 {Fh 0 [12976128 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 207: OK, 128523b data "\x96H!\x84\xa4\xb7\xae\xbb"... 2020/03/12 16:31:28 rx 208: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 208: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 209: FLUSH i2 {Fh 0} 2020/03/12 16:31:28 tx 209: OK 2020/03/12 16:31:28 rx 210: RELEASE i2 {Fh 0 NONBLOCK,0x8000 L0} 2020/03/12 16:31:28 tx 210: OK 2020/03/12 16:31:28 rx 211: LOOKUP i1 ["file.txt"] 9b 2020/03/12 16:31:28 tx 211: OK, {i2 g0 tE=0s tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 212: OPEN i2 {O_RDONLY,0x8000} 2020/03/12 16:31:28 tx 212: OK, {Fh 0 } 2020/03/12 16:31:28 rx 213: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 213: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 214: READ i2 {Fh 0 [0 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 214: OK, 131072b data "\x1f\x8b\b\x00\x81\x95C^"... 2020/03/12 16:31:28 rx 215: READ i2 {Fh 0 [131072 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 215: OK, 131072b data "\x8ex\xe5\xed\x9d\xed\xc1o"... 2020/03/12 16:31:28 rx 216: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 216: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 217: READ i2 {Fh 0 [262144 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 217: OK, 131072b data "0/\xe6\xc0\x8e\xe6Uy"... 2020/03/12 16:31:28 rx 218: GETATTR i2 {Fh 0} 2020/03/12 16:31:28 tx 218: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:31:28 rx 219: READ i2 {Fh 0 [393216 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:31:28 tx 219: OK, 131072b data "\xf5\xe3\xc2Έ\xcfx\x8d"... ... Here, like with jacobsa/fuse, the kernel issues 128K read requests to the file and open request once in a while after file was completely read. However unlike jacobsa/fuse there are many extra GETATTR requests beause the kernel rechecks whether file size was changed due to AUTO_INVAL_DATA passed to it in init. If I tell the kernel to trust the program to invalidate its cache explicitly(+), flood of GETATTR requests is gone and the server starts to benchmark at 3.2 GB/s:
hanwen/go-fuse trace log with AUTO_INVAL_DATA deactivated2020/03/12 16:10:02 rx 1: INIT i0 {7.27 Ra 0x20000 ATOMIC_O_TRUNC,READDIRPLUS_AUTO,ASYNC_DIO,PARALLEL_DIROPS,ABORT_ERROR,BIG_WRITES,DONT_MASK,SPLICE_WRITE,SPLICE_READ,POSIX_ACL,FLOCK_LOCKS,READDIRPLUS,WRITEBACK_CACHE,NO_OPEN_SUPPORT,HANDLE_KILLPRIV,IOCTL_DIR,AUTO_INVAL_DATA,ASYNC_READ,POSIX_LOCKS,EXPORT_SUPPORT,SPLICE_MOVE} 2020/03/12 16:10:02 tx 1: OK, {7.27 Ra 0x20000 ASYNC_READ,PARALLEL_DIROPS,BIG_WRITES,NO_OPEN_SUPPORT,READDIRPLUS 0/0 Wr 0x10000 Tg 0x0} 2020/03/12 16:10:02 rx 2: LOOKUP i1 [".go-fuse-epoll-hack"] 20b 2020/03/12 16:10:02 tx 2: 2=no such file or directory, {i0 g0 tE=0s tA=0s {M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:10:02 rx 3: CREATE i1 {0100100 [TRUNC,WRONLY,CREAT,0x8000] (022)} [".go-fuse-epoll-hack"] 20b 2020/03/12 16:10:02 tx 3: OK, {i18446744073709551615 g0 {M0100644 SZ=0 L=1 0:0 B0*0 i0:18446744073709551615 A 0.000000 M 0.000000 C 0.000000} &{18446744073709551615 0 0}} 2020/03/12 16:10:02 rx 4: POLL i18446744073709551615 2020/03/12 16:10:02 tx 4: 38=function not implemented 2020/03/12 16:10:02 rx 5: FLUSH i18446744073709551615 {Fh 18446744073709551615} 2020/03/12 16:10:02 tx 5: 34=numerical result out of range 2020/03/12 16:10:02 rx 6: RELEASE i18446744073709551615 {Fh 18446744073709551615 WRONLY,0x8000 L0} 2020/03/12 16:10:02 tx 6: 34=numerical result out of range 2020/03/12 16:10:05 rx 7: LOOKUP i1 ["file.txt"] 9b 2020/03/12 16:10:05 tx 7: OK, {i2 g0 tE=0s tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:10:05 rx 8: OPEN i2 {O_RDONLY,0x8000} 2020/03/12 16:10:05 tx 8: OK, {Fh 0 } 2020/03/12 16:10:05 rx 9: READ i2 {Fh 0 [0 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 9: OK, 131072b data "\x1f\x8b\b\x00\x81\x95C^"... 2020/03/12 16:10:05 rx 10: READ i2 {Fh 0 [131072 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 10: OK, 131072b data "\x8ex\xe5\xed\x9d\xed\xc1o"... 2020/03/12 16:10:05 rx 11: READ i2 {Fh 0 [262144 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 11: OK, 131072b data "0/\xe6\xc0\x8e\xe6Uy"... 2020/03/12 16:10:05 rx 12: READ i2 {Fh 0 [393216 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 12: OK, 131072b data "\xf5\xe3\xc2Έ\xcfx\x8d"... 2020/03/12 16:10:05 rx 13: READ i2 {Fh 0 [524288 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 13: OK, 131072b data "\"\x0fUQ\xe5D\xf7F"... 2020/03/12 16:10:05 rx 14: READ i2 {Fh 0 [655360 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 14: OK, 131072b data "-\b\x04E#\x8az\xc4"... 2020/03/12 16:10:05 rx 15: READ i2 {Fh 0 [786432 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 15: OK, 131072b data "\xff\xa3b\x12\xa2\xa4i3"... 2020/03/12 16:10:05 rx 16: READ i2 {Fh 0 [917504 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 16: OK, 131072b data "\x03_\x8f\xf4\xdcM\xed\xd9"... 2020/03/12 16:10:05 rx 17: READ i2 {Fh 0 [1048576 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 17: OK, 131072b data "\n\xb6\n\xbbjn\x98\xf1"... 2020/03/12 16:10:05 rx 18: READ i2 {Fh 0 [1179648 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 18: OK, 131072b data "W%\xa5'G\xe5\x8f\xd3"... 2020/03/12 16:10:05 rx 19: READ i2 {Fh 0 [1310720 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 19: OK, 131072b data "\xcfnƲp\xacz\xe4"... 2020/03/12 16:10:05 rx 20: READ i2 {Fh 0 [1441792 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 20: OK, 131072b data "x\x06\xb7\x03|\"\xb7\x03"... 2020/03/12 16:10:05 rx 21: READ i2 {Fh 0 [1572864 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 21: OK, 131072b data "\r\xf1Ib]\a\xe2\x85"... 2020/03/12 16:10:05 rx 22: READ i2 {Fh 0 [1703936 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 22: OK, 131072b data "\xe1\xbcZ\xa8\xbd\xfb\xe8z"... 2020/03/12 16:10:05 rx 23: READ i2 {Fh 0 [1835008 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 23: OK, 131072b data "\xb2\b/\a>\x8a\xf0T"... 2020/03/12 16:10:05 rx 24: READ i2 {Fh 0 [1966080 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 24: OK, 131072b data "|\b/\x94\x1f\u008b\xe4"... 2020/03/12 16:10:05 rx 25: READ i2 {Fh 0 [2097152 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 25: OK, 131072b data "\x15\xdaL\x1c\x1fd\\\x98"... 2020/03/12 16:10:05 rx 26: READ i2 {Fh 0 [2228224 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 26: OK, 131072b data "s\x9f\xf2P\xb7\x13\xecd"... 2020/03/12 16:10:05 rx 27: READ i2 {Fh 0 [2359296 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 27: OK, 131072b data "\x9b\xca\rս^9|"... 2020/03/12 16:10:05 rx 28: READ i2 {Fh 0 [2490368 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 28: OK, 131072b data "s\x9c7sFU\xa4i"... 2020/03/12 16:10:05 rx 29: READ i2 {Fh 0 [2621440 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 29: OK, 131072b data "\xed\x12\"rV\v\xf9\u007f"... 2020/03/12 16:10:05 rx 30: READ i2 {Fh 0 [2752512 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 30: OK, 131072b data "\xb4PZ^i\x8b\x89\xff"... 2020/03/12 16:10:05 rx 31: READ i2 {Fh 0 [2883584 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 31: OK, 131072b data ">%\xbf\a\xcf\xca\xef\xc7"... 2020/03/12 16:10:05 rx 32: READ i2 {Fh 0 [3014656 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 32: OK, 131072b data "\xb6\xaf\xb97i\xf56\xd6"... 2020/03/12 16:10:05 rx 33: READ i2 {Fh 0 [3145728 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 33: OK, 131072b data "\xf8\x8b\x11\xff0\xf5\x17\xe2"... 2020/03/12 16:10:05 rx 34: READ i2 {Fh 0 [3276800 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 34: OK, 131072b data "\b\xde\x166iU^\u007f"... 2020/03/12 16:10:05 rx 35: READ i2 {Fh 0 [3407872 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 35: OK, 131072b data "O8\xe3\xcfg\xfee\xf8"... ... 2020/03/12 16:10:05 rx 104: READ i2 {Fh 0 [12451840 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 104: OK, 131072b data "\x8aߤ\xb1-\xe0\u007f\x1f"... 2020/03/12 16:10:05 rx 105: READ i2 {Fh 0 [12582912 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 105: OK, 131072b data "\xa2\xeb \xa2s[\xe3\x88"... 2020/03/12 16:10:05 rx 106: READ i2 {Fh 0 [12713984 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 106: OK, 131072b data "b\xa9\xe5\xef\xd3\xf2\xaf8"... 2020/03/12 16:10:05 rx 107: READ i2 {Fh 0 [12845056 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 107: OK, 131072b data "\xd3\x05\xbd2\xf7\x19\xf0("... 2020/03/12 16:10:05 rx 108: READ i2 {Fh 0 [12976128 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 108: OK, 128523b data "\x96H!\x84\xa4\xb7\xae\xbb"... 2020/03/12 16:10:05 rx 109: GETATTR i2 {Fh 0} 2020/03/12 16:10:05 tx 109: OK, {tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:10:05 rx 110: FLUSH i2 {Fh 0} 2020/03/12 16:10:05 tx 110: OK 2020/03/12 16:10:05 rx 111: RELEASE i2 {Fh 0 NONBLOCK,0x8000 L0} 2020/03/12 16:10:05 tx 111: OK 2020/03/12 16:10:05 rx 112: LOOKUP i1 ["file.txt"] 9b 2020/03/12 16:10:05 tx 112: OK, {i2 g0 tE=0s tA=0s {M0100644 SZ=13104651 L=0 0:0 B25600*4096 i0:2 A 0.000000 M 0.000000 C 0.000000}} 2020/03/12 16:10:05 rx 113: OPEN i2 {O_RDONLY,0x8000} 2020/03/12 16:10:05 tx 113: OK, {Fh 0 } 2020/03/12 16:10:05 rx 114: READ i2 {Fh 0 [0 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 114: OK, 131072b data "\x1f\x8b\b\x00\x81\x95C^"... 2020/03/12 16:10:05 rx 115: READ i2 {Fh 0 [131072 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 115: OK, 131072b data "\x8ex\xe5\xed\x9d\xed\xc1o"... 2020/03/12 16:10:05 rx 116: READ i2 {Fh 0 [262144 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 116: OK, 131072b data "0/\xe6\xc0\x8e\xe6Uy"... 2020/03/12 16:10:05 rx 117: READ i2 {Fh 0 [393216 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 117: OK, 131072b data "\xf5\xe3\xc2Έ\xcfx\x8d"... 2020/03/12 16:10:05 rx 118: READ i2 {Fh 0 [524288 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 118: OK, 131072b data "\"\x0fUQ\xe5D\xf7F"... 2020/03/12 16:10:05 rx 119: READ i2 {Fh 0 [655360 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 119: OK, 131072b data "-\b\x04E#\x8az\xc4"... 2020/03/12 16:10:05 rx 120: READ i2 {Fh 0 [786432 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 120: OK, 131072b data "\xff\xa3b\x12\xa2\xa4i3"... 2020/03/12 16:10:05 rx 121: READ i2 {Fh 0 [917504 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 121: OK, 131072b data "\x03_\x8f\xf4\xdcM\xed\xd9"... 2020/03/12 16:10:05 rx 122: READ i2 {Fh 0 [1048576 +131072) L 0 NONBLOCK,0x8000} 2020/03/12 16:10:05 tx 122: OK, 131072b data "\n\xb6\n\xbbjn\x98\xf1"... ... In other words what we see is that under more or less equal client conditions jacobsa/fuse performs ~1.8x (1.7GB/s vs 3.2GB/s) slower compared to hanwen/go-fuse. Am I missing something? Kirill (*) --- a/fs/mem.go
+++ b/fs/mem.go
@@ -29,7 +29,8 @@ var _ = (NodeSetattrer)((*MemRegularFile)(nil))
var _ = (NodeFlusher)((*MemRegularFile)(nil))
func (f *MemRegularFile) Open(ctx context.Context, flags uint32) (fh FileHandle, fuseFlags uint32, errno syscall.Errno) {
- return nil, fuse.FOPEN_KEEP_CACHE, OK
+ //return nil, fuse.FOPEN_KEEP_CACHE, OK
+ return nil, 0, OK
} (+) --- a/example/hello/main.go
+++ b/example/hello/main.go
@@ -46,6 +58,7 @@ func main() {
log.Fatal("Usage:\n hello MOUNTPOINT")
}
opts := &fs.Options{}
+ opts.ExplicitDataCacheControl = true
opts.Debug = *debug
server, err := fs.Mount(flag.Arg(0), &HelloRoot{}, opts)
if err != nil {
|
Yes, sure, I could try. Splicing should be unrelated to this benchmark because it only splices data from another file descriptor, not memory buffer we are serving. As expected, the following patch does not make any difference: --- a/fuse/splice_linux.go
+++ b/fuse/splice_linux.go
@@ -12,7 +12,8 @@ import (
)
func (s *Server) setSplice() {
- s.canSplice = splice.Resizable()
+ s.canSplice = false;
+ //s.canSplice = splice.Resizable()
}
// trySplice: Zero-copy read from fdData.Fd into /dev/fuse
@@ -29,6 +30,7 @@ func (s *Server) setSplice() {
// This dance is neccessary because header and payload cannot be split across
// two splices and we cannot seek in a pipe buffer.
func (ms *Server) trySplice(header []byte, req *request, fdData *readResultFd) error {
+ panic("should not be called")
var err error
// Get a pair of connected pipes Lack of writev support does add some overhead, but go-fuse is still far faster than jacobsa/fuse: --- a/fuse/server_linux.go
+++ b/fuse/server_linux.go
@@ -9,6 +9,8 @@ import (
"syscall"
)
+var mem bufferPool
+
func (ms *Server) systemWrite(req *request, header []byte) Status {
if req.flatDataSize() == 0 {
err := handleEINTR(func() error {
@@ -34,7 +36,14 @@ func (ms *Server) systemWrite(req *request, header []byte) Status {
header = req.serializeHeader(len(req.flatData))
}
- _, err := writev(ms.mountFd, [][]byte{header, req.flatData})
+ //_, err := writev(ms.mountFd, [][]byte{header, req.flatData})
+
+ msg := mem.AllocBuffer(uint32(len(header) + len(req.flatData)))
+ defer mem.FreeBuffer(msg)
+ copy(msg, header)
+ copy(msg[len(header):], req.flatData)
+ _, err := syscall.Write(ms.mountFd, msg)
+
if req.readResult != nil {
req.readResult.Done()
}
Whatever it is, I could indeed continue to dig and find the root cause(es) of the slowness myself. Kirill |
I think Aaron has moved on, but I’m keeping this package maintained as best as time permits. I’m using it in distri. If you could indeed identify the slowness and contribute a fix, that’d be great! |
@stapelberg, thanks for feedback; I see. I think I could probably indeed have a deeper look at jacobsa/fuse, when updating base library of my filesystem, but unfortunately not now. |
Cool. I’m not in a hurry, so feel free to come back with more questions/improvements when the time is right. |
@navytux :
Is looking into this still a possiblity? |
@paulwratt, thanks for asking. My application(*) currently uses go-fuse, and there is no neither short-, nor middle-term plans to rework it. (*) https://lab.nexedi.com/nexedi/wendelin.core/blob/61dc1ff2a1ad/wcfs/wcfs.go |
Hello up there.
I'm currently using go-fuse library for my filesystem, but from time to time I'm also rechecking on how it goes with jacobsa/fuse. In 2018 jacobsa/fuse was ~1.6x slower (2) compared to hanwen/go-fuse, and today (2020, Mar.11)
I've found that hanwen/go-fuse improved its throughput, while jacobsa/fuse stays at approximately the same performance level(*). Today, the ratio in speed is ~3x (5.7 GB/s vs ~ 1.9 GB/s).edit: today the ratio is ~1.8x - see #78 (comment).Is there anything I'm missing?
I've looked only briefly and could indeed miss something obvious which could be related to performance or generally. In such a case I appologize for creating the noise.
Thanks beforehand for feedback,
Kirill
/cc @jacobsa, @stapelberg
(*) both at current master and even if I try to patch it to handle requests directly instead of in a goroutine (1, 2).
The text was updated successfully, but these errors were encountered: