off2 is sometimes very slow (processes in IOWAIT) #263
Comments
Lots of processes in iowait https://www.computel.fr/munin/openfoodfacts/off2.openfoodfacts/cpu.html |
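For readers without access to the Munin graph, the iowait share can also be estimated directly from two samples of the aggregate `cpu` line in `/proc/stat`. The following is only a minimal sketch: the function names `iowait_percent` and `read_cpu_line` are hypothetical and not part of any tooling used on off2, and the sampling interval is an arbitrary choice.

```python
import time


def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line (first line of /proc/stat on Linux)."""
    with open(path) as f:
        return f.readline()


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines."""
    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(x) for x in parts[1:]]

    deltas = [a - b for b, a in zip(fields(before), fields(after))]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so sum only 8 fields.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # arbitrary sampling interval (assumption)
    print(f"iowait: {iowait_percent(first, read_cpu_line()):.1f}%")
```

A persistently high value here would be consistent with processes waiting on the disks rather than on the CPU.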
@cquest Are the drives the same? Or does one of them have more constraints? |
Could be irrelevant but is it possible that some slow or inaccessible devices are in the PATH? |
I don't see anything special. As root on off2: |
I did a small test with dd to compare with my laptop (following this article):
On off2:
dd if=/dev/zero of=/home/alex/test.img bs=1G count=1 oflag=dsync
1073741824 bytes (1,1 GB, 1,0 GiB) copied, 0,995176 s, 1,1 GB/s
dd if=/dev/zero of=/home/alex/test.img bs=512 count=1000 oflag=dsync
512000 bytes (512 kB, 500 KiB) copied, 17,5354 s, 29,2 kB/s
On off1:
dd if=/dev/zero of=/home/alex/test.img bs=1G count=1 oflag=dsync
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.60762 s, 668 MB/s
dd if=/dev/zero of=/home/alex/test.img bs=512 count=1000 oflag=dsync
512000 bytes (512 kB, 500 KiB) copied, 8.94754 s, 57.2 kB/s
On my laptop:
dd if=/dev/zero of=/home/alex/test.img bs=1G count=1 oflag=dsync
1073741824 octets (1,1 GB, 1,0 GiB) copiés, 1,44455 s, 743 MB/s
dd if=/dev/zero of=/home/alex/test.img bs=512 count=1000 oflag=dsync
512000 octets (512 kB, 500 KiB) copiés, 4,1427 s, 124 kB/s
So write latency on off2 is about twice that of off1 (17.5 s versus 8.9 s for 1000 synchronous 512-byte writes), but not more. This is just a write test… it might not be very informative, as the performance problems seem to be more on reads. |
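To make the latency comparison explicit, the small-block dd summaries can be turned into an average time per synchronous write. This is a rough sketch only: `seconds_from_dd` and `latency_ms_per_write` are hypothetical helper names, and the parsing assumes the summary format shown above, with either a decimal point or a decimal comma depending on locale.

```python
import re

# Matches "<bytes> ... , <seconds> s," in a dd summary line (English or French locale).
DD_SUMMARY = re.compile(r"^(\d+) .*?, ([0-9]+(?:[.,][0-9]+)?) s,")


def seconds_from_dd(summary):
    """Return (bytes_copied, elapsed_seconds) parsed from a dd summary line."""
    match = DD_SUMMARY.match(summary.strip())
    if match is None:
        raise ValueError(f"unrecognised dd summary: {summary!r}")
    return int(match.group(1)), float(match.group(2).replace(",", "."))


def latency_ms_per_write(summary, block_size):
    """Average milliseconds per write, given the bs= value used for the run."""
    total_bytes, seconds = seconds_from_dd(summary)
    writes = total_bytes // block_size
    return 1000.0 * seconds / writes


results = {
    "off2": "512000 bytes (512 kB, 500 KiB) copied, 17,5354 s, 29,2 kB/s",
    "off1": "512000 bytes (512 kB, 500 KiB) copied, 8.94754 s, 57.2 kB/s",
    "laptop": "512000 octets (512 kB, 500 KiB) copiés, 4,1427 s, 124 kB/s",
}

for host, line in results.items():
    print(f"{host}: {latency_ms_per_write(line, 512):.2f} ms per write")
```

With the figures above, this gives roughly 17.5 ms per write on off2, 8.9 ms on off1 and 4.1 ms on the laptop, i.e. a factor of about 1.96 between off2 and off1.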
Read test (I can't be sure that the cache is not involved, and I don't want to invalidate the whole cache, so I also try with the nocache option):
off1:
dd if=/srv2/off/html/data/openfoodfacts-mongodbdump.gz of=/dev/null bs=8k
...
7989754834 bytes (8.0 GB, 7.4 GiB) copied, 51.0625 s, 156 MB/s
dd if=/srv2/off/html/data/openfoodfacts-mongodbdump.gz iflag=nocache of=/dev/null bs=8k
7989754834 bytes (8.0 GB, 7.4 GiB) copied, 39.3774 s, 203 MB/s
off2:
|
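The concern about cached reads can be approached per file rather than by dropping the whole page cache. The sketch below is an assumption-laden illustration, not what was run on the servers: `timed_read` is a hypothetical helper that asks the kernel, via `posix_fadvise` with `POSIX_FADV_DONTNEED`, to discard cached pages for one file before timing a sequential read. The call is advisory, so the kernel may still serve part of the file from cache.

```python
import os
import time


def timed_read(path, block_size=8192, drop_cache=False):
    """Read a file sequentially; return (bytes_read, elapsed_seconds)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if drop_cache and hasattr(os, "posix_fadvise"):
            try:
                # Advisory hint: drop cached pages for this file only (length 0 = whole file).
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # some filesystems may reject the hint; continue with a normal read
        total = 0
        start = time.perf_counter()
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:
                break
            total += len(chunk)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed
```

Dividing the returned byte count by the elapsed time gives a throughput comparable to the dd summaries, for example 7989754834 bytes in 51.0625 s is about 156 MB/s.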
Another example: 6 seconds to do a grep on 2 small files:
|
|
fail2ban nginx-bot and nginx-http were doing a lot of reads and writes and @alexgarel disabled them, but we still have very high usage of the disks:
after removing fail2ban nginx bots,
|
After setting atime=off on all volumes on the HDDs, things look much better:
|
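One way to check which volumes still update access times is to scan the mount table. This is a hedged sketch: it assumes the setting is reflected as a `noatime` option in `/proc/mounts`-style output, `mounts_updating_atime` is a hypothetical helper, and the sample mount entries are invented for illustration rather than taken from off2.

```python
def mounts_updating_atime(mounts_text, fstypes=None):
    """Return mount points whose options do not include 'noatime'.

    mounts_text uses the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    flagged = []
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = parts[:4]
        if fstypes is not None and fstype not in fstypes:
            continue
        if "noatime" not in options.split(","):
            flagged.append(mount_point)
    return flagged


# Invented sample entries (assumption), not the real off2 mount table.
SAMPLE = """\
pool/data /data zfs rw,noatime,xattr 0 0
pool/logs /logs zfs rw,relatime,xattr 0 0
proc /proc proc rw,nosuid 0 0
"""

print(mounts_updating_atime(SAMPLE, fstypes={"zfs"}))
```

On a live system the same function could be fed the contents of `/proc/mounts`; any volume it lists still performs a metadata write on reads, subject to `relatime` behaviour.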
Causes of I/O issues on off2 have been identified:
I propose to close this issue and open more specific new ones if needed. |
Running some scripts on off2 sometimes takes much more time than usual. I have been testing some import scripts, and everything seems sluggish compared to off1.
e.g. to start some Perl processes, it sometimes takes 15 seconds, other times 2 minutes:
[5 minutes later]
I'm guessing in the first run, we were just waiting for the disks for 2 minutes.
The same thing happened with systemctl start apache2@off, which timed out once and then started successfully on the next attempt.