Create a benchmark suite for storage drivers #7300

Open
DemiMarie opened this issue Feb 23, 2022 · 9 comments
Labels
C: storage P: default Priority: default. Default priority for new issues, to be replaced given sufficient information.

Comments

@DemiMarie

How to file a helpful issue

Qubes OS release (if applicable)

N/A?

Brief summary

There needs to be an answer to whether LVM or BTRFS storage pools are faster.

@crat0z

crat0z commented Mar 9, 2022

I have recently upgraded SSDs, so I needed to reinstall Qubes. Seeing this issue, I figured I should try out both LVM thin pools and btrfs. However, I don't really know what I'm doing, so I hope the data I collected is of some help or interest.

For reference, I have a Ryzen 5 4650U, 16 GB of RAM, and a 970 EVO Plus. Both installations were default, with encryption enabled. After installation, I simply installed kdiskmark in both dom0 and the Fedora template; I did not otherwise update.

To benchmark, I used kdiskmark in dom0 and in an AppVM with the "peak performance" and "real world performance" presets. For qvm-copy-to-vm, I changed the policy to not ask before copying, and I copied a 40 GB file of zeros and 10,000 files (4 MB each) of zeros, then did the same with random data. Lastly, I timed the start-up of a VM 25 times.
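(Not part of my original runs, but for anyone wanting to repeat the non-kdiskmark steps, a rough sketch is below. The AppVM name "testvm" is a placeholder; adjust names and sizes to taste.)

```bash
#!/bin/bash
# Rough sketch of the copy and start-up timing steps described above.
# "testvm" is a placeholder AppVM name; sizes match the description (40 GB + 10,000 x 4 MB).

# Generate the zero-filled payloads inside the source VM
# (repeat with /dev/urandom instead of /dev/zero for the random-data variant).
dd if=/dev/zero of=big-zero.img bs=1M count=40960
mkdir -p small-zero
for i in $(seq 1 10000); do
    dd if=/dev/zero of=small-zero/f$i bs=1M count=4 status=none
done

# Time the inter-VM copies (qrexec policy must already allow them without asking).
time qvm-copy-to-vm testvm big-zero.img
time qvm-copy-to-vm testvm small-zero/

# From dom0: time VM start-up 25 times.
for i in $(seq 1 25); do
    time qvm-start testvm
    qvm-shutdown --wait testvm
done
```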

Results

thin pools

(Don't mind the error in the terminal; I forgot to rename the folder.)
[Screenshot: kdiskmark results for the thin pool install (Screenshot_2022-03-08_16-29-21)]

real 5.05 user 0.03 sys 0.01
real 4.88 user 0.03 sys 0.00
real 4.82 user 0.03 sys 0.01
real 4.86 user 0.03 sys 0.01
real 4.83 user 0.03 sys 0.00
real 5.04 user 0.03 sys 0.01
real 4.80 user 0.03 sys 0.01
real 4.61 user 0.03 sys 0.01
real 4.84 user 0.03 sys 0.00
real 4.91 user 0.03 sys 0.01
real 4.84 user 0.03 sys 0.01
real 4.73 user 0.03 sys 0.01
real 4.91 user 0.03 sys 0.00
real 4.66 user 0.03 sys 0.00
real 4.87 user 0.03 sys 0.00
real 4.94 user 0.03 sys 0.01
real 4.82 user 0.03 sys 0.00
real 4.89 user 0.03 sys 0.01
real 4.76 user 0.03 sys 0.00
real 4.87 user 0.03 sys 0.01
real 4.85 user 0.02 sys 0.01
real 4.86 user 0.03 sys 0.00
real 4.86 user 0.03 sys 0.00
real 4.77 user 0.03 sys 0.01
real 4.94 user 0.03 sys 0.00

btrfs

[Screenshot: kdiskmark results for the btrfs install (Screenshot_2022-03-08_14-09-03)]

real 4.59 user 0.03 sys 0.01
real 4.75 user 0.03 sys 0.01
real 4.71 user 0.03 sys 0.00
real 4.64 user 0.03 sys 0.01
real 4.72 user 0.03 sys 0.00
real 4.49 user 0.03 sys 0.01
real 4.59 user 0.02 sys 0.01
real 4.74 user 0.03 sys 0.00
real 4.64 user 0.03 sys 0.00
real 4.75 user 0.03 sys 0.01
real 4.82 user 0.03 sys 0.00
real 4.76 user 0.03 sys 0.00
real 4.64 user 0.03 sys 0.01
real 4.89 user 0.03 sys 0.01
real 4.88 user 0.03 sys 0.00
real 4.76 user 0.03 sys 0.01
real 4.80 user 0.03 sys 0.00
real 4.64 user 0.03 sys 0.00
real 4.69 user 0.03 sys 0.01
real 4.65 user 0.03 sys 0.00
real 4.76 user 0.03 sys 0.00
real 4.61 user 0.03 sys 0.01
real 4.82 user 0.03 sys 0.01
real 4.83 user 0.03 sys 0.00
real 4.68 user 0.02 sys 0.01

Observations

Not sure what to make of the btrfs dom0 results. Running kdiskmark in both dom0 and an AppVM stresses the SSD a lot: one sensor would report 50-60°C and another 60-75°C, so I could be thermal throttling. However, from my brief reading online, that should only reduce performance by about 20%. As far as I recall, qvm-copy is and always has been CPU-bound, so those results might not be as meaningful.
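(If anyone wants to rule out thermal throttling: assuming an NVMe drive and the nvme-cli package, neither of which is a given on the setup above, the controller's reported temperature can be watched while the benchmark runs.)

```bash
# Poll the NVMe SMART log every 5 seconds and show the temperature lines.
# Requires nvme-cli; adjust the controller path for your system.
watch -n 5 'sudo nvme smart-log /dev/nvme0 | grep -i temperature'
```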

@DemiMarie
Author

Looking at the benchmark results, it appears that in the real-world test BTRFS is faster for everything except sequential writes, while LVM2 has better results across the board in the peak performance test. I believe one cause of the weird behavior is that Qubes OS does not use direct I/O for the loop device, which causes all sorts of performance problems.
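(For reference, direct I/O on loop devices can be inspected and toggled with losetup; this is only a generic illustration with placeholder device and file names, not the actual change Qubes' block scripts would need.)

```bash
# Show existing loop devices, their backing files, and whether direct I/O (DIO) is on.
losetup --list --output NAME,BACK-FILE,DIO

# Enable direct I/O on an already-attached loop device (placeholder device number).
sudo losetup --direct-io=on /dev/loop0

# Or attach a backing file with direct I/O enabled from the start.
sudo losetup --direct-io=on --find --show /path/to/volume.img
```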

@tlaurion
Contributor

tlaurion commented Mar 21, 2022

@DemiMarie another thread was opened on the forum; the use case there is simply re-encrypting LUKS, where I saw speeds varying from 50 MiB/s to 150 MiB/s on a commodity SSD.

This is linked to the assumptions the different tools make from what the hardware reports (disk block size), the partition table, partition alignment, and the block size of the created partition.

To summarize, merely changing the LUKS sector size at the luksFormat step has a big impact on all performance from there on.
The basic culprits seem to be hardware not reporting its real physical block size, cryptsetup 2.4.0 being the first version to detect this properly (we don't ship it), and other filesystem tools making their own non-optimized decisions at initial partitioning. Some automated tests are being done there to validate whether aligning sector size, block size, erase block size, and partition boundaries improves speed by a large factor. One user reported a 2x speed improvement from manually aligning partitions with the erase block size on a TLC-based SSD.

The point is not necessarily to go advanced, but rather that trusting the hardware to report the right thing (i.e. a 512-byte block size nowadays) should at minimum be followed by testing whether 4096 is actually better before accepting the reported value as the truth.
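(As a sketch of the kind of minimal test meant here: LUKS2 lets you force the sector size at format time, so 512 and 4096 can be benchmarked against each other. The device below is a placeholder scratch partition, and luksFormat is destructive, so this must never be run on a disk holding data.)

```bash
# DANGER: luksFormat destroys everything on the target. Use a scratch partition only.
DEV=/dev/sdX2   # placeholder

# Format once with 512-byte sectors, benchmark, then repeat with 4096-byte sectors.
sudo cryptsetup luksFormat --type luks2 --sector-size 512  "$DEV"
sudo cryptsetup luksFormat --type luks2 --sector-size 4096 "$DEV"

# For comparison, what the kernel says the device reports.
cat /sys/block/sdX/queue/logical_block_size
cat /sys/block/sdX/queue/physical_block_size
```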

The thread is here and needs more testing: https://forum.qubes-os.org/t/ssd-maximal-performance-native-sector-size-partition-alignment

@DemiMarie
Author

@marmarek is there any chance we could ship a more recent version of cryptsetup?

@tlaurion
Contributor

tlaurion commented Mar 21, 2022

@DemiMarie @marmarek cryptsetup 2.4.0 alone might not be enough; testing is needed. The real problem is devices lying about their block sizes in what they report, which can lead cryptsetup (and the other tools, from partition table creation through LUKS partition alignment down to the filesystem, LVM or otherwise) to the same wrong decision.

Misalignment results in more blocks needing to be read/written, and in the SSD firmware having to compensate and touch regions it otherwise would not need to. That produces speed differences easily observable with cryptsetup-reencrypt, but filesystem benchmarks that don't fill the SSD cache will not show the consequences.

@DemiMarie
Author

@DemiMarie @marmarek cryptsetup 2.4.0 alone might not be enough; testing is needed. The real problem is devices lying about their block sizes in what they report, which can lead cryptsetup (and the other tools, from partition table creation through LUKS partition alignment down to the filesystem, LVM or otherwise) to the same wrong decision.

Sadly I am not aware of any solution other than fixing each of these tools separately 😞. Is there any way to find out the erase block size?
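(For what it's worth, what the kernel exposes can at least be inspected; the erase block size itself is typically only reported for eMMC/MMC devices, not for SATA/NVMe SSDs. The device names below are placeholders.)

```bash
# Logical/physical sector sizes and I/O geometry as the kernel sees them.
lsblk -o NAME,LOG-SEC,PHY-SEC,MIN-IO,OPT-IO /dev/nvme0n1

# The same values straight from sysfs.
cat /sys/block/nvme0n1/queue/logical_block_size
cat /sys/block/nvme0n1/queue/physical_block_size

# eMMC/MMC devices expose a preferred erase size; most SSDs do not report one at all.
cat /sys/block/mmcblk0/device/preferred_erase_size 2>/dev/null || echo "not reported"
```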

@crat0z

crat0z commented Mar 26, 2022

After spending some time on btrfs, I cannot use it myself. It could be related to discards, as mentioned in one of the other issues which I've since lost track of, but btrfs has a tendency to just murder I/O on my laptop. VMs become completely unresponsive if the drive is doing something for long enough.

For example, I tried syncing the Monero blockchain, and once I hit about 75 Mbps average download speed, VMs would just start hanging intermittently. Terminals don't respond, new processes won't start, and Qubes services in dom0 freeze too.

I also tried enabling direct I/O in /etc/xen/scripts/block, and unfortunately it didn't help the situation. I did not keep the results, but they were pretty much identical to before.

@rustybird

I tried syncing the Monero blockchain, and once I hit about 75 Mbps average download speed, VMs would just start hanging intermittently. Terminals don't respond, new processes won't start, and Qubes services in dom0 freeze too.

Try running filefrag on the private.img if you still have it around. monerod causes some truly monstrous fragmentation, easily tens of millions of extents after a long sync. (It's also the only blockchainy thing I've noticed corrupting its own data on crash.)

btrfs filesystem defragment private.img can do wonders here. Unfortunately it blows up the space used for shared data, so you want to get rid of all that first. Basically: shut down the VM, delete private-precache.img, delete all private.img.*@* revisions, delete relevant subvolume snapshots if you have any, consider deleting any clones of the VM, and then run defrag (rough sketch below).
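(A rough sketch of that sequence, assuming the default Btrfs pool layout under /var/lib/qubes and a VM named "myvm"; both are assumptions, so double-check the paths before deleting anything.)

```bash
# Run from dom0. Shut the VM down first.
qvm-shutdown --wait myvm

cd /var/lib/qubes/appvms/myvm

# See how fragmented the private volume currently is.
sudo filefrag private.img

# Remove data that shares extents with private.img, so defragmenting it
# doesn't balloon disk usage: the precache file and any retained revisions.
sudo rm -f private-precache.img
sudo rm -f private.img.*@*

# Defragment the private volume and re-check.
sudo btrfs filesystem defragment private.img
sudo filefrag private.img
```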

@tlaurion
Contributor

Note progress of QubesOS/qubes-core-admin#649
