diff --git a/content/metadata-rework.md b/content/metadata-rework.md new file mode 100644 index 0000000..33c8a73 --- /dev/null +++ b/content/metadata-rework.md @@ -0,0 +1,73 @@ ++++ +title = "Metadata rework" +date = 2024-07-25 +weight = 39 +template = "page.html" +render = true ++++ + +*John Baublitz, Stratis Team* + +Until now, Stratis has had a metadata format that, with a few exceptions, works +remarkably well for our current feature set. However, as we plan for larger features +like RAID, integrity, and online reencryption, we've noticed that we require some +reworks to the order of our devicemapper devices in the storage stack and the addition +of space reservation ahead of time for the metadata of new devicemapper layers. The +metadata rework has been a large effort to pave the way for Stratis moving forward and +will provide the ability for additional flexibility in enabling and disabling certain layers +after pool creation time once the features are added. + +### Changes to encryption + +The modifications to how we handle encryption might be the single largest difference +between V1 and V2 of the metadata. The crypt device now sits above the Stratis metadata +and where RAID and integrity will be in the stack. + +Let's dive into the practical applications of this. + +#### Switch from full disk encryption + +V1 of the metadata provided full disk encryption, but V2 will not. Data, cache, and +thinpool/filesystem metadata will all still be encrypted, but integrity metadata, RAID +metadata, and the Stratis metadata, which is generally just a record of the devicemapper +tables for the Stratis pool, will not be encrypted. + +We discussed this at length with the cryptsetup maintainers and the security team and the +general consensus is that leaving this data unencrypted will not result in leaking any +information about the encrypted data stored above it in the stack. + +#### Higher position in the stack + +The switch of encryption's position in the stack also has a surprising number of benefits. + +Firstly, the operation time on encrypted pools no longer scales linearly according to the +number of devices. Because Stratis can add additional devices to a single pool, FDE required +encrypting each new device with a separate crypt device which resulted in having to pass +through PBKDF once for each device. This increased execution times for operations like unlock +based on how many devices were in the pool. V2 of the metadata has constant time execution time +for each encryption operation that needs to go through PBKDF. + +Secondly, cache is encrypted without additional work simply by nature of encryption being layered +on top of the caching layer. This simplifies the cache handling and requires no special handling +for encrypting cached data. + +Thirdly, this will avoid the problem we would have run into with RAID of encryption amplification. +Because previously each leg would have a separate crypt device, the CPU load of encryption would +have been multiplied by the number of legs in the RAID array. This would result in reduced throughput +due to too much CPU load encrypting each leg separately. + +### Addition of metadata space reservation + +In V2 of the metadata layout, we've begun reserving space for the crypt header, integrity metadata and md-raid +metadata. The end result of this is that moving forward, we will increasingly be able to toggle certain features +on and off after pool creation time. Work has already begun on supporting this for encryption, and we intend to +do this at least for integrity as well. We are still discussing the best way to support this for RAID given +certain restrictions in the case of converting from a multi-disk pool to a RAID array. + +### Backports + +Moving forward, we intend to port all features that are compatible with V1 of the metadata layout back to V1, but +for features that cannot be backported like turning encryption on and off due to the differences outlined here, +we recommend that users migrate data to a V2 pool to take advantage of these upcoming features. + +Please let us know what you think of the new metadata layout in a Github issue!