Motivation and Context
The earlier implementation of `zfs rewrite`, for simplicity, updated the logical birth times of all rewritten blocks. This makes them look modified from the perspective of replication, snapshot diffs, etc., even though the actual user data remains the same. While some people found this useful for recovering corrupted remote backups, for the majority the replication of large amounts of logically unchanged blocks can be a huge waste of time and resources.
Description
This PR implements a new variation of rewrite, called "physical rewrite", controlled by the new `-P` argument to the `zfs rewrite` subcommand. When possible, it keeps logical birth times unchanged. This makes it possible to distinguish blocks that were merely relocated within a pool from blocks that were actually modified by users. While the former may occupy additional disk space due to snapshots, block cloning, etc., which should be accounted as such, they should be ignored by replication and similar consumers.
Previously, block pointers had physical birth times greater than their logical birth times only as a result of the device removal remap process. In that case, however, space usage accounting was still based on the block's logical birth time. Since physical rewrites require space reallocation that is accounted based on physical birth times, this PR introduces a new "R"/"rewrite" flag in the block pointer structure to differentiate the two cases. When set, it means the block's space accounting should use the physical birth time instead of the traditional logical birth time. Since read-only pool imports do not really care about space accounting, the new per-dataset pool feature "physical_rewrite" gating this is declared as read-compatible. The feature is activated on first use and deactivated when the last of the affected datasets is deleted.
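The accounting rule described above can be sketched in a few lines of C. This is only an illustration: `toy_bp_t` and `toy_bp_accounting_birth()` are hypothetical names invented here, not the real `blkptr_t` layout or any function from this PR, and the actual encoding of the "R" flag is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a block pointer.
 * It is NOT the real blkptr_t; field names are assumptions.
 */
typedef struct toy_bp {
	uint64_t logical_birth;   /* TXG in which the data last changed */
	uint64_t physical_birth;  /* TXG in which the block was last allocated */
	bool rewrite;             /* models the new "R"/"rewrite" flag */
} toy_bp_t;

/*
 * Birth time that space accounting would use for this block:
 * physical birth when the rewrite flag is set, logical birth otherwise
 * (including the device removal remap case, where physical > logical
 * but the flag is clear).
 */
static uint64_t
toy_bp_accounting_birth(const toy_bp_t *bp)
{
	return (bp->rewrite ? bp->physical_birth : bp->logical_birth);
}
```

In this model, a block remapped by device removal and a physically rewritten block can carry identical birth times and differ only in the flag, which is what lets accounting tell them apart.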
There are two exceptions where the logical birth time might still be modified during a physical rewrite:
Now that block pointers carry distinct birth times, the traversal code gets a new `TRAVERSE_LOGICAL` flag, which allows choosing between traversing only logical changes (replication, diff, etc.) and traversing physical changes (scrub/resilver, dataset destroy, etc.).
How Has This Been Tested?
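The behaviour the manual test looks for can be modelled roughly as follows. With a logical traversal (as replication would use) a block that was only relocated is skipped, while a physical traversal (as scrub would use) still visits it. This is a toy model under stated assumptions: only the name `TRAVERSE_LOGICAL` comes from the PR, while `toy_blk_t`, `TOY_TRAVERSE_LOGICAL`, its value, and the "birth greater than minimum TXG" filter are hypothetical simplifications.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical block record; not a real ZFS structure. */
typedef struct toy_blk {
	uint64_t logical_birth;   /* TXG of the last logical change */
	uint64_t physical_birth;  /* TXG of the last allocation */
} toy_blk_t;

/* Assumed flag value, named after TRAVERSE_LOGICAL for illustration only. */
#define	TOY_TRAVERSE_LOGICAL	0x1

/* Decide whether a block is newer than min_txg for the chosen traversal. */
static bool
toy_should_visit(const toy_blk_t *b, uint64_t min_txg, int flags)
{
	uint64_t birth = (flags & TOY_TRAVERSE_LOGICAL) ?
	    b->logical_birth : b->physical_birth;
	return (birth > min_txg);
}

/* Count the blocks a traversal starting after min_txg would visit. */
static int
toy_count_visited(const toy_blk_t *blks, int n, uint64_t min_txg, int flags)
{
	int visited = 0;
	for (int i = 0; i < n; i++) {
		if (toy_should_visit(&blks[i], min_txg, flags))
			visited++;
	}
	return (visited);
}
```

With one untouched block, one physically rewritten block, and one block modified by a user, the logical traversal in this model visits only the modified block, whereas the physical traversal visits both the rewritten and the modified one.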
Several successful CI runs. Manual testing with `zfs rewrite` and `zfs rewrite -P` vs `zfs send -i`.
Types of changes
Checklist: