Feature Request: Add a command to purge one of the branch of a mergerfs pool #146

Open
donmor opened this issue Apr 23, 2024 · 4 comments

@donmor

donmor commented Apr 23, 2024

Assume one of these situations applies:

  • You want to move a disk holding a mergerfs branch to another machine without breaking the pool
  • A disk holding a mergerfs branch reports a SMART error and needs to be replaced as soon as possible
  • Or you just want to shrink the pool

Then we need to purge all files from the affected branches. I suggest a tool like:

usage: mergerfs.purge [<options>] <branch>...

This tool duplicates all files on the specified branches to other branches and
then, if -e is specified, deletes the original files.

positional arguments:
  branch                    branch to purge

optional arguments:
  -A, --allocate      Try to allocate space before copying files. Aborts
                      if there isn't enough space to perform the action.
  -C, --conflicts=    Specify which file to keep if a file with same path
                      and different hash is on target branch. (default: lost)
                      * source   : file from the source
                      * dest     : file on the dest
                      * newer    : file with larger mtime
                      * older    : file with smaller mtime
                      * smaller  : file with smaller size
                      * larger   : file with larger size
                      * mergerfs : file chosen by mergerfs' getattr
                      * lost     : put file from the source to lost+found
                      * ask      : prompt user to make a choice
  -f, --force         Force reading unreadable files and put fragments in
                      lost+found.
  -F, --include-frags Include lost+found on the source branch.
  -e, --execute       Execute `rm` commands as well as print them.
  -u, --umount        Unmount path to the source branch if it is a mount
                      point.
  -h, --help          Print this help.
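
For illustration, the option set above could be declared with Python's `argparse` roughly as follows. This is only a sketch of the proposed interface: `mergerfs.purge` does not exist, and `build_parser` and `CONFLICT_POLICIES` are hypothetical names. It parses arguments and does nothing else.

```python
import argparse

# Conflict policies taken from the proposed help text above.
CONFLICT_POLICIES = [
    "source", "dest", "newer", "older", "smaller",
    "larger", "mergerfs", "lost", "ask",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the proposed (hypothetical) mergerfs.purge CLI."""
    parser = argparse.ArgumentParser(
        prog="mergerfs.purge",
        description="Duplicate all files on the given branches to other "
                    "branches and, with -e, delete the originals.",
    )
    parser.add_argument("branch", nargs="+", help="branch to purge")
    parser.add_argument("-A", "--allocate", action="store_true",
                        help="try to allocate space before copying files")
    parser.add_argument("-C", "--conflicts", choices=CONFLICT_POLICIES,
                        default="lost",
                        help="which file to keep on a path/hash conflict")
    parser.add_argument("-f", "--force", action="store_true",
                        help="force reading unreadable files")
    parser.add_argument("-F", "--include-frags", action="store_true",
                        help="include lost+found on the source branch")
    parser.add_argument("-e", "--execute", action="store_true",
                        help="execute rm commands as well as print them")
    parser.add_argument("-u", "--umount", action="store_true",
                        help="unmount the source branch if it is a mount point")
    # -h/--help is added by argparse automatically.
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```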

P.S.: There's a PR (#105) doing a similar thing, but it was revoked.

@trapexit
Owner

trapexit commented Apr 23, 2024

I don't believe such a tool is a good idea. Particularly for point 2.

Bad drives need to be handled carefully. You should not be doing anything with the device more than necessary. You should not remove files from it before you are finished copying data for instance. You should remove it from the pool first so there aren't unnecessary reads or writes occurring. You should mount it read only to ensure no writes. All of which are in inherent conflict with the idea of keeping the pool whole while the process is going on.

And if you want to remove a healthy device it is trivial. Create a new temporary pool and don't include the source branch. Then rsync like normal. At most... such a tool would simply automate the creation of the temp pool and kick off an rsync. But even then you might want to set the branch RO or filesystem ro so as to not write new data and you really shouldn't remove anything before it is done copying just in case there is some incident. And at the end if you want it cleared it is faster to just format it.

Given the level of nuance, having an app wrap it without that nuance is risky.
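
As a rough sketch of the healthy-device case described above (temporary pool without the source branch, then rsync), the two commands could be assembled like this. `plan_evacuation` is a hypothetical helper, the paths are examples, and the rsync flags (`-aHAX`) are one reasonable choice rather than a recommendation from this thread. The function only builds the commands; it mounts, copies, and deletes nothing.

```python
import shlex


def plan_evacuation(branches, source, temp_mount, mount_options=None):
    """Return [mount_cmd, rsync_cmd] for moving `source` into a temporary pool.

    The temporary pool contains every branch except `source`, so mergerfs
    itself decides where the copied files land.
    """
    if source not in branches:
        raise ValueError(f"{source} is not one of the pool's branches")
    remaining = [b for b in branches if b != source]
    if not remaining:
        raise ValueError("no branches left to receive the data")

    mount_cmd = ["mergerfs"]
    if mount_options:
        # e.g. the create policy the user normally runs the pool with
        mount_cmd += ["-o", ",".join(mount_options)]
    mount_cmd += [":".join(remaining), temp_mount]

    # Trailing slashes: copy the contents of the source branch into the pool.
    rsync_cmd = ["rsync", "-aHAX",
                 source.rstrip("/") + "/", temp_mount.rstrip("/") + "/"]
    return [mount_cmd, rsync_cmd]


if __name__ == "__main__":
    plan = plan_evacuation(
        ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3"], "/mnt/disk2", "/mnt/tmp-pool"
    )
    for cmd in plan:
        print(shlex.join(cmd))
```

Nothing is removed from the source here, in line with the point that files should stay in place until the copy has finished and been checked.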

@donmor
Author

donmor commented Apr 24, 2024

> And if you want to remove a healthy device it is trivial. Create a new temporary pool and don't include the source branch. Then rsync like normal. At most... such a tool would simply automate the creation of the temp pool and kick off an rsync. But even then you might want to set the branch RO or filesystem ro so as to not write new data and you really shouldn't remove anything before it is done copying just in case there is some incident. And at the end if you want it cleared it is faster to just format it.

We could let the tool handle more: first, remount the underlying FS ro as root (or make the branch ro if the FS cannot be remounted ro); then dup all files; then remount the pool to drop the source branch; finally, if necessary, format the drive (or remount rw and delete the files if it is not a mount point).
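
To make the "make the branch ro, then drop it" steps concrete, here is a small sketch that only manipulates a branches string of the form `path=MODE:path=MODE`. The helper names are hypothetical, and the sketch assumes plain paths (no globs, no `=` or `:` inside a path). Applying the resulting string to a live pool is deliberately left out; how to do that at runtime should be checked against the mergerfs documentation.

```python
def parse_branches(spec):
    """Parse 'path=MODE:path=MODE' into an ordered list of (path, mode).

    A branch without an explicit mode is treated as RW in this sketch.
    """
    result = []
    for entry in spec.split(":"):
        path, sep, mode = entry.partition("=")
        result.append((path, mode if sep else "RW"))
    return result


def format_branches(branches):
    """Inverse of parse_branches: join (path, mode) pairs back into a string."""
    return ":".join(f"{path}={mode}" for path, mode in branches)


def set_mode(branches, target, mode):
    """Return a copy of `branches` with `target` switched to `mode`."""
    if target not in [path for path, _ in branches]:
        raise ValueError(f"{target} is not a branch of this pool")
    return [(path, mode if path == target else old) for path, old in branches]


def drop_branch(branches, target):
    """Return a copy of `branches` without `target`."""
    return [(path, mode) for path, mode in branches if path != target]


if __name__ == "__main__":
    pool = parse_branches("/mnt/disk1=RW:/mnt/disk2:/mnt/disk3=RO")
    frozen = set_mode(pool, "/mnt/disk2", "RO")    # stop new files landing here
    print(format_branches(frozen))
    print(format_branches(drop_branch(frozen, "/mnt/disk2")))  # after copying
```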

BTW, why make a temporary pool? I think it can be done by dupping every file on the branch that has no identical copy on the other branches, while keeping the branch ro. Just like mergerfs.dup --prune, but always removing from that specific branch.
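
A minimal sketch of the "no identical copy on other branches" check, assuming branches are plain directories and "identical" means same relative path and same bytes. `files_needing_copy` is a hypothetical name; this is not how mergerfs.dup works internally, and it only reads, never copies or deletes.

```python
import filecmp
import os


def files_needing_copy(source, others):
    """Yield relative paths under `source` that have no byte-identical
    file at the same relative path on any branch in `others`."""
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source)
            has_copy = any(
                os.path.isfile(os.path.join(other, rel))
                # shallow=False forces a content comparison, not just stat()
                and filecmp.cmp(src, os.path.join(other, rel), shallow=False)
                for other in others
            )
            if not has_copy:
                yield rel


if __name__ == "__main__":
    for rel in files_needing_copy("/mnt/disk2", ["/mnt/disk1", "/mnt/disk3"]):
        print(rel)
```

Note that this says nothing about where each file should go, which is the placement decision the temporary pool hands back to mergerfs.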

@trapexit
Owner

> BTW, why making temporary pool?

Because people like to control how the files get created and it would be silly to replicate all of mergerfs' behaviors in a tool when you can simply use mergerfs to manage placement like it is designed to do.

@trapexit
Owner

Remounting the filesystem live will almost certainly lead to problems if in active use. You'd need to set the branch RO and then wait for all new writes to stop by constantly scanning the filesystem for open files and then hope no new out of band calls come in and then you can safely'ish remount as read only.
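
To give an idea of what "scanning the filesystem for open files" involves, here is a Linux-only sketch that walks `/proc/<pid>/fd` and reports descriptors pointing under a given path. `open_files_under` is a hypothetical helper and `proc_root` is a parameter added for testability. It sees only what the calling user is allowed to read, does not cover memory-mapped files, and is a point-in-time snapshot, so a new open can arrive right after it returns, which is exactly the race described above.

```python
import os


def open_files_under(path, proc_root="/proc"):
    """Return {pid: [open file paths]} for descriptors that point under `path`."""
    # Trailing separator so /mnt/disk2 does not also match /mnt/disk20.
    prefix = os.path.join(os.path.realpath(path), "")
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while scanning
            if target.startswith(prefix):
                found.setdefault(int(entry), []).append(target)
    return found


if __name__ == "__main__":
    for pid, paths in open_files_under("/mnt/disk2").items():
        print(pid, paths)
```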
