Create tool to vacate drive #55
Comments
I suppose a tool could be created. I've always just created a second pool which excludes the drive in question and then rsynced into that pool. That allows for picking the specific policy, which may be necessary: you probably want to keep the existing policy while you continue to use the main pool, but want data spread differently when writing the vacated files. That becomes far more complicated with an app since it'd need to replicate all that built-in logic. mergerfs.balance ignores the pool's policies and just rsyncs files around underneath mergerfs.
I don't know enough about how everything works, but here is one thought. What if the vacate tool was a variant of balance that created large zero-fill temp files on the drive to be vacated which would be skipped during the balance phase. Then the cycle would look like this:
One could probably do this manually, but having it all in one tool would be slick. Maybe not the best approach, but would it work?
I really don't follow. The balance tool works independently of mergerfs, aside from querying it for the drives. It completely ignores the policies the pool has. It's a simple way to redistribute files when you don't care at all which drive they end up on. Vacating a drive is different. At least it always is for me. I more often than not want to obey the policies, which means I need to use mergerfs to make the decisions. That can only be done by creating another mount and moving files into it from the drive in question. If you don't care where files end up, then changing the balance tool to use the drive as the source, rather than the drive with the most used space, is easy. But in my opinion so is removing the drive from the pool and running rsync. My point is that creating a tool to do something that is somewhat custom for each user and not all that difficult to do by oneself means creating a somewhat complicated tool. I'm not opposed to creating a tool. I'm just not clear on the specific use case.
I'm trying to find a way to do it while "online". Removing the drive from the pool effectively takes that data "offline" until the rsync is done. |
That's why I said use a second pool with the drive missing then just rsync between the drive in question and the new pool.
Let me see if I understand:
No. Not rsync from pool1 to pool2. Rsync from excluded drive to pool2.
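The workflow being described might look something like the following sketch. The mount points, create policy, and rsync flags here are assumptions for illustration, not a definitive recipe:

```shell
# Hypothetical layout: main pool at /mnt/pool over /mnt/disk1..disk3,
# and the goal is to vacate /mnt/disk2 while /mnt/pool stays online.

# 1. Mount a second, temporary pool that excludes the drive being
#    vacated, with whichever create policy should govern placement.
mergerfs -o category.create=mfs /mnt/disk1:/mnt/disk3 /mnt/pool2

# 2. Copy from the excluded drive itself (not from the main pool) into
#    pool2; mergerfs's create policy decides the destination per file.
rsync -avHAXS /mnt/disk2/ /mnt/pool2/

# 3. Once the copy is verified, delete the originals from /mnt/disk2
#    and tear down the temporary pool. The main pool served the files
#    the whole time.
fusermount -u /mnt/pool2
```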
This is generally documented in my extra docs: https://github.com/trapexit/backup-and-recovery-howtos/blob/master/docs/recovery_(mergerfs).md That's to replace a drive straight up. If you want files distributed in some particular way, that requires the second pool.
Thanks. I missed that because I didn't think of it as recovery. |
I would like to vote for this as well. I've read the link above and understand how to add another drive and move data over. I suspect that to just "vacate" a drive, I would:
I think the OP's original thought would be for mergerfs itself to handle this if you remove the drive, like an option or flag to move all the data. However, I also see the author's point: that mergerfs is just a proxy service of sorts, and the tools here in this repo exist for more out-of-band functionality. One could write a simple tool (and PR it here) for exactly that, maybe. I think it would come down to this: If But right now, reading the
Disclaimer: I just found out about
I'm doing it right now.
During the process, part of my data is missing. A tool to empty a disk in a pool onto the other disks of the pool could be helpful.
The biggest thing that needs to be done to make this practical is the ability to get all of the pool's options and build a list of them compatible for mounting, since the only good way to manage this behavior is to create a second, temporary pool without the drive in question and then rsync. Shouldn't be hard; it just needs to be done.
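That option-gathering step could be sketched roughly like this. This is an assumption-laden illustration: mergerfs exposes runtime configuration through xattrs on the hidden `.mergerfs` control file at the pool root, and the pool paths, mount points, and `category.create=mfs` policy below are made up for the example.

```shell
#!/bin/sh
# drop_branch LIST BRANCH: remove one branch (with any =RO/=RW/=NC
# mode suffix) from a colon-separated mergerfs branch list.
drop_branch() {
  echo "$1" | tr ':' '\n' | grep -Ev "^$2(=|\$)" | paste -sd: -
}

# Hypothetical usage against a live pool:
#   branches=$(getfattr --only-values -n user.mergerfs.branches /mnt/pool/.mergerfs)
#   mergerfs -o category.create=mfs "$(drop_branch "$branches" /mnt/disk2)" /mnt/pool2

drop_branch "/mnt/disk1=RW:/mnt/disk2=RW:/mnt/disk3" /mnt/disk2
# prints: /mnt/disk1=RW:/mnt/disk3
```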
Hello! Well, I understand the need here. Being a long-time user of Stablebit DrivePool on Windows, currently in the process of moving to Linux with MergerFS, it's indeed a solution that would be awesome! In Stablebit DrivePool you have a list of your disks, you click "remove" on one of them, and the software empties your drive (checking at the same time that there's enough space on the others, of course), then runs the process of moving all the files off the disk you want to remove. In the meanwhile all files remain available during the whole process, ensuring a kind of 99.999% uptime for your storage system. If that's not possible, I would say that at least the process described here...
...should be managed by MergerFS. Because let's say someone or some process is writing to pool1 during step 2) and MergerFS then writes to the disk you're trying to remove? In the end only MergerFS could say "I'm in the middle of moving files, so I'm locking this disk against writing/adding files; it's read-only now." I know that doing a small rsync after 2), like "2.5) rsync --append-verify from excluded drive to pool2", could do the trick, but it's not a clean solution IMO, especially on disks where files are written very often. Btw @trapexit, I know you've heard it a million times, but I'll tell you: MergerFS is awesome and well written! That's why we want to add features to it, the "price of success" :-)
As do I. I'm in the process of doing this very thing and have done so many times over the years. The issue is that 1) mergerfs itself does not have any active logic like Stablebit does. It does not actively monitor filesystems and act. It is entirely reactive. So it would require building all that logic. And 2) I believe it is risky and bad practice to do moves like that. Often people vacate drives that are under duress or damaged. Removing data from such a drive is riskier than simply copying it.
You're mistaken about the workflow. You absolutely can keep mergerfs from writing to drives. That's what the "RO" setting for branches is for. You update the main pool and set the branch RO. And if you're concerned about writes to existing files, then remount the underlying filesystem read-only. Having mergerfs attempt to synchronously manage this workflow just doesn't make sense to me. It is perfectly straightforward to do out of band and trivial to automate. I've just not gotten to it, as there are a bunch of other things I've been working on and the process is easy enough to do manually.
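A minimal sketch of that branch-mode approach, assuming a pool at /mnt/pool over hypothetical /mnt/disk1-3 (the `=RO` branch suffix and the `.mergerfs` control file are mergerfs runtime-config features; adjust the names to your setup):

```shell
# Mark /mnt/disk2 read-only within the live pool so mergerfs stops
# creating new files on it; existing files stay readable through the pool.
setfattr -n user.mergerfs.branches \
         -v '/mnt/disk1=RW:/mnt/disk2=RO:/mnt/disk3=RW' \
         /mnt/pool/.mergerfs

# If writes to files already on the drive are also a concern, remount
# the underlying filesystem itself read-only.
mount -o remount,ro /mnt/disk2
```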
It would be helpful to have a tool that could be used to vacate a drive in a pool to prepare it for removal or replacement. Assuming the pool has sufficient capacity to hold the files from the target drive, the utility would redistribute the files in a similar manner to mergerfs.balance.
If there is already in-built support for this, I'm happy to hear about it.
Thanks!