Feature request: Possibility to store everything in one file instead of one file per folder #22
I'd also welcome such a change; it would make this tool really versatile. In its current state I wouldn't use the application, since I don't want to "litter" my whole data structure with all the files. (It also makes backing up the data slower, since copying many small files is rather slow.) Though in my opinion the approach should be different:
I've updated the FAQ with more details for this question:
That's why I did not consider a central index.
https://github.com/ambv/bitrot takes the centralized approach, but it doesn't seem very actively maintained.
I had an idea for a simple solution that does not require a lot of changes to the code. It's missing some cleanup to remove deleted directories from the index, but you can test the binaries from the prerelease artifacts in https://github.com/laktak/chkbit/actions/runs/12039051476
This places a single
Thanks for still having a look into this. I did a quick test and got the following error:
I've tested with the following command:
I removed the flag from the build process. It should work now, though I can't test on Windows. Please use https://github.com/laktak/chkbit/actions/runs/12052712326
Thanks again, but the exact same error occurs.
Thanks for your help. I was unfamiliar with cgo, but it has to do with cross compiling: Go automatically enables it on the host OS, which is why I never saw this on Linux. I will look for an alternative to sqlite because it won't allow me to cross compile.
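For context on the cgo behaviour mentioned above, a sketch of the Go toolchain defaults (not taken from the chkbit build scripts): cgo is enabled by default for native builds but disabled for cross compiles, which is why a cgo-based sqlite driver works locally yet is unusable in cross-compiled artifacts.

```bash
# native build: cgo is enabled by default, so a cgo-based sqlite driver
# (e.g. mattn/go-sqlite3) builds and works
go build ./...

# cross compile for Windows: cgo is disabled by default; the build may still
# succeed, but a cgo-only sqlite driver cannot be used in the resulting binary
GOOS=windows GOARCH=amd64 go build ./...

# forcing cgo for a cross compile requires a C cross compiler for the target,
# e.g. mingw-w64 here (an example setup, not part of the chkbit build)
CGO_ENABLED=1 CC=x86_64-w64-mingw32-gcc GOOS=windows GOARCH=amd64 go build ./...
```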
To keep the same benefits as above and still get rid of the multiple index files in subdirectories, for a central index we could have 3 index files in the root directory, as listed below, along with some other suggestions:
Some suggested code flow:
If anyone wants to give it a try, I have a new version that can be tested: https://github.com/laktak/chkbit/actions/runs/12089414524
The indexdb will be placed in the directory that is to be checked. There will be backups available and a json export for long term storage.
Not really on topic, and I'm also not sure what you mean by vertical flow, but if you want to view it differently you can use
Thanks for the quick feedback. Good to know that bbolt works on Windows. 1 - The plan is to write to a new database on each run and then, once finished, move the old one to a backup and move the new one in. For notepad - you are asking for formatted JSON. You can get that by running
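A minimal sketch of the rotation described in point 1, with made-up file names (the actual names and mechanics in chkbit may differ): the run writes into a temporary database, and the swap only happens after the run finishes, so an interrupted run can never damage the existing index.

```bash
# placeholder names, for illustration only
DB=".chkbitdb"
NEW=".chkbitdb.new"
BAK=".chkbitdb.bak"

# 1. the run writes all results into a fresh temporary database
: > "$NEW"                           # stands in for "chkbit writes the new index here"

# 2. only after the run completed successfully: old -> backup, new -> current
[ -e "$DB" ] && mv -f "$DB" "$BAK"   # keep the previous index as a backup
mv "$NEW" "$DB"                      # the new index becomes the current one
```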
Of course this is suddenly more work than I thought ... I made a new iteration for testing
https://github.com/laktak/chkbit/actions/runs/12127789704
I mostly finished the db feature (note, the default filename is now
If you'd like to test
If you have feedback, now would be a good time. |
I think these two steps are unnecessarily making things more complex. To let the program know the db path and where we want to do operations, we could simply do this: if both paths are given, the primary path should be considered the path of the db, and the secondary path should be considered where we want to do the operation.
Maybe the command
On the other hand, a separate command to initialise the db is not a big deal. It will be done once in a directory tree that we want to be checksummed. Thank you for the new, very useful feature!
@gstjee I actually thought quite a bit about how to make this easy for the user without having to explain how it works. The database needs to keep relative paths to the files. If you specify the database as a parameter, the user may think they can just move it without any consequence, however this would break all paths. Making it clear that the database is responsible for all files in its subfolders avoids this.

You also can't specify the wrong database file when you have multiple instances, since chkbit will find it automatically (you can run operations on any subfolder). It's also harder to forget to back up the database when it's located in the same path. And when you run an update that specifies just the database folder, I can write to a new database to get rid of obsolete indexes, which makes any maintenance shrink operations unnecessary.

@y653 by making it a separate command the user is informed where the database file is located.
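For illustration, a rough sketch of the automatic lookup described above (this is not chkbit's actual code, and the index file name is only a placeholder): starting from the folder being operated on, walk up the parent directories until the index database is found, so the database path never has to be passed explicitly.

```bash
# placeholder for the index database file name; the real default differs
INDEX_NAME=".chkbitdb"

dir="$(pwd)"
while :; do
    if [ -e "$dir/$INDEX_NAME" ]; then
        echo "using index database in $dir"
        break
    fi
    if [ "$dir" = "/" ]; then
        echo "no index database found in any parent directory" >&2
        break
    fi
    dir="$(dirname "$dir")"
done
```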
I'm a little confused about the expected workflow when using the database. I assumed that since the database keeps relative paths, we should include the database in our backup, and then run:

```bash
SOURCE="/mnt/A/Media/"
DEST="/mnt/B"

# update & verify $SOURCE integrity before backup
chkbit --init-db $SOURCE   # only done once to create db
chkbit --db update $SOURCE

# backup $SOURCE
rsync $SOURCE $DEST

# post backup, verify $DEST in read-only mode
chkbit --db check /mnt/B/Media
```

The check on the backup fails with:

```
Using indexdb in /mnt/B/Media
EXC index: no such device
Processed 0 files in readonly mode.
chkbit ran into errors:
index: no such device
```

Do we need to initialize (--init-db) on the backup as well? I feel like I must be overlooking something obvious; any pointers would be appreciated.

UPDATE: @laktak I may have found the issue. I tried to run It's the first time I have had any issues using mergerfs, so it did not come to mind initially. Any thoughts on why mergerfs + chkbit are not playing nicely?

UPDATE 2: Figured out the issue, although I am not smart enough to fully understand/explain it. My understanding is I may just forgo the db and use the .chkbit files instead.
Thanks for reporting the issue and for the detailed follow-up! I'll make sure to improve the error message when opening the DB. Since the db can be exported to simple JSON, I'll think about allowing checks against the plain JSON instead of the db. Or maybe I can come up with a better solution. I'll need to do some performance tests...
I should add, after more thorough testing with the described workarounds I am not seeing a performance hit. I think there was some funny business going on during my initial tests. And as far as I understand, performance penalties can really vary depending on the system, program implementation, etc.
It would be interesting to see how the simple JSON performs vs the db on various systems. I could also see it being a good option for people using the db on FUSE mounts. There may not be a ton of users yet, but chkbit is such a nicely done tool I could see it getting wide adoption down the road, at least it should!
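If anyone wants to run such a comparison, here is a rough sketch of a benchmark setup; the binary names, commands, and paths below are placeholders, not real chkbit options.

```bash
# placeholder binaries/paths: one build using the JSON index, one using bolt
hyperfine --warmup 1 \
  './chkbit-json check /mnt/data' \
  './chkbit-bolt check /mnt/data'

# plain `time` works too if hyperfine is not available
time ./chkbit-json check /mnt/data
```

Running the same comparison on a local disk and on a FUSE mount would cover both cases mentioned above.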
I wasn't happy with having a database that isn't easily parsable anyway. JSON is better in this case because it's just text and can be read by other tools in the future.
I've pushed an update that switches the database from bolt to a single JSON file. bolt is still used, but only as a cache. If you want to convert the bolt index from the beta to JSON, get the binary from this branch first (updated) and run:
Then switch to the latest binary. @vredesbyyrd please let me know how this performs, thanks.
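One nice consequence of the switch: since the index is plain JSON text, it can be inspected and archived with standard tools even without chkbit. A small sketch, assuming a JSON file at the root of the checked tree (the file names below are just placeholders):

```bash
# placeholder name for the JSON index / export
INDEX=".chkbit-db.json"

# pretty-print for reading
jq . "$INDEX" | less

# compare two exports taken at different times (placeholder file names)
diff <(jq -S . old-export.json) <(jq -S . new-export.json)

# compress a copy for long-term storage
gzip -k -9 "$INDEX"
```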
Very cool. I followed your instructions to convert the BoltDB to JSON with the binary from this link, then switched to the latest binary. It appears that upon running I deleted the old index and re-initialized,
@vredesbyyrd sorry, fixed, please use this to convert
I've pushed an update that removes the Instead you now specify the mode when doing
All works great. And I prefer the new way to define which mode to use.
just released v6 :)
I would like to have the option to store everything in just one ".chkbit" file instead of one file per folder. Why? Because on Windows I use a piece of software that monitors some folders, and I can't tell that software to ignore the ".chkbit" file.