FList format #11
Comments
@muhamadazmy can you elaborate on what exactly makes the current format hard to use?
As soon as you want to reach a file in a directory, you need to fetch the whole directory contents and iterate over the list of inodes to find the matching name; only then do you have your inode. You need to do this for every file lookup. Walking is quite easy and fast, but you almost never do that. Caching was needed so that the capnp object is not fetched and parsed each time you try to hit a file in a directory (understand: when you
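To make the lookup cost described above concrete, here is a minimal sketch of a one-blob-per-directory layout. Everything in it is an assumption for illustration: the `store` dict stands in for the flist database, JSON stands in for the capnp encoding, and the `lookup` helper and field names are hypothetical.

```python
import json

# Hypothetical stand-in for the flist database: one serialized blob per directory.
# The real format stores capnp objects; JSON is used only to keep the sketch runnable.
store = {
    "/etc": json.dumps([
        {"name": "hosts", "ino": 11, "size": 220},
        {"name": "passwd", "ino": 12, "size": 1843},
        {"name": "resolv.conf", "ino": 13, "size": 74},
    ]),
}

def lookup(directory, name):
    """Resolve one file: decode the whole directory, then scan it for `name`."""
    entries = json.loads(store[directory])  # the full directory is fetched and parsed
    for entry in entries:                   # linear scan, even for a single file
        if entry["name"] == name:
            return entry
    return None
```

Every call to `lookup` pays for decoding the full directory, which is the cost the caching layer was meant to hide.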
I'm not against changing the format, since even for me when writing But on the other hand, using one entry per file in sqlite directly will be dramatically worse, for both size and performance. This is not a better idea. For an average directory size (let's say < 50 files), it will be a lot faster to iterate over a list than to run a SELECT query (this needs to be benchmarked, but I'm pretty sure). We could investigate other existing serialization formats (like msgpack or any other binary format), but right now I don't know how we can make lookups efficient without exploding memory or storage size.
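The claim above is explicitly marked as needing a benchmark. A rough sketch of such a measurement, using only the Python standard library, could look like the following. The table layout, the names `by_select` and `by_scan`, and the 50-entry directory are assumptions for illustration, and the scan side deliberately leaves out the cost of fetching and decoding the directory object, so the numbers are only a starting point.

```python
import sqlite3
import timeit

N = 50  # assumed "average" directory size from the discussion

# One row per file, keyed by (parent, name).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE entry (parent INTEGER, name TEXT, ino INTEGER, "
    "PRIMARY KEY (parent, name))"
)
db.executemany(
    "INSERT INTO entry VALUES (1, ?, ?)",
    [(f"file{i}", 100 + i) for i in range(N)],
)

# The same directory as an already-decoded in-memory list.
listing = [(f"file{i}", 100 + i) for i in range(N)]

def by_select(name):
    """Resolve a name with an indexed SELECT."""
    row = db.execute(
        "SELECT ino FROM entry WHERE parent = 1 AND name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

def by_scan(name):
    """Resolve a name by iterating over the decoded directory list."""
    for entry_name, ino in listing:
        if entry_name == name:
            return ino
    return None

if __name__ == "__main__":
    target = f"file{N - 1}"  # last entry: worst case for the linear scan
    print("select:", timeit.timeit(lambda: by_select(target), number=10_000))
    print("scan:  ", timeit.timeit(lambda: by_scan(target), number=10_000))
```

A fair comparison would also time the decode step on the list side and an on-disk database on the sqlite side.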
So it seems this is a trade-off and we need to find the right balance. We also need to see what the impact on the size of the flist would be if we change that, and check that it stays reasonable.
Very late comment :P @maxux You might have a point regarding listing a directory, at least for the first time; after that, the kernel will do a pretty good job of caching the directory entries (especially for a read-only filesystem). Also note that the low-level (faster) FUSE API uses inodes, so when accessing files you will probably need to retrieve the file info directly from the db (using the inode as key), which is going to be much faster and more efficient than first loading its parent directory and then traversing a list (which can be really long) to retrieve the file object. I don't believe changing the format this way will affect the size that much, but it is definitely going to improve runtime performance dramatically, especially for traversing the file tree and opening files (maybe not reading).
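As an illustration of the inode-keyed access described above, here is a minimal sketch using Python's built-in `sqlite3` module. The table name, column names, and helper functions are hypothetical and are not taken from any existing flist implementation; the `meta` column is a placeholder for whatever serialized entry the format ends up using.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE inode (
    ino    INTEGER PRIMARY KEY,  -- key used by inode-based filesystem calls
    parent INTEGER NOT NULL,
    name   TEXT NOT NULL,
    meta   BLOB                  -- placeholder for the serialized entry
);
CREATE UNIQUE INDEX inode_by_name ON inode (parent, name);
""")
db.executemany(
    "INSERT INTO inode (ino, parent, name, meta) VALUES (?, ?, ?, ?)",
    [(1, 1, "", b"root"), (2, 1, "etc", b"dir"), (3, 2, "hosts", b"file")],
)

def getattr_by_ino(ino):
    """Fetch one entry directly by inode, without touching its parent."""
    return db.execute(
        "SELECT parent, name, meta FROM inode WHERE ino = ?", (ino,)
    ).fetchone()

def lookup(parent, name):
    """Resolve a name inside a directory through the (parent, name) index."""
    row = db.execute(
        "SELECT ino FROM inode WHERE parent = ? AND name = ?", (parent, name)
    ).fetchone()
    return row[0] if row else None

def readdir(parent):
    """List the names in a directory (the root is its own parent and is skipped)."""
    rows = db.execute(
        "SELECT name FROM inode WHERE parent = ? AND ino != parent ORDER BY name",
        (parent,),
    )
    return [r[0] for r in rows]
```

In this layout a file can be reached by inode or by `(parent, name)` without decoding a directory object, while listing a directory becomes a query over the `parent` column. Whether that is a net win for size and speed is the open question in this thread.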
It's time we move away from the current flist structure.
Maybe use sqlite to its full potential to optimize file queries. An entry per directory is very inefficient for file access and makes it very hard to implement the FUSE layer using the inode API. I did it in the Rust implementation, but I had to implement a caching layer (and do a lot of data copying) to make it work properly.
I suggest something like
Entry can still be a capnp object but with a simpler schema