Automatic sharding of big directories #7022
I'd really like to enable sharding by default in js-IPFS to reduce the number of codepaths and A/B tests.
Can go-IPFS take the same option? That should tick off the first item. cc @aschmahmann
We really should fix this. js-ipfs shards dirs with >1k items out of the box and go-ipfs falls flat. On top of that, the biggest problem in go-ipfs right now is that when you enable sharding, it shards EVERYTHING, so even if you have a small dir with under 1k items, it's still sharded, which produces different CIDs. I believe a lot of work has happened in the IPLD team since we implemented HAMT sharding in go-ipfs; it would be useful to leverage that work when we make sharding enabled by default in go-ipfs. @ribasushi @warpfork are there any libs / specs related to improved sharding in IPLD/Filecoin which we should be aware of in planning here?
Yes!
Doesn't a HAMT already "shard" or split the data among multiple blocks, depending on its parameters like bitWidth and bucketSize? If one were to add sharding at the ADL layer on top of go-ipld-adl-hamt, would you not end up with two layers of sharding?
iiuc, I think the point of the "shardingSplitThreshold" mentioned above is that there's some value below which any sharding should be skipped and a simple inline map (potentially not even a single new block) should be used. So what I meant by "don't shard below $X entries" is: with fewer than 1000 (or $X) entries, don't invoke a HAMT at all and just use a regular plain map in the same block. In other words, I'm treating "sharding" as roughly an alias for using a HAMT (or some other ADL that serves a similar purpose to HAMTs).

My statement about this and how it relates to ADLs is: whatever the condition is for whether to invoke the full HAMT (and all its sharding logic) vs. just use an inline plain map... is something that I think should be placed within the boundary of what we call ADLs. Doing so means we have a place to identify what that $X condition is: it's the same place we identify the ADL as a whole. This doesn't mean the problem goes away, but it means two problems can be solved at once in one place: we already have the "identify the sharding logic needed here" problem; we might as well combine that with the "what is the shard vs inline threshold and how do we identify that logical condition" problem.
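For concreteness, here is a minimal Go sketch of that idea, assuming a plain entry-count condition; the names (`shardSplitThreshold`, `newDirectory`, the toy types) are illustrative and not the actual go-unixfs or ADL API. The point is just that the "$X condition" lives in one place, at the ADL boundary, and callers only ever ask for a directory:

```go
package sketch

// shardSplitThreshold is the "$X" from the discussion; 1000 is an assumed value.
const shardSplitThreshold = 1000

// Directory stands in for whatever map-like node the ADL hands back.
type Directory interface{ Layout() string }

type inlineDir struct{ entries map[string]string } // plain map, single block
type hamtDir struct{ entries map[string]string }   // HAMT, fans out across blocks

func (inlineDir) Layout() string { return "inline map" }
func (hamtDir) Layout() string   { return "HAMT shard" }

// newDirectory is the "$X condition" placed at the ADL boundary: callers just
// ask for a directory and the threshold picks the layout.
func newDirectory(entries map[string]string) Directory {
    if len(entries) < shardSplitThreshold {
        return inlineDir{entries}
    }
    return hamtDir{entries}
}
```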
Ah, gotcha, that makes sense. I wonder if we could expose that as a generic ADL layer, e.g. "use a basic map if size < N, otherwise use this fancy multi-block map prototype".
JS implementation is https://github.com/rvagg/js-ipld-hashmap which is algorithmically backed by https://github.com/rvagg/iamap, but we're a little out of sync across our HAMT implementations (and spec) as we've tried to (a) integrate Filecoin's formats into our specs and ideally enable our implementations to speak that format, and (b) improve based on some learnings. It wouldn't be hard, but if we want to put the pedal down on UnixFS v2 and a better directory map->hamt version/implementation then we would need to budget the work and get it all sorted out. Otherwise, if we're not intending to reach for UnixFS v1 it seems like we should be using the HAMT stuff that's supposed to come with MFS. I'm not really over that work or how it might differ from the HAMT work we've done outside of MFS and how standardised it is and whether we can break the layout to do it betterer?
Yes - js-IPFS uses hamt-sharding* and not ipld-hashmap (I didn't know about this new package; if it generates the same HAMTs as go-IPFS maybe you could open a PR refactoring the unixfs-importer and mfs internals to use this instead).
Neither did js-IPFS TBH, I think the 1k figure for
I want UnixFS v2 but I think tying the resolution of this issue to v2 would be a mistake, as it's a much bigger job with an unclear end point, as mentioned in the OP. A more expedient path would be for go-IPFS to just add the

The js-IPFS import logic will use a regular UnixFS directory until it hits the threshold, then convert that to a shard once the threshold is passed. The MFS implementation then uses this mechanism to convert existing directories to HAMT shards when adding members takes you over the threshold, and also turns shards back into regular directories when removing members takes you below it.

*
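A loose Go sketch of that add/remove behaviour, assuming a single entry-count threshold; the type and method names here are hypothetical and not the actual go-mfs or js-ipfs MFS API:

```go
package sketch

// dir models an MFS directory that upgrades to a HAMT shard when it grows
// past the threshold and flattens back when it shrinks below it.
type dir struct {
    sharded   bool
    threshold int
    entries   map[string]string // name -> child CID
}

func (d *dir) Add(name, cid string) {
    d.entries[name] = cid
    if !d.sharded && len(d.entries) > d.threshold {
        d.sharded = true // re-encode the children as a HAMT shard
    }
}

func (d *dir) Remove(name string) {
    delete(d.entries, name)
    if d.sharded && len(d.entries) <= d.threshold {
        d.sharded = false // flatten the shard back into a basic directory
    }
}
```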
Excuse my ignorance but I'm not quite grokking it from this thread and this thread is already breaking my assumptions about how it currently works - but don't go-ipfs and js-ipfs both implement the same HAMT algorithm and block structure for large directories? Or is this issue actually saying that go-ipfs doesn't have such support and only js-ipfs does, and therefore the use of the HAMT is not actually standardised or specified?
Quick comment on this that's on my mind right now -- Whatever thresholds are used here... well, there's an interesting tension in what metric to use as an input to the decision.
I think we should steer towards Option 1:
@iand: We want to focus on "Make sure we don't shard too soon. At the moment, the cutoff to start sharding is very low." above. @aschmahmann will provide pointers on where that code is. |
Maybe option 1 is a little nicer in the longer run, but going with option 2 (<=1MiB -> Regular Directory, >1MiB -> Sharded Directory) is likely to save us some headaches by limiting the number of people using UnixFSv1 Sharded Directories to the people who actually need to use them and leaving everyone else alone. There are still a few bugs around sharded directories (e.g. #8072), so given that we want to flip this on by default, let's only expose the issues to the people who need to be exposed to them.

Note: If we did option 1 we'd either have to reject file/folder names over a certain size (a potentially breaking change) or still have to serialize to check for block size, because if our estimations are wrong then someone could create a block that's too big for Bitswap to deal with.
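As a rough illustration of that last point, here is a Go sketch of the size-based cutoff; the encoding below is a fake stand-in rather than real dag-pb/UnixFS serialization, and only the 1MiB figure comes from the comment above. The flat node is actually serialized and measured, so a wrong estimate can't produce a block too large for Bitswap:

```go
package sketch

const maxBlockSize = 1 << 20 // the 1 MiB cutoff from the discussion

// encodeBasicDir is a placeholder for serializing a flat (non-sharded)
// directory node; real code would produce the dag-pb/UnixFS bytes.
func encodeBasicDir(entries map[string]string) []byte {
    var buf []byte
    for name, cid := range entries {
        buf = append(buf, name...)
        buf = append(buf, cid...)
    }
    return buf
}

// shouldShard reports whether the flat encoding would exceed the block limit.
func shouldShard(entries map[string]string) bool {
    return len(encodeBasicDir(entries)) > maxBlockSize
}
```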
One unavoidable (I think) issue is that, given a sharded directory, "guess and check" to see if we should shrink to a non-sharded directory might require downloading a large portion of the directory. This could be... annoying. However, other than that, guess and check isn't actually too bad, performance-wise. We can implement option 1 as follows:
Most directories will either be small or large, so this should be pretty performant. There's also a third option: two limits:
The non-sharded node could grow above 256KiB, but won't grow above 1MiB.
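One plausible reading of that two-limit idea (the list itself is not quoted above, so treat the exact pairing of bounds as an assumption) is simple hysteresis: shard only when the flat node would pass the upper bound, unshard only when it drops below the lower bound, and leave everything in between alone. A small Go sketch:

```go
package sketch

const (
    unshardBelow = 256 << 10 // 256 KiB lower limit
    shardAbove   = 1 << 20   // 1 MiB upper limit
)

// nextLayout decides whether a directory should change layout, given its
// current layout and the size its flat encoding would have.
func nextLayout(currentlySharded bool, flatSize int) bool {
    switch {
    case !currentlySharded && flatSize > shardAbove:
        return true // too big for one block: convert to a HAMT shard
    case currentlySharded && flatSize < unshardBelow:
        return false // comfortably small again: flatten to a basic directory
    default:
        return currentlySharded // between the limits: keep the current layout
    }
}
```

This matches the stated consequence: a non-sharded node can sit above 256KiB, but is converted before it exceeds 1MiB.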
There was a brief sync discussion around the three options listed above. IIRC, here were the main points:
This is being done in #8106 |
Commenting here at the request of @warpfork to discuss how we implemented this in Peergos. We have a structure to publish capabilities which is semantically a map<path, cap>. This is stored in a champ which is an inode-based file system (inspired by chats with @whyrusleeping). So the keys in this outer champ are inodes (an integer + path element), and the values are DirectoryInodes. A DirectoryInode can either be an inline list of child inode caps, or a nested champ of them.

In our case, the logic we went with is a maximum number of children before switching to a champ, which is decided by the maximum size of an inode cap. Our caps are < 256 bytes and our path elements are < 256 bytes (to cover compatibility with all common file systems). So in our case this leads to a fixed max number of inlined children of max block size / 512. In our case this is further constrained by the outer champ, which could itself inline multiple directories into a single champ node, so we have to divide by the branching ratio of the outer champ.
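Written out as a Go sketch of that arithmetic; the 256-byte bounds are the ones stated above, while the 1MiB block size and the branching ratio of 16 are assumptions purely for illustration:

```go
package sketch

const (
    maxCapBytes      = 256     // "our caps are < 256 bytes"
    maxPathElemBytes = 256     // "our path elements are < 256 bytes"
    blockSizeLimit   = 1 << 20 // assumed maximum block size
    outerBranching   = 16      // assumed branching ratio of the outer champ
)

// maxInlinedChildren = (max block size / bytes per inlined child) / outer branching.
func maxInlinedChildren() int {
    perChild := maxCapBytes + maxPathElemBytes // 512 bytes per inlined child
    return blockSizeLimit / perChild / outerBranching
}
```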
While we wait for automatic sharding / unsharding to get figured out, @Stebalien @aschmahmann is there anything that would prevent an API / CLI parameter for something like this? This way users could simply tell us whether or not they wanted to shard things out on a per-upload basis and IPFS doesn't need to do any additional lifting. (Or we could do it automatically if we notice the directory has a certain number of files in it before uploading to IPFS.)
Unfortunately, go-unixfs had sharding at a global level (as surfaced in the config https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#directory-sharding--hamt). This issue is currently slotted for go-ipfs v0.10.0 anyway, so while we could work around doing this automatically, it's not something we currently have a good reason to invest in.
Ahh if this is coming out in 0.10.0, then we'll definitely just wait for that. Thanks for the info! |
This is in v0.11.0 |
This is a tracking issue for the sharded directory feature. This feature can be enabled with:
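(For reference, the experimental opt-in being tracked here is, going by the experimental-features doc linked earlier in the thread, the `Experimental.ShardingEnabled` config flag, i.e. `ipfs config --json Experimental.ShardingEnabled true`.)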
TODOs:
Ideally, we'd fix this issue by switching to unixfs 2.0 and a generalized sharded map implementation. However, that's probably not going to happen in the near future and this feature is used widely enough that we'll likely have to stabilize it in its current form.