Implement sharding based on Rendezvous Hashing #238
base: master
Conversation
x = x - ((x|-x)>>63) + (((^x)|-(^x))>>63)
frac := float64(x) / float64(^uint64(0))
return 1.0 / -math.Log(frac)
I'm trying to understand the math behind this.
- Why do we need the clamping? I understand math.Log(0) doesn't work as expected, but what's wrong with MaxUint64?
- Why do we need the division by MaxUint64? Considering that we're applying a log() afterwards, wouldn't that be the same as computing the log() first, but then subtracting a constant?
- Why do we need the natural logarithm? Would the algorithm work equally well with log2()?
- If log2() works, would there be any way to simply approximate the result without relying on any floating point math?
- MaxUint64 would give log(1) = 0, which would be a division by zero.
However, this is IEEE floating point math and not real math: weight/-log(1) = -Inf, which would be infinitely undesired, while we would expect it to be infinitely desired. Thinking more about it, log(0) seems to be defined as -Inf, which would make the quota weight/-(-Inf) = +0 in IEEE math, which is the result we actually want. If my math is correct, this means the code actually handles the extreme values in a somewhat decent manner, with MaxUint64 essentially wrapping around.
If we replace f(x) = 1/-Log(x/MaxUint64) with g(x) = 1/(Log(MaxUint64+1) - Log(x)) we should get the desired result: g(MaxUint64) = +Inf, g(0) = 0.
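For illustration, a minimal sketch of that alternative score function; the name scoreG is made up here and is not part of this pull request:

import "math"

// scoreG computes 1/(Log(MaxUint64+1) - Log(x)). float64(^uint64(0))
// rounds to 2^64, so the extremes behave as described above:
// scoreG(MaxUint64) = +Inf (maximally desired) and scoreG(0) = +0
// (minimally desired), with no clamping needed.
func scoreG(x uint64) float64 {
	logMax := math.Log(float64(^uint64(0)))
	return 1.0 / (logMax - math.Log(float64(x)))
}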
- The original proof uses the log of a random uniform ]0,1[ distribution, so, approximating the uniform distribution with random_uint64 / max_uint64, we could do the equivalent, 1.0/(log(max_uint64)-log(x)), instead.
- The original proof uses the natural logarithm; log2 is just a constant factor away from log, and since we're only interested in the relative ordering rather than the score itself, we can use log2 without any difference.
- There are some possible log2 bit-trickery hacks, such as approximating the logarithm by its integer part (i.e. the position of the highest set bit) plus a polynomial that approximates its fractional part. Technically I think this is the actual implementation of the standard log algorithm, just with a polynomial of high enough order to give the 15 or so significant digits possible in float64.
Since we don't have to support the entire range of positive float64s and we don't need the full precision, we could probably get something slightly faster, maybe by abusing the float64 mantissa in its binary representation. I'll have a look to see if I can get something that produces sufficient precision faster.
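For illustration, one rough sketch of such an approximation that avoids math.Log entirely, using math/bits for the integer part and a linear term for the fractional part; the name approxLog2 and the stated error bound are assumptions, not part of this pull request:

import "math/bits"

// approxLog2 approximates log2(x) for x > 0. The integer part is the
// index of the highest set bit; the fractional part is approximated
// linearly, which is off by at most roughly 0.09.
func approxLog2(x uint64) float64 {
	n := bits.Len64(x) - 1 // floor(log2(x))
	// Scale x into [1, 2) and use (frac - 1) ≈ log2(frac).
	frac := float64(x) / float64(uint64(1)<<uint(n))
	return float64(n) + (frac - 1)
}

Whether that error is small enough to never perturb the relative ordering of the scores would still need to be verified.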
for keyHash, weight := range s.weightMap {
	mixed := splitmix64(hash ^ keyHash)
	current := float64(weight) * score(mixed)
	if current > best {
		best = current
		bestKey = s.keyMap[keyHash]
	}
}
Because we are simply iterating over weightMap, I don't think we actually need to use maps here. Have you considered using a simple slice?
type rendezvousShard struct {
	keyHash uint64
	weight  uint32
	key     string
}

type rendezvousShardSelector struct {
	shards []rendezvousShard
}
Assuming this works, is it then still necessary to change ShardingBlobAccess to take a map? This function could return an index of the shard like it did before.
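For illustration, a sketch of what the selection loop could look like with such a slice, assuming the splitmix64 and score helpers from this pull request; the method name and signature are illustrative rather than taken from the change:

import "math"

// GetShard returns the index of the winning shard. Iterating over a
// slice makes ties resolve deterministically in favor of the earlier
// entry, unlike iteration over a map.
func (s *rendezvousShardSelector) GetShard(hash uint64) int {
	bestIndex := 0
	bestScore := math.Inf(-1)
	for i, shard := range s.shards {
		mixed := splitmix64(hash ^ shard.keyHash)
		current := float64(shard.weight) * score(mixed)
		if current > bestScore {
			bestScore = current
			bestIndex = i
		}
	}
	return bestIndex
}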
In the unlikely event of two backends getting the same current value, it is random which one wins when iterating over a map. Maybe the configuration should stay as a map to get the unique identifiers, but internally it can be a slice.
The requirement is more that the keys should be different. I think a slice should work, but during setup it might be worth checking for duplicated key hashes, because losing one storage shard without knowing why is pretty annoying.
The implementation can iterate over a slice of shards; that would make the tiebreaker in case of a collision resolve to a stable value (biased towards earlier entries in the slice) and significantly reduce the changes in sharding_blob_access.go.
I still think it's better to replace the interface of ShardingBlobAccessConfiguration.Shards with ShardingBlobAccessConfiguration.ShardMap, since a map gives a natural interface for removing and inserting shards compared to an array.
We could also error out at startup when there is a hash collision between two shards. It should never happen, but erroring out is probably preferable to ignoring an entire shard (which would be the effect of two shards sharing a hash).
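A rough sketch of such a startup check, building on the slice-based selector suggested above; the constructor name and error message are illustrative:

import "fmt"

// newRendezvousShardSelector rejects configurations in which two shards
// share a key hash, so a collision fails loudly at startup instead of
// silently making one of the shards unreachable.
func newRendezvousShardSelector(shards []rendezvousShard) (*rendezvousShardSelector, error) {
	seen := make(map[uint64]string, len(shards))
	for _, shard := range shards {
		if other, ok := seen[shard.keyHash]; ok {
			return nil, fmt.Errorf("shards %q and %q have colliding key hashes", other, shard.key)
		}
		seen[shard.keyHash] = shard.key
	}
	return &rendezvousShardSelector{shards: shards}, nil
}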
As part of buildbarn/bb-adrs#5 we are attempting to improve the resharding experience. In this pull request we have an implementation of sharding based on Rendezvous Hashing where shards are described by a map instead of an array.