Support Various Kinds of Consistent Hash #17817

zhezhidashi · 2023-07-21T03:54:47Z

What changes are proposed in this pull request?

Add Ketama Hashing, Jump Consistent Hashing, Maglev Hashing, and Multi Probe Hashing.

Why are the changes needed?

Alluxio's current user worker selection policy is the Consistent Hash Policy. It is slow, its distribution is not uniform enough, and it is not strictly consistent.

Ketama: https://github.com/RJ/ketama
Jump Consistent Hashing: https://arxiv.org/pdf/1406.2294.pdf
Maglev Hashing: https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/44824.pdf
Multi Probe Hashing: https://arxiv.org/pdf/1505.00062.pdf

We strongly recommend using Maglev Hashing for the User Worker Selection Policy. In most situations, it has the lowest time cost and is the most uniform and balanced of these hashing policies.

Does this PR introduce any user facing changes?

alluxio.user.worker.selection.policy accepts the following values: CONSISTENT, JUMP, KETAMA, MAGLEV, MULTI_PROBE, LOCAL, REMOTE_ONLY. They correspond, respectively, to the consistent hash policy, jump consistent hash policy, ketama hash policy, maglev hash policy, multi-probe hash policy, local worker policy, and remote-only policy.

The current default value is CONSISTENT.

We recommend using Maglev Hash, which has the best hash consistency and is the least time-consuming. That is to say, set the value of alluxio.user.worker.selection.policy to MAGLEV. We will also consider setting this as the default value in the future.

Ketama Hashing
alluxio.user.ketama.hash.replicas: This is the value of replicas in the ketama hashing algorithm. When the set of workers changes, it guarantees that only a minimal part of the hash table changes. The value of replicas should be X times the number of physical nodes in the cluster, where X is a balance between efficiency and cost.
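
As a rough illustration of why replicas limit how much of the hash table changes, here is a minimal ring sketch. It is not the code from this PR: the class name, the use of `replicas` as a per-worker point count, the point naming scheme, and the MD5-based hash are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a ketama-style hash ring; not Alluxio's implementation.
final class KetamaRingSketch {
  // Ring position -> worker that owns that position.
  private final TreeMap<Long, String> ring = new TreeMap<>();

  KetamaRingSketch(List<String> workers, int replicas) {
    for (String worker : workers) {
      // Assumption: each worker contributes `replicas` points to the ring.
      for (int i = 0; i < replicas; i++) {
        ring.put(hash(worker + "-" + i), worker);
      }
    }
  }

  /** Returns the owner of the first ring point at or after the key's hash. */
  String getWorker(String key) {
    if (ring.isEmpty()) {
      throw new IllegalStateException("no workers on the ring");
    }
    Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
    // Wrap around to the first point when the key hashes past the last one.
    return (entry != null ? entry : ring.firstEntry()).getValue();
  }

  /** Stand-in hash: the first 8 bytes of the MD5 digest, packed into a long. */
  static long hash(String s) {
    try {
      byte[] digest =
          MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
      long h = 0;
      for (int i = 0; i < 8; i++) {
        h = (h << 8) | (digest[i] & 0xFFL);
      }
      return h;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }
}
```

In this sketch, adding a worker only inserts new points, so a key either keeps its worker or moves to the newly added one.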

Jump Consistent Hashing
None.
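
Jump consistent hashing needs no tuning parameter because it computes a bucket index directly from the key and the bucket count. The following is a sketch that ports the algorithm from the linked paper to Java; the class and method names are illustrative and are not taken from this PR.

```java
// Hypothetical port of jump consistent hash (arXiv:1406.2294); not Alluxio's code.
final class JumpHashSketch {
  // Multiplier of the 64-bit linear congruential generator used in the paper.
  private static final long MULTIPLIER = 2862933555777941757L;

  /** Maps a 64-bit key to a bucket index in [0, numBuckets). */
  static int jumpConsistentHash(long key, int numBuckets) {
    if (numBuckets <= 0) {
      throw new IllegalArgumentException("numBuckets must be positive");
    }
    long bucket = -1;
    long next = 0;
    while (next < numBuckets) {
      bucket = next;
      // Java long arithmetic wraps on overflow, matching unsigned 64-bit math.
      key = key * MULTIPLIER + 1;
      // Unsigned shift keeps the top 31 bits, so the divisor is in [1, 2^31].
      next = (long) ((bucket + 1) * ((double) (1L << 31) / (double) ((key >>> 33) + 1)));
    }
    return (int) bucket;
  }
}
```

The result is only a bucket index, so the caller still has to keep a stable mapping from indices to workers.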

Maglev Hashing
alluxio.user.maglev.hash.lookup.size: This is the size of the lookup table in the maglev hashing algorithm. It must be a prime number. Maglev hashing generates a lookup table for the workers. The larger the lookup table, the smaller the variance of the hashing algorithm, but a larger lookup table consumes more time and memory.
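
To show where the prime-size requirement comes from, here is a sketch of the table population step described in the Maglev paper. It is not the code from this PR: the class name and the integer mixing hash are stand-ins chosen for the example. With a prime table size, every skip value in [1, size - 1] walks all slots, so each worker can always find a free slot.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of Maglev lookup table population; not Alluxio's code.
final class MaglevTableSketch {
  private final List<String> workers;
  private final int[] lookup; // lookup[slot] = index into workers

  MaglevTableSketch(List<String> workers, int lookupSize) {
    if (workers.isEmpty() || !isPrime(lookupSize)) {
      throw new IllegalArgumentException("need at least one worker and a prime lookup size");
    }
    this.workers = workers;
    this.lookup = populate(workers, lookupSize);
  }

  String getWorker(String key) {
    return workers.get(lookup[Math.floorMod(hash(key, 0), lookup.length)]);
  }

  int[] table() {
    return lookup.clone();
  }

  private static int[] populate(List<String> workers, int m) {
    int n = workers.size();
    long[] offset = new long[n];
    long[] skip = new long[n];
    for (int i = 0; i < n; i++) {
      offset[i] = Math.floorMod(hash(workers.get(i), 1), m);
      skip[i] = Math.floorMod(hash(workers.get(i), 2), m - 1) + 1;
    }
    int[] next = new int[n]; // position of each worker in its own preference list
    int[] entry = new int[m];
    Arrays.fill(entry, -1);
    int filled = 0;
    while (true) {
      // Workers take turns claiming their next preferred free slot.
      for (int i = 0; i < n; i++) {
        int slot = (int) ((offset[i] + next[i] * skip[i]) % m);
        while (entry[slot] >= 0) {
          next[i]++;
          slot = (int) ((offset[i] + next[i] * skip[i]) % m);
        }
        entry[slot] = i;
        next[i]++;
        if (++filled == m) {
          return entry;
        }
      }
    }
  }

  private static boolean isPrime(int n) {
    if (n < 2) {
      return false;
    }
    for (long i = 2; i * i <= n; i++) {
      if (n % i == 0) {
        return false;
      }
    }
    return true;
  }

  /** Stand-in hash: String.hashCode mixed with a seed; an assumption for this sketch. */
  private static int hash(String s, int seed) {
    int h = s.hashCode() ^ (seed * 0x9E3779B9);
    h ^= h >>> 16;
    h *= 0x85EBCA6B;
    h ^= h >>> 13;
    h *= 0xC2B2AE35;
    h ^= h >>> 16;
    return h;
  }
}
```

Because the workers fill the table in turns, their slot counts differ by at most one in this sketch.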

Multi Probe Hashing
alluxio.user.multi.probe.hash.probe.num: This is the number of probes in the multi-probe hashing algorithm. The more probes, the smaller the variance of the hashing algorithm, but more probes consume more time and memory.
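
The role of the probe count can be sketched as follows, based on the linked multi-probe paper rather than on this PR's code. Each worker gets a single point on the ring, the key is hashed once per probe, and the probe that lands closest to a worker point wins. The class name and the FNV-style hash with a mixing step are assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of multi-probe consistent hashing; not Alluxio's code.
final class MultiProbeSketch {
  private final TreeMap<Long, String> ring = new TreeMap<>();
  private final int probes;

  MultiProbeSketch(List<String> workers, int probes) {
    if (workers.isEmpty() || probes < 1) {
      throw new IllegalArgumentException("need at least one worker and one probe");
    }
    for (String worker : workers) {
      ring.put(hash(worker, 0), worker); // one ring point per worker
    }
    this.probes = probes;
  }

  String getWorker(String key) {
    String best = null;
    long bestDistance = 0;
    for (int i = 1; i <= probes; i++) {
      long h = hash(key, i);
      Map.Entry<Long, String> entry = ring.ceilingEntry(h);
      if (entry == null) {
        entry = ring.firstEntry(); // wrap around the ring
      }
      // Wrapping subtraction gives the clockwise distance modulo 2^64.
      long distance = entry.getKey() - h;
      if (best == null || Long.compareUnsigned(distance, bestDistance) < 0) {
        best = entry.getValue();
        bestDistance = distance;
      }
    }
    return best;
  }

  /** Stand-in hash: seeded FNV-1a style loop plus a 64-bit mixing step. */
  private static long hash(String s, int seed) {
    long h = 0xCBF29CE484222325L ^ seed;
    for (int i = 0; i < s.length(); i++) {
      h ^= s.charAt(i);
      h *= 0x100000001B3L;
    }
    h ^= h >>> 30;
    h *= 0xBF58476D1CE4E5B9L;
    h ^= h >>> 27;
    h *= 0x94D049BB133111EBL;
    h ^= h >>> 31;
    return h;
  }
}
```

In this sketch the ring stores only one point per worker, and each lookup costs one ring search per probe.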

lucyge2022 · 2023-07-27T18:49:12Z

Hello! I have two questions about this algorithm:

It seems the algorithm picks an index out of the total number of buckets instead of picking one specific location (worker). Therefore, each worker needs to be strictly mapped to an integer so that a key is consistently mapped to it, and when new workers are added, they have to be mapped to the next index at the end of the bucket range. Is that right?
So when a member joins or leaves, we can tell whether a key moves and which new bucket it maps to, but it is impossible for the bucket (that is, the worker) to know the list of keys whose membership changed, unless we recalculate them one by one. Is that right?

JiamingMai · 2023-07-28T01:55:25Z

Hello! I have two questions about this algorithm:

It seems the algorithm picks an index out of the total number of buckets instead of picking one specific location (worker). Therefore, each worker needs to be strictly mapped to an integer so that a key is consistently mapped to it, and when new workers are added, they have to be mapped to the next index at the end of the bucket range. Is that right?

So when a member joins or leaves, we can tell whether a key moves and which new bucket it maps to, but it is impossible for the bucket (that is, the worker) to know the list of keys whose membership changed, unless we recalculate them one by one. Is that right?

@lucyge2022 You are right. Although jump consistent hash takes less time to calculate the result and data on different workers will be more balanced, the two points you mentioned are its fatal drawbacks.

JiamingMai

LGTM

JiamingMai · 2024-01-10T08:27:27Z

alluxio-bot, merge this please

Support Jump Consistent Hash

69f29e0

JiamingMai assigned zhezhidashi and JiamingMai Jul 24, 2023

JiamingMai self-requested a review July 24, 2023 10:06

JiamingMai added the type-feature label Jul 24, 2023

jiacheliu3 mentioned this pull request Jul 28, 2023

Make consistentHashing to a generic tools class #17718

Open

Merge branch 'Alluxio:main' into support-jump-hash

4b86a57

zhezhidashi changed the title ~~Support Jump Consistent Hash~~ Support Various Kinds of Consistent Hash Aug 22, 2023

Support Various Kinds of Consistent Hash

f1b4fab

Xenorith force-pushed the main branch from dfa74af to 5fb603d Compare September 11, 2023 01:35

apc999 force-pushed the main branch 2 times, most recently from 2fec0ec to b597c61 Compare October 17, 2023 19:11

Zihao Zhao added 3 commits December 25, 2023 10:10

Merge branch 'main' into support-jump-hash

278fa36

Make code for four hashing algorithms compatible with existing code

15fc548

Select a hashing algorithm using an enumeration class

5bf405a

JiamingMai approved these changes Jan 10, 2024

alluxio-bot merged commit b9de24c into Alluxio:main Jan 10, 2024
14 checks passed
