[TASK][EASY] Kyuubi Server HA&ZK get server from serverHosts support more strategy #6034
Comments
SGTM, and it's better to extract an interface so that users can implement their own custom strategy.
Hello, I have a question. kyuubi-hive-jdbc is only a driver and cannot read configuration from the kyuubiConf, so if we add a strategy config entry in HA, kyuubi-hive-jdbc still cannot read it. The only way I can think of is to add it to the connection params, but then we cannot plug in a custom strategy. What do you think?
@davidyuan1223 Hello, may I ask which strategies you will implement?
Sorry, I forgot to respond. Currently, I have implemented poll and random. Because hive-jdbc is a standalone module, we cannot use the kyuubi-ha module, so if we want to implement more strategies, we can only add the strategy to the connection params, like '&zkStrategy=poll/random'. If you have a more useful strategy, please give me some advice.
Hello, this is the demo command:
I plan to let users implement an interface named org.apache.kyuubi.jdbc.hive.strategy.ChooseServerStrategy and then set zooKeeperStrategy=xxx.xxx.xxx, so that users can plug in their own implementation. Of course, if you have a more effective plan, please share it and I will try to implement it.
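As a rough illustration of passing the strategy through the connection params, the sketch below pulls a strategy name out of the `;`-separated section of a `jdbc:hive2://` URL and falls back to `random` when it is absent. The class `StrategyParam`, its methods, and the parsing rules are assumptions made for this example; they are not the driver's actual URL parser, and the parameter key is passed in by the caller because its final name was still under discussion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of kyuubi-hive-jdbc.
final class StrategyParam {
    static final String DEFAULT_STRATEGY = "random";

    // Collects key=value pairs found between the first '/' after the
    // host list and the first '?' or '#' (assumed URL layout).
    static Map<String, String> sessionVars(String url) {
        String prefix = "jdbc:hive2://";
        if (!url.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a hive2 JDBC URL: " + url);
        }
        Map<String, String> vars = new LinkedHashMap<>();
        String rest = url.substring(prefix.length());
        int slash = rest.indexOf('/');
        if (slash < 0) {
            return vars; // no path section, so no session variables
        }
        String path = rest.substring(slash + 1);
        int end = path.length();
        int q = path.indexOf('?');
        if (q >= 0) end = Math.min(end, q);
        int h = path.indexOf('#');
        if (h >= 0) end = Math.min(end, h);
        for (String part : path.substring(0, end).split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) { // skips the database name and empty segments
                vars.put(part.substring(0, eq), part.substring(eq + 1));
            }
        }
        return vars;
    }

    // Returns the configured strategy name, or "random" if none is set.
    static String strategyName(String url, String key) {
        return sessionVars(url).getOrDefault(key, DEFAULT_STRATEGY);
    }
}
```

Keeping `random` as the fallback means existing connection strings that do not mention a strategy keep their current behavior.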
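A minimal sketch of this plan might look like the following. The single-argument `chooseServer` signature, the `StrategyLoader` class, and the example `FirstHostStrategy` are assumptions for illustration only; the interface in the actual patch may take additional arguments (for example a ZooKeeper client), and only the name `ChooseServerStrategy` comes from the comment above.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Assumed shape of the strategy interface; the real signature may differ.
interface ChooseServerStrategy {
    String chooseServer(List<String> serverHosts);
}

// Built-in behavior: pick any host uniformly at random.
final class RandomStrategy implements ChooseServerStrategy {
    @Override
    public String chooseServer(List<String> serverHosts) {
        int index = ThreadLocalRandom.current().nextInt(serverHosts.size());
        return serverHosts.get(index);
    }
}

// Example of a user-supplied strategy: always take the first host.
final class FirstHostStrategy implements ChooseServerStrategy {
    @Override
    public String chooseServer(List<String> serverHosts) {
        return serverHosts.get(0);
    }
}

// Hypothetical loader: resolves a built-in name or a fully qualified class name.
final class StrategyLoader {
    static ChooseServerStrategy load(String name) {
        if (name == null || name.equals("random")) {
            return new RandomStrategy();
        }
        try {
            // Treat any other value as a class name on the classpath.
            return Class.forName(name)
                .asSubclass(ChooseServerStrategy.class)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load strategy: " + name, e);
        }
    }
}
```

Loading by class name keeps the driver free of any dependency on the server-side modules: a custom strategy only needs to be on the client classpath and expose a no-argument constructor.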
…t more strategy

# 🔍 Description
## Issue References 🔗

This pull request fixes #6034

## Describe Your Solution 🔧

Currently, when beeline connects to a Kyuubi server in HA mode, the only supported strategy is random, which can lead to a high load on a single machine. This PR adds support for choosing the strategy.

[description]

First, note that beeline depends on kyuubi-hive-jdbc to connect to the Kyuubi server, and that module is isolated from the Kyuubi cluster, so the code only supports randomly choosing a serverHost from the zk node /${namespace}. Because kyuubi-hive-jdbc is a stateless module that runs only once, it cannot store any state about the serverHost obtained from the zk node.

[Solution]

With this PR, we can implement an interface named ChooseServerStrategy to choose a serverHost. I implemented two strategies (poll and random), and a user-defined class is also supported:

1. poll: creates a zk node named ${namespace}-counter. When a beeline client wants to connect to the Kyuubi server, the node value is incremented by 1, and this value modulo the number of serverHosts (counter % serverHost.size) selects the serverHost, so hosts are chosen in order.
2. random: picks a serverHost at random from serverHosts.
3. User Defined Class: implement ChooseServerStrategy and put the jar into beeline-jars; your strategy is then used to choose the serverHost.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Tested the strategies in my test cluster.

#### Behavior Without This Pull Request ⚰️

![image](https://github.com/apache/kyuubi/assets/51512358/d65b14c1-1b02-4436-8843-27b2e55d27ce)
![image](https://github.com/apache/kyuubi/assets/51512358/0524a30c-c2c3-464e-8453-84f3f1a74fb1)
![image](https://github.com/apache/kyuubi/assets/51512358/12feb93e-b743-4a43-821d-454f3c1af336)

#### Behavior With This Pull Request 🎉

[Use Case]

1. poll: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=poll?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true`
2. random: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=random?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true` or `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true`
3. YourStrategy: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=xxx.xxx.xxx.XxxChooseServerStrategy?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true`

[Result: The cluster has two servers (221, 233)]

1. poll:
   1.1. zkNode: counterValue
   ![image](https://github.com/apache/kyuubi/assets/51512358/5cbd15f9-bba4-4b23-bbfb-d61ed46f931f)
   1.2. result:
   ![image](https://github.com/apache/kyuubi/assets/51512358/5a867167-8b06-49ed-aa44-b70726f3ae97)
   ![image](https://github.com/apache/kyuubi/assets/51512358/404b05e8-c828-458c-a9c4-97a323bf6ce7)
   ![image](https://github.com/apache/kyuubi/assets/51512358/3182e92b-6976-4931-a899-5e0d89cd2ac2)
   ![image](https://github.com/apache/kyuubi/assets/51512358/a55450ff-49cf-4b4a-9b90-91dd02982aa5)
2. random:
   ![image](https://github.com/apache/kyuubi/assets/51512358/d65b14c1-1b02-4436-8843-27b2e55d27ce)
   ![image](https://github.com/apache/kyuubi/assets/51512358/0524a30c-c2c3-464e-8453-84f3f1a74fb1)
   ![image](https://github.com/apache/kyuubi/assets/51512358/12feb93e-b743-4a43-821d-454f3c1af336)
3. YourStrategy (the test case only gets the first serverHost):
   ![image](https://github.com/apache/kyuubi/assets/51512358/2e6395c2-6496-4516-9cf6-90abc921de7f)
   ![image](https://github.com/apache/kyuubi/assets/51512358/72975513-48d2-4f41-8a95-95cde0302c5b)
   ![image](https://github.com/apache/kyuubi/assets/51512358/487951fd-de45-4e1c-861a-94e0e5564e37)

#### Related Unit Tests

There are no unit tests.

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6213 from davidyuan1223/ha_zk_support_more_strategy.

Closes #6034

961d3e9 [Bowen Liang] rename ServerStrategyFactory to ServerSelectStrategyFactory
353f940 [Bowen Liang] repeat
8822ad4 [Bowen Liang] repeat
6193394 [Bowen Liang] nit
e94f9e9 [Bowen Liang] nit
40f427a [Bowen Liang] rename StrategyFactory to StrategyFactoryServerStrategyFactory
7668f99 [Bowen Liang] test name
e194ea6 [Bowen Liang] remove ZooKeeperHiveClientException from method signature of chooseServer
265965e [Bowen Liang] polling
b39c567 [Bowen Liang] style
1ab79b4 [Bowen Liang] strategyName
8f8ca28 [Bowen Liang] nit
228bf10 [Bowen Liang] rename parameter zooKeeperStrategy to serverSelectStrategy
125c823 [Bowen Liang] rename ChooseServerStrategy to ServerSelectStrategy
b4aeb3d [Bowen Liang] repeat testing on pollingChooseStrategy
4655480 [davidyuan] update
09a84f1 [david yuan] remove the distirbuted lock
93f4a26 [davidyuan] remove reset
7b0c1b8 [davidyuan] fix var not valid and counter getAndIncrement
c95382a [davidyuan] fix var not valid and counter getAndIncrement
9ed2cac [david yuan] remove test comment
8eddd76 [davidyuan] Add Strategy Unit Test Case and fix the polling strategy counter begin with 0
73952f8 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
97b9597 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
ee5a9ad [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
6a04453 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
1892f14 [davidyuan] add common method to get session level config
7c0c605 [yuanfuyuan] fix_4186

Lead-authored-by: Bowen Liang <[email protected]>
Co-authored-by: davidyuan <[email protected]>
Co-authored-by: davidyuan <[email protected]>
Co-authored-by: david yuan <[email protected]>
Co-authored-by: yuanfuyuan <[email protected]>
Signed-off-by: Bowen Liang <[email protected]>
(cherry picked from commit 8862767)
Signed-off-by: Bowen Liang <[email protected]>
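The poll strategy described in the commit message above (counter % serverHost.size) can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-process `AtomicInteger` stands in for the shared ${namespace}-counter node in ZooKeeper, so it does not coordinate across separate beeline processes the way the real counter does, and the class name `PollingSketch` is hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, in-memory stand-in for the ZooKeeper-backed poll strategy.
final class PollingSketch {
    // Assumption: replaces the shared counter stored in the zk node.
    private final AtomicInteger counter = new AtomicInteger(0);

    String chooseServer(List<String> serverHosts) {
        if (serverHosts.isEmpty()) {
            throw new IllegalArgumentException("No server hosts available");
        }
        // Read the current value, then increment, so the first pick is index 0.
        int ticket = counter.getAndIncrement();
        // floorMod keeps the index non-negative even if the counter overflows.
        return serverHosts.get(Math.floorMod(ticket, serverHosts.size()));
    }
}
```

With the two-server cluster from the test plan (221 and 233), successive calls alternate between the two hosts, which is the even distribution that the random strategy cannot guarantee.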
What would you like to be improved?
The current Kyuubi HA mode, which retrieves servers from ZooKeeper, only supports the random strategy. This may lead to an overload on certain nodes. Therefore, in order to address the overload issue, it is necessary to support more strategies.
How should we improve?
Update ZooKeeperClientHelper in Kyuubi Hive JDBC to support more strategies. Currently, there are two strategies:
Are you willing to submit PR?