[TASK][EASY] Kyuubi Server HA&ZK get server from serverHosts support more strategy #6034

Closed
3 of 4 tasks
davidyuan1223 opened this issue Jan 31, 2024 · 7 comments

Comments

@davidyuan1223
Contributor

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

The current Kyuubi HA mode, which retrieves servers from ZooKeeper, only supports the random strategy. This may lead to an overload on certain nodes. Therefore, in order to address the overload issue, it is necessary to support more strategies.

How should we improve?

Update ZooKeeperClientHelper in Kyuubi Hive JDBC to support more server selection strategies. Two strategies are planned for now:

  1. Random
  2. Polling

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
  • No. I cannot submit a PR at this time.
@davidyuan1223 davidyuan1223 changed the title [Improvement] Kyuubi Server HA&ZK get server from serverHost support more strategy [Improvement] Kyuubi Server HA&ZK get server from serverHosts support more strategy Jan 31, 2024
@davidyuan1223
Contributor Author

@wForget @pan3793

@pan3793
Member

pan3793 commented Jan 31, 2024

SGTM, and it would be better to extract an interface so that users can implement their own custom strategy.

@davidyuan1223
Contributor Author

SGTM, and it would be better to extract an interface so that users can implement their own custom strategy.

Hello, I have a question. kyuubi-hive-jdbc is only a driver and cannot read configuration from KyuubiConf, so even if we add a strategy config entry to the HA module, kyuubi-hive-jdbc would not be able to read it. The only way I can think of is to pass the strategy in the connection params, but then I am not sure how users would plug in a custom strategy. What do you think?

@sunnyzhuzhu

@davidyuan1223 Hello, may I ask which strategies you will implement?

@davidyuan1223
Contributor Author

@davidyuan1223 Hello, may I ask which strategies you will implement?

Sorry, I forgot to respond. Currently I have implemented poll and random. Because hive-jdbc is a standalone module that cannot use the kyuubi-ha module, the only way to support more strategies is to pass the strategy in the connection params, e.g. '&zkStrategy=poll/random'. If you have other useful strategies in mind, please give me some advice.

@davidyuan1223
Contributor Author

@davidyuan1223 Hello, may I ask which strategies you will implement?

Hello, this is the demo command:
bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=poll?spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n hadoop --verbose=true --showNestedErrs=true
Currently the poll strategy can choose the right server, but there are still some bugs, so I have not submitted a PR yet.

@davidyuan1223
Contributor Author

@davidyuan1223 Hello, may I ask which strategies you will implement?

Hello, this is the demo command: bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=poll?spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n hadoop --verbose=true --showNestedErrs=true Currently the poll strategy can choose the right server, but there are still some bugs, so I have not submitted a PR yet.

My plan is that users can implement an interface named org.apache.kyuubi.jdbc.hive.strategy.ChooseServerStrategy and then set zooKeeperStrategy=xxx.xxx.xxx (the fully qualified class name), so that they can plug in their own implementation. Of course, if you have a more effective plan, please let me know and I will try to implement it.
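For illustration, resolving a strategy from the connection parameter value might look roughly like the sketch below. This is only a sketch under assumptions: the interface name is taken from the comment above, but the `chooseServer(List<String>)` signature and the `StrategyLoader`, `RandomStrategy`, and `FirstHostStrategy` classes are hypothetical and are not taken from the Kyuubi code base. A built-in `poll` strategy would be handled the same way as `random` and is omitted here.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical interface: the name follows the comment above, the signature is assumed.
interface ChooseServerStrategy {
  String chooseServer(List<String> serverHosts);
}

// Built-in default: pick any host with equal probability.
class RandomStrategy implements ChooseServerStrategy {
  @Override
  public String chooseServer(List<String> serverHosts) {
    int index = ThreadLocalRandom.current().nextInt(serverHosts.size());
    return serverHosts.get(index);
  }
}

// Example of a user-supplied strategy: always return the first host.
class FirstHostStrategy implements ChooseServerStrategy {
  @Override
  public String chooseServer(List<String> serverHosts) {
    return serverHosts.get(0);
  }
}

// Hypothetical helper that maps the parameter value to a strategy instance.
final class StrategyLoader {
  private StrategyLoader() {}

  static ChooseServerStrategy load(String name) {
    // A missing value or the built-in name falls back to the random strategy.
    if (name == null || name.isEmpty() || "random".equalsIgnoreCase(name)) {
      return new RandomStrategy();
    }
    try {
      // Anything else is treated as a fully qualified class name on the classpath.
      return Class.forName(name)
          .asSubclass(ChooseServerStrategy.class)
          .getDeclaredConstructor()
          .newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("Cannot load strategy: " + name, e);
    }
  }
}
```

With this shape, a user-defined class only needs a no-argument constructor and must be on the client classpath; a value that does not name such a class is rejected with an `IllegalArgumentException`.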

@cxzl25 cxzl25 changed the title [Improvement] Kyuubi Server HA&ZK get server from serverHosts support more strategy [TASK][EASY][Improvement] Kyuubi Server HA&ZK get server from serverHosts support more strategy Apr 15, 2024
@cxzl25 cxzl25 changed the title [TASK][EASY][Improvement] Kyuubi Server HA&ZK get server from serverHosts support more strategy [TASK][EASY] Kyuubi Server HA&ZK get server from serverHosts support more strategy Apr 18, 2024
bowenliang123 added a commit that referenced this issue Oct 23, 2024
…t more strategy

# 🔍 Description
## Issue References 🔗

This pull request fixes #6034

## Describe Your Solution 🔧
Currently, when beeline connects to a Kyuubi server in HA mode, only the random strategy is supported, which can lead to a high load on some machines. This PR adds support for choosing the strategy.
[description]
First, note that beeline connects to the Kyuubi server through kyuubi-hive-jdbc, which is isolated from the Kyuubi cluster, so the existing code can only pick a serverHost at random from the zk node /${namespace}. Because kyuubi-hive-jdbc is a stateless module that runs once per connection, it cannot keep any state about which serverHost was previously picked from the zk node.
[Solution]
This PR adds an interface named ChooseServerStrategy for choosing the serverHost. Two strategies are implemented, and users can supply their own:
1. poll: creates a zk node named ${namespace}-counter. Each time a beeline client connects to the Kyuubi server, the counter is incremented by 1, and its value modulo the number of serverHosts (counter % serverHosts.size) selects the serverHost, so the servers are chosen in order.
2. random: picks a serverHost at random from serverHosts.
3. User-defined class: implement ChooseServerStrategy and put the jar into beeline-jars; beeline then uses your strategy to choose the serverHost.
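
As a rough illustration of the poll arithmetic described above, the sketch below uses an in-process `AtomicInteger` as a stand-in for the shared `${namespace}-counter` node. That substitution is an assumption made to keep the example self-contained: in the PR the counter lives in ZooKeeper so that separate beeline processes share it. The class name `PollingSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class PollingSketch {
  // Stand-in for the ZooKeeper-backed counter; starts at 0.
  private final AtomicInteger counter = new AtomicInteger(0);

  String chooseServer(List<String> serverHosts) {
    // Read the current value, then increment it for the next client.
    int ticket = counter.getAndIncrement();
    // floorMod keeps the index non-negative even if the counter overflows.
    return serverHosts.get(Math.floorMod(ticket, serverHosts.size()));
  }

  public static void main(String[] args) {
    List<String> hosts = Arrays.asList("221", "233");
    PollingSketch poll = new PollingSketch();
    // Four consecutive connections alternate between the two servers:
    // prints 221, 233, 221, 233 (one per line).
    for (int i = 0; i < 4; i++) {
      System.out.println(poll.chooseServer(hosts));
    }
  }
}
```

Note that if the list of live servers changes between connections, the same counter value maps to a different host, so the rotation is only strictly in order while the server list is stable.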

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪
Tested the strategies in my test cluster.
#### Behavior Without This Pull Request ⚰️
![image](https://github.com/apache/kyuubi/assets/51512358/d65b14c1-1b02-4436-8843-27b2e55d27ce)
![image](https://github.com/apache/kyuubi/assets/51512358/0524a30c-c2c3-464e-8453-84f3f1a74fb1)
![image](https://github.com/apache/kyuubi/assets/51512358/12feb93e-b743-4a43-821d-454f3c1af336)

#### Behavior With This Pull Request 🎉

[Use Case]
1. poll: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=poll?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true`
2. random: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=random?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true` or `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true`
3. YourStrategy: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=xxx.xxx.xxx.XxxChooseServerStrategy?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true`
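
To make the URL layout in these commands easier to follow, here is a minimal sketch of how the session variables between the first `;` and the `?` could be read, and how a missing `zooKeeperStrategy` falls back to `random` as in use case 2. The `JdbcUrlSessionVars` class is hypothetical and is not the driver's actual URL parser. The commit list for this change also records that the parameter was later renamed from `zooKeeperStrategy` to `serverSelectStrategy`; the sketch keeps the name used in the commands above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class JdbcUrlSessionVars {
  private JdbcUrlSessionVars() {}

  // Returns the key=value pairs that follow the host/database part and precede '?' or '#'.
  static Map<String, String> parse(String url) {
    int end = url.length();
    int question = url.indexOf('?');
    int hash = url.indexOf('#');
    if (question >= 0) end = Math.min(end, question);
    if (hash >= 0) end = Math.min(end, hash);

    // parts[0] is "jdbc:hive2://host:port/db"; the rest are session variables.
    String[] parts = url.substring(0, end).split(";");
    Map<String, String> vars = new LinkedHashMap<>();
    for (int i = 1; i < parts.length; i++) {
      int eq = parts[i].indexOf('=');
      if (eq > 0) {
        vars.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
      }
    }
    return vars;
  }

  // The strategy defaults to "random" when the parameter is absent.
  static String strategyName(String url) {
    return parse(url).getOrDefault("zooKeeperStrategy", "random");
  }
}
```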

[Result: the cluster has two servers (221, 233)]
1. poll:
1.1. zkNode: counterValue
![image](https://github.com/apache/kyuubi/assets/51512358/5cbd15f9-bba4-4b23-bbfb-d61ed46f931f)

1.2. result:
![image](https://github.com/apache/kyuubi/assets/51512358/5a867167-8b06-49ed-aa44-b70726f3ae97)
![image](https://github.com/apache/kyuubi/assets/51512358/404b05e8-c828-458c-a9c4-97a323bf6ce7)
![image](https://github.com/apache/kyuubi/assets/51512358/3182e92b-6976-4931-a899-5e0d89cd2ac2)
![image](https://github.com/apache/kyuubi/assets/51512358/a55450ff-49cf-4b4a-9b90-91dd02982aa5)

2. random:
![image](https://github.com/apache/kyuubi/assets/51512358/d65b14c1-1b02-4436-8843-27b2e55d27ce)
![image](https://github.com/apache/kyuubi/assets/51512358/0524a30c-c2c3-464e-8453-84f3f1a74fb1)
![image](https://github.com/apache/kyuubi/assets/51512358/12feb93e-b743-4a43-821d-454f3c1af336)

3. YourStrategy (the test implementation always returns the first serverHost):
![image](https://github.com/apache/kyuubi/assets/51512358/2e6395c2-6496-4516-9cf6-90abc921de7f)
![image](https://github.com/apache/kyuubi/assets/51512358/72975513-48d2-4f41-8a95-95cde0302c5b)
![image](https://github.com/apache/kyuubi/assets/51512358/487951fd-de45-4e1c-861a-94e0e5564e37)

#### Related Unit Tests

There are no unit tests.
---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6213 from davidyuan1223/ha_zk_support_more_strategy.

Closes #6034

961d3e9 [Bowen Liang] rename ServerStrategyFactory to ServerSelectStrategyFactory
353f940 [Bowen Liang] repeat
8822ad4 [Bowen Liang] repeat
6193394 [Bowen Liang] nit
e94f9e9 [Bowen Liang] nit
40f427a [Bowen Liang] rename StrategyFactory to StrategyFactoryServerStrategyFactory
7668f99 [Bowen Liang] test name
e194ea6 [Bowen Liang] remove ZooKeeperHiveClientException from method signature of chooseServer
265965e [Bowen Liang] polling
b39c567 [Bowen Liang] style
1ab79b4 [Bowen Liang] strategyName
8f8ca28 [Bowen Liang] nit
228bf10 [Bowen Liang] rename parameter zooKeeperStrategy to serverSelectStrategy
125c823 [Bowen Liang] rename ChooseServerStrategy to ServerSelectStrategy
b4aeb3d [Bowen Liang] repeat testing on pollingChooseStrategy
4655480 [davidyuan] update
09a84f1 [david yuan] remove the distirbuted lock
93f4a26 [davidyuan] remove reset
7b0c1b8 [davidyuan] fix var not valid and counter getAndIncrement
c95382a [davidyuan] fix var not valid and counter getAndIncrement
9ed2cac [david yuan] remove test comment
8eddd76 [davidyuan] Add Strategy Unit Test Case and fix the polling strategy counter begin with 0
73952f8 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
97b9597 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
ee5a9ad [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
6a04453 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy
1892f14 [davidyuan] add common method to get session level config
7c0c605 [yuanfuyuan] fix_4186

Lead-authored-by: Bowen Liang <[email protected]>
Co-authored-by: davidyuan <[email protected]>
Co-authored-by: davidyuan <[email protected]>
Co-authored-by: david yuan <[email protected]>
Co-authored-by: yuanfuyuan <[email protected]>
Signed-off-by: Bowen Liang <[email protected]>
(cherry picked from commit 8862767)
Signed-off-by: Bowen Liang <[email protected]>