[Umbrella] Fork beeline module and cut out Hive deps #6146

Closed
3 of 4 tasks
pan3793 opened this issue Mar 8, 2024 · 0 comments
pan3793 commented Mar 8, 2024

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the proposal

Background

Currently, we extend org.apache.hive:hive-beeline:3.1.3 to implement Kyuubi-specific features, but this approach has drawbacks. Hive suffers from dependency hell (and some of its dependencies involve CVEs). We have previously put a lot of effort into making the Hive deps manageable, yet corner cases can still trigger class linkage issues, for example:

root@hadoop-master1:/opt/kyuubi# bin/beeline -u 'jdbc:hive2://hadoop-master1.orb.local:10009/'
Warn: Not find kyuubi environment file /etc/kyuubi/conf/kyuubi-env.sh, using default ones...
Connecting to jdbc:hive2://hadoop-master1.orb.local:10009/
Connected to: Spark SQL (version 3.4.2)
Driver: Kyuubi Project Hive JDBC Client (version 1.8.0)
Beeline version 1.8.0 by Apache Kyuubi
0: jdbc:hive2://hadoop-master1.orb.local:1000>
root@hadoop-master1:/opt/kyuubi# bin/beeline --version
Warn: Not find kyuubi environment file /etc/kyuubi/conf/kyuubi-env.sh, using default ones...
java.lang.NoClassDefFoundError: org/apache/curator/RetryPolicy

Another example is #5918
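
A quick way to confirm this kind of failure is to probe the classpath for the class named in the error. The snippet below is only a minimal sketch, not part of Kyuubi or Hive: the class `ClasspathProbe` and its methods are hypothetical names, and it assumes it is run with the same classpath that bin/beeline uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Kyuubi or Hive.
public class ClasspathProbe {

  /** Returns true if the named class can be located and linked by this class loader. */
  static boolean isLoadable(String className) {
    try {
      // initialize=false: only locate the class, do not run its static initializers
      Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // NoClassDefFoundError is a LinkageError; both mean the class is unusable here
      return false;
    }
  }

  /** Returns the subset of the given class names that cannot be loaded. */
  static List<String> missing(String... classNames) {
    List<String> result = new ArrayList<>();
    for (String name : classNames) {
      if (!isLoadable(name)) {
        result.add(name);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Defaults to the class from the error above; pass other names as arguments.
    String[] names = args.length > 0 ? args : new String[] {"org.apache.curator.RetryPolicy"};
    for (String name : missing(names)) {
      System.out.println("missing from classpath: " + name);
    }
  }
}
```

If the probe reports a class as missing, some code path still references a Hive transitive dependency that is not shipped, which is the situation the proposal below aims to remove.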

Proposal

Fork the beeline module from Apache Hive 3.1.3, drop the dependency on org.apache.hive:hive-beeline:3.1.3, and then cut the Hive transitive deps step by step, as Spark did in https://github.com/apache/spark/tree/master/sql/hive-thriftserver/src/main/java/org/apache/hive/service

To be clear, we should do our best to minimize changes to the forked code, so that patches from upstream remain easy to backport in the future.

BTW, we have already benefited from forking the code in the Kyuubi Hive JDBC module.

Task list

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
  • No. I cannot submit a PR at this time.
pan3793 added the kind:umbrella and priority:major labels Mar 8, 2024
pan3793 added a commit that referenced this issue Mar 11, 2024
# 🔍 Description
## Issue References 🔗

This is the first step of #6146. To keep a clear commit history, this PR simply copies the `hive-beeline` module from Apache Hive 3.1.3, with minimal changes to pass the tests and manual checks of basic functionality. Follow-up PRs will remove the other Hive deps gradually.

## Describe Your Solution 🔧

- Copy source code and test cases from Apache Hive 3.1.3
- Drop `org.apache.hive:hive-beeline:3.1.3`
- Backport HIVE-21584 to support JDK 9+
- Drop `HiveCli`, `HiveSchemaTool` and `BeelineInPlaceUpdateStream`, and the corresponding test cases
- Temporarily ignore (will fix later) `TestClientCommandHookFactory#testConnectHook` because of error `NoClassDefFound org/apache/curator/RetryPolicy`
- Tune the test code to pass the unit tests

## Types of changes :bookmark:

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Minimal changes to pass the unit tests.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6109 from pan3793/fork-beeline.

Closes #6109

885f9fe [Cheng Pan] NOTICE
a2efa1c [Cheng Pan] fix
5bb1cc9 [Cheng Pan] Copy from Apache Hive 3.1.3

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
zhouyifan279 pushed a commit to zhouyifan279/kyuubi that referenced this issue Mar 11, 2024
pan3793 added a commit that referenced this issue Mar 11, 2024
# 🔍 Description
## Issue References 🔗

This is the next step of #6146, cutting out most Hive deps (except `hive-common`) and recovering the skipped test via minor code tuning.

## Describe Your Solution 🔧

- Drop `hive-jdbc`, `hive-service`, `hive-service-rpc` deps in the beeline module.
- Migrate from `commons-lang` to `commons-lang3` in the beeline module.
- Recover the skipped test `TestClientCommandHookFactory#connectHook`

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GA, and manually test to ensure the following error is gone.

Before
```
root@hadoop-master1:/opt/kyuubi# bin/beeline --version
Warn: Not find kyuubi environment file /etc/kyuubi/conf/kyuubi-env.sh, using default ones...
java.lang.NoClassDefFoundError: org/apache/curator/RetryPolicy
```

After
```
root@hadoop-master1:/opt/kyuubi# bin/beeline --version
Connecting to jdbc:hive2://hadoop-master1.orb.local:10000/default;password=hive;user=hive
Connected to: Apache Hive (version 2.3.9)
Driver: Kyuubi Project Hive JDBC Client (version 1.9.0-SNAPSHOT)
Beeline version 1.9.0-SNAPSHOT by Apache Kyuubi
0: jdbc:hive2://hadoop-master1.orb.local:1000>
```

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6153 from pan3793/beeline-2.

Closes #6153

8cd52e5 [Cheng Pan] notice
d03c729 [Cheng Pan] minor
5d16bf4 [Cheng Pan] beeline test pass

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
pan3793 added this to the v1.9.0 milestone Mar 12, 2024
pan3793 closed this as completed Mar 12, 2024
pan3793 self-assigned this Mar 12, 2024
zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this issue Mar 21, 2024
zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this issue Mar 21, 2024