Fix the issue that ShardingSphere cannot connect to HiveServer2 using remote Hive Metastore Server
linghengqian committed Nov 29, 2024
1 parent 4e5b0ce commit e7b5841
Showing 18 changed files with 480 additions and 43 deletions.
1 change: 1 addition & 0 deletions RELEASE-NOTES.md
@@ -46,6 +46,7 @@
1. Encrypt: Fix merge exception without encrypt rule in database - [#33708](https://github.com/apache/shardingsphere/pull/33708)
1. SQL Parser: Fix mysql parse zone unreserved keyword error - [#33720](https://github.com/apache/shardingsphere/pull/33720)
1. Proxy: Fix BatchUpdateException when execute INSERT INTO ON DUPLICATE KEY UPDATE in proxy adapter - [#33796](https://github.com/apache/shardingsphere/pull/33796)
1. Infra: Fix the issue that ShardingSphere cannot connect to HiveServer2 using remote Hive Metastore Server - [#33837](https://github.com/apache/shardingsphere/pull/33837)

### Change Logs

@@ -40,6 +40,17 @@ ShardingSphere's support for the HiveServer2 JDBC Driver is located in an optional module.
        <artifactId>hive-service</artifactId>
        <version>4.0.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -81,6 +92,17 @@ ShardingSphere's support for the HiveServer2 JDBC Driver is located in an optional module.
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -427,8 +449,31 @@ ShardingSphere only runs integration tests against HiveServer2 `4.0.1`.
### Hadoop Limitations

Users can only use Hadoop `3.3.6` as the underlying Hadoop dependency of HiveServer2 JDBC Driver `4.0.1`.
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`,
see https://github.com/apache/hive/pull/5500 .
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`. See https://github.com/apache/hive/pull/5500 .

Neither the HiveServer2 JDBC Driver `org.apache.hive:hive-jdbc:4.0.1` nor `org.apache.hive:hive-jdbc:4.0.1` with `classifier` set to `standalone`
actually has an additional dependency on `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

However, `org.apache.shardingsphere.infra.database.hive.metadata.data.loader.HiveMetaDataLoader` in
`org.apache.shardingsphere:shardingsphere-infra-database-hive` uses `org.apache.hadoop.hive.conf.HiveConf`,
which in turn uses the `org.apache.hadoop.mapred.JobConf` class from `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

ShardingSphere only needs the `org.apache.hadoop.mapred.JobConf` class,
so it is reasonable to exclude all additional dependencies of `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>3.3.6</version>
    <exclusions>
        <exclusion>
            <groupId>*</groupId>
            <artifactId>*</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

### SQL Limitations

@@ -41,6 +41,17 @@ The possible Maven dependencies are as follows.
        <artifactId>hive-service</artifactId>
        <version>4.0.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -83,6 +94,17 @@ The following is an example of a possible configuration,
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>3.3.6</version>
        <exclusions>
            <exclusion>
                <groupId>*</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

@@ -433,8 +455,31 @@ Reference https://issues.apache.org/jira/browse/HIVE-28418.
### Hadoop Limitations

Users can only use Hadoop `3.3.6` as the underlying Hadoop dependency of HiveServer2 JDBC Driver `4.0.1`.
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`,
Reference https://github.com/apache/hive/pull/5500.
HiveServer2 JDBC Driver `4.0.1` does not support Hadoop `3.4.1`. Reference https://github.com/apache/hive/pull/5500 .

Neither the HiveServer2 JDBC Driver `org.apache.hive:hive-jdbc:4.0.1` nor `org.apache.hive:hive-jdbc:4.0.1` with `classifier` set to `standalone`
actually has an additional dependency on `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

However, `org.apache.shardingsphere.infra.database.hive.metadata.data.loader.HiveMetaDataLoader` in
`org.apache.shardingsphere:shardingsphere-infra-database-hive` uses `org.apache.hadoop.hive.conf.HiveConf`,
which in turn uses the `org.apache.hadoop.mapred.JobConf` class from `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

ShardingSphere only needs the `org.apache.hadoop.mapred.JobConf` class,
so it is reasonable to exclude all additional dependencies of `org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6`.

```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>3.3.6</version>
    <exclusions>
        <exclusion>
            <groupId>*</groupId>
            <artifactId>*</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```
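
The dependency reasoning above comes down to one class, `org.apache.hadoop.mapred.JobConf`, being loadable at runtime. The following is a minimal sketch of how a project might check that before connecting. The class `JobConfClasspathCheck` and its helper `isClassPresent` are hypothetical names used for illustration and are not part of ShardingSphere or Hadoop.

```java
public final class JobConfClasspathCheck {

    // Class named in the documentation above; it ships in hadoop-mapreduce-client-core.
    private static final String JOB_CONF_CLASS = "org.apache.hadoop.mapred.JobConf";

    private JobConfClasspathCheck() {
    }

    /**
     * Hypothetical helper: returns true if the named class can be loaded.
     * The class is located but not initialized, so no static initializers run.
     */
    public static boolean isClassPresent(final String className) {
        try {
            Class.forName(className, false, JobConfClasspathCheck.class.getClassLoader());
            return true;
        } catch (final ClassNotFoundException | LinkageError ex) {
            // LinkageError covers the case where the class exists but a supertype is missing.
            return false;
        }
    }

    public static void main(final String[] args) {
        if (isClassPresent(JOB_CONF_CLASS)) {
            System.out.println(JOB_CONF_CLASS + " is on the classpath");
        } else {
            System.out.println(JOB_CONF_CLASS + " is missing; consider adding hadoop-mapreduce-client-core");
        }
    }
}
```

Running `main` in a project that declares the dependency shown above should report that the class is present, provided the classes `JobConf` itself extends are supplied by the other Hive and Hadoop artifacts already on the classpath.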

### SQL Limitations

@@ -0,0 +1,8 @@
[
{
  "condition":{"typeReachable":"org.apache.hadoop.security.UserGroupInformation"},
  "name":"org.apache.hadoop.security.UserGroupInformation$UgiMetrics",
  "allDeclaredFields": true,
  "allDeclaredMethods": true
}
]
@@ -3,12 +3,6 @@
"includes":[{
"condition":{"typeReachable":"org.apache.hadoop.conf.Configuration"},
"pattern":"\\Qhadoop-site.xml\\E"
}, {
"condition":{"typeReachable":"org.apache.hadoop.conf.Configuration"},
"pattern":"\\Qcore-default.xml\\E"
}, {
"condition":{"typeReachable":"org.apache.hadoop.conf.Configuration"},
"pattern":"\\Qcore-site.xml\\E"
}]},
"bundles":[]
}
@@ -0,0 +1,11 @@
[
{
  "condition":{"typeReachable":"org.apache.hadoop.hive.metastore.HiveMetaStoreClient"},
  "name":"org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl",
  "methods":[{"name":"<init>","parameterTypes":["org.apache.hadoop.conf.Configuration"] }]
},
{
  "condition":{"typeReachable":"org.apache.hadoop.hive.metastore.conf.MetastoreConf"},
  "name":"org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl"
}
]
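
The first entry above registers a constructor (`<init>`) that takes a single `org.apache.hadoop.conf.Configuration` argument, which suggests the hook is created reflectively by class name. The sketch below illustrates that kind of lookup using plain JDK reflection. `FakeConfiguration`, `FakeFilterHook` and `newInstance` are hypothetical stand-ins for illustration only; how Hive actually performs the instantiation is an assumption here, not something this commit shows.

```java
import java.lang.reflect.Constructor;

public final class ReflectiveHookDemo {

    private ReflectiveHookDemo() {
    }

    /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
    public static final class FakeConfiguration {
    }

    /** Hypothetical stand-in for DefaultMetaStoreFilterHookImpl. */
    public static final class FakeFilterHook {

        private final FakeConfiguration conf;

        public FakeFilterHook(final FakeConfiguration conf) {
            this.conf = conf;
        }

        public FakeConfiguration getConf() {
            return conf;
        }
    }

    /**
     * Looks up a class by name and invokes its public single-argument constructor.
     * In a GraalVM native image, this kind of lookup only works for classes and
     * constructors registered in reflection metadata.
     */
    public static Object newInstance(final String className, final Class<?> paramType, final Object arg) {
        try {
            Class<?> type = Class.forName(className);
            Constructor<?> constructor = type.getConstructor(paramType);
            return constructor.newInstance(arg);
        } catch (final ReflectiveOperationException ex) {
            throw new IllegalStateException("Cannot instantiate " + className, ex);
        }
    }

    public static void main(final String[] args) {
        FakeConfiguration conf = new FakeConfiguration();
        Object hook = newInstance(FakeFilterHook.class.getName(), FakeConfiguration.class, conf);
        System.out.println("Created " + hook.getClass().getName());
    }
}
```

On a regular JVM this lookup needs no configuration. The metadata file matters only when the application is compiled ahead of time, where unregistered reflective targets are not available at runtime.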
@@ -3,6 +3,10 @@
"condition":{"typeReachable":"org.apache.shardingsphere.proxy.initializer.BootstrapInitializer"},
"interfaces":["java.sql.Connection"]
},
{
"condition":{"typeReachable":"org.apache.shardingsphere.infra.database.hive.metadata.data.loader.HiveMetaDataLoader"},
"interfaces":["org.apache.hadoop.metrics2.MetricsSystem$Callback"]
},
{
"condition":{"typeReachable":"org.apache.shardingsphere.driver.jdbc.core.datasource.ShardingSphereDataSource"},
"interfaces":["org.apache.hive.service.rpc.thrift.TCLIService$Iface"]