difin
Contributor

@difin difin commented Sep 29, 2025

What changes were proposed in this pull request?

Implemented a new qtest driver, TestIcebergRESTCatalogGravitinoLlapLocalCliDriver, for testing HiveRESTCatalogClient with the Gravitino Iceberg REST server, plus a q-test for it.

Why are the changes needed?

To validate support for external RestCatalogs like Gravitino.

Does this PR introduce any user-facing change?

No

How was this patch tested?

New q-test.

@zhangbutao
Contributor

Hi @difin, can we use the standalone HMS instead of Gravitino as the Iceberg REST server for E2E testing?

@deniskuzZ
Member

deniskuzZ commented Sep 30, 2025

@zhangbutao, we already have a test for HMS RestCatalog (iceberg_rest_catalog_hms.q).
This PR aims to validate support for external RestCatalogs like Gravitino (iceberg_rest_catalog_gravitino.q).

import java.util.List;

@RunWith(Parameterized.class)
public class TestIcebergRESTCatalogGravitinoLlapLocalCliDriver {
Member

@deniskuzZ deniskuzZ Sep 30, 2025

Is it possible to have a single TestIcebergRESTCatalogLlapLocalCliDriver with multiple RestCatalog providers, configured via an additional parameter, similar to the backend db arg?

Contributor Author

I will try to do it with one CLI driver and an extra parameter, as you suggested. Currently there is a problem with syncing the host warehouseDir with the container warehouseDir; if I manage to solve it, I'll try to do it in one driver.

Contributor Author

@difin difin Sep 30, 2025

Hi @deniskuzZ, the Gravitino test now works both in CI and locally, but I don't think unifying the drivers for the HMS REST server and Gravitino is the best approach, for the following reasons:

  1. The two drivers are very different, and their implementations have nothing in common.
  2. A unified driver would be complicated because it combines two different approaches, which hurts readability.
  3. Separate drivers double as clean, ready-to-use, working integration examples.
  4. The .q files are slightly different: the Gravitino .q file has a ! sleep command as a workaround that gives the manual sync process time to complete after the INSERT and before the table is read.
  5. The .q.out files are slightly different: the HMS REST Catalog has the default database, Gravitino doesn't.

What do you think?

@difin difin force-pushed the rest_client_gravitino branch from c949a61 to f3abd30 Compare September 30, 2025 15:11
@difin difin force-pushed the rest_client_gravitino branch from f3abd30 to 0497faa Compare September 30, 2025 20:14
@difin difin force-pushed the rest_client_gravitino branch from 0497faa to b604394 Compare September 30, 2025 20:51
@difin difin changed the title HIVE-29233: Iceberg: HiveRESTCatalogClient test with Gravitino Iceberg Rest Server HIVE-29233: Iceberg: Validate HiveRESTCatalogClient test external RESTCatalogs like Gravitino Sep 30, 2025
@difin difin changed the title HIVE-29233: Iceberg: Validate HiveRESTCatalogClient test external RESTCatalogs like Gravitino HIVE-29233: Iceberg: Validate HiveRESTCatalogClient with external RESTCatalogs like Gravitino Sep 30, 2025
@difin difin force-pushed the rest_client_gravitino branch from b604394 to 8656b64 Compare September 30, 2025 20:58
@difin difin force-pushed the rest_client_gravitino branch from 8656b64 to 54bb487 Compare October 1, 2025 03:56
@okumin
Contributor

okumin commented Oct 1, 2025

I checked out this branch and ran mvn test -Pitests -pl itests/qtest-iceberg -Dtest=TestIcebergRESTCatalogGravitinoLlapLocalCliDriver.java -Dtest.output.overwrite=false -Dqfile_regex=iceberg_rest_catalog_gravitino. I received a 500, probably from Gravitino. I'm waiting for CI to find out whether this is an issue with my local environment.

See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs.
 MetaException(message:Server error: null: {
"servlet":"org.glassfish.jersey.servlet.ServletContainer-324c64cd",
"message":"org.glassfish.jersey.server.ContainerException: java.lang.NoSuchMethodError: 'void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)'",
"url":"/iceberg/v1/config",
"status":"500"
})

@difin
Contributor Author

difin commented Oct 1, 2025

@okumin, the Gravitino test passed on CI: https://ci.hive.apache.org/job/hive-precommit/job/PR-6108/5/testReport/org.apache.hadoop.hive.cli/TestIcebergRESTCatalogGravitinoLlapLocalCliDriver/Testing___split_06___PostProcess___testCliDriver_iceberg_rest_catalog_gravitino_/

It passes locally for me too:

cd itests; mvn test -Piceberg -Dtest=TestIcebergRESTCatalogGravitinoLlapLocalCliDriver -Dqfile=iceberg_rest_catalog_gravitino.q

@okumin
Contributor

okumin commented Oct 2, 2025

My local machine has no luck.

docker run --rm -p 9001:9001 apache/gravitino-iceberg-rest:1.0.0

and then

curl 'http://localhost:9001/iceberg/v1/config' 
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 500 org.glassfish.jersey.server.ContainerException: java.lang.NoSuchMethodError: &apos;void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)&apos;</title>
</head>
<body><h2>HTTP ERROR 500 org.glassfish.jersey.server.ContainerException: java.lang.NoSuchMethodError: &apos;void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)&apos;</h2>
<table>
<tr><th>URI:</th><td>/iceberg/v1/config</td></tr>
<tr><th>STATUS:</th><td>500</td></tr>
<tr><th>MESSAGE:</th><td>org.glassfish.jersey.server.ContainerException: java.lang.NoSuchMethodError: &apos;void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)&apos;</td></tr>
<tr><th>SERVLET:</th><td>org.glassfish.jersey.servlet.ServletContainer-40f33492</td></tr>
</table>
<hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 9.4.51.v20230217</a><hr/>

</body>
</html>

It is likely an issue with my local setup or with Gravitino's setup. I'm not sure what differs from the CI environment.

@@ -0,0 +1,86 @@
-- SORT_QUERY_RESULTS
Contributor

I verified the diff is likely expected.

% diff iceberg/iceberg-handler/src/test/queries/positive/iceberg_rest_catalog_hms.q iceberg/iceberg-handler/src/test/queries/positive/iceberg_rest_catalog_gravitino.q
68a69,73
> --! In CI, Testcontainers' .withFileSystemBind() is not able to bind the same host path to the same container path,
> --! so as a workaround, the .metadata.json files from container are manually synced in a daemon process,
> --! since the sync can take some time, need to wait for it to happen after the insert operation.
> ! sleep 20;
>

Contributor Author

From #6108 (comment):

  • .q files are slightly different: the Gravitino .q file has a ! sleep command as a workaround that gives the manual sync process time to complete after the INSERT and before the table is read.
  • .q.out files are slightly different: HMS REST Catalog has the default database, Gravitino doesn't.
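
For readers unfamiliar with the workaround, the manual sync could look roughly like the sketch below. It is only an illustration under assumptions: it copies between two local directories with plain java.nio, whereas the actual driver has to get the files out of the container. The class name MetadataSyncSketch, the methods syncOnce and startDaemon, and the polling period are all hypothetical and not taken from this PR.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: periodically mirrors *.metadata.json files from one
// directory tree into another. Not the code used by the driver in this PR.
final class MetadataSyncSketch {

  /** Copies each *.metadata.json under source to the same relative path under target. */
  static int syncOnce(Path source, Path target) throws IOException {
    List<Path> files;
    try (Stream<Path> walk = Files.walk(source)) {
      files = walk.filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".metadata.json"))
          .collect(Collectors.toList());
    }
    int copied = 0;
    for (Path file : files) {
      Path dest = target.resolve(source.relativize(file).toString());
      // Skip files that were already copied in full by an earlier pass.
      if (Files.exists(dest) && Files.size(dest) == Files.size(file)) {
        continue;
      }
      Files.createDirectories(dest.getParent());
      Files.copy(file, dest, StandardCopyOption.REPLACE_EXISTING);
      copied++;
    }
    return copied;
  }

  /** Runs syncOnce on a daemon thread, waiting periodMillis between passes. */
  static ScheduledExecutorService startDaemon(Path source, Path target, long periodMillis) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "metadata-sync");
      t.setDaemon(true);
      return t;
    });
    executor.scheduleWithFixedDelay(() -> {
      try {
        syncOnce(source, target);
      } catch (IOException | RuntimeException e) {
        // Report and keep going; an uncaught exception would cancel future passes.
        System.err.println("sync pass failed: " + e);
      }
    }, 0, periodMillis, TimeUnit.MILLISECONDS);
    return executor;
  }

  public static void main(String[] args) throws Exception {
    Path source = Files.createTempDirectory("container-warehouse");
    Path target = Files.createTempDirectory("host-warehouse");
    Path metadataDir = Files.createDirectories(source.resolve("ice_rest/tbl/metadata"));
    Files.writeString(metadataDir.resolve("00000-demo.metadata.json"), "{}");
    System.out.println("copied files: " + syncOnce(source, target));
  }
}
```

Because a copy like this only happens once per period, a query issued right after an INSERT may not see the new metadata file yet, which is why the .q file waits before reading the table.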

@@ -0,0 +1,231 @@
PREHOOK: query: create database ice_rest
Contributor

I verified the diff is likely expected.

% diff iceberg/iceberg-handler/src/test/results/positive/llap/iceberg_rest_catalog_hms.q.out iceberg/iceberg-handler/src/test/results/positive/llap/iceberg_rest_catalog_gravitino.q.out
219d218
< default
233d231
< default

Contributor Author

@difin difin Oct 2, 2025

.q.out files are slightly different: HMS REST Catalog has the default database, Gravitino doesn't.

<groupId>org.testcontainers</groupId>
<artifactId>testcontainers</artifactId>
<scope>test</scope>
</dependency>
Contributor

Why do we need these ones?

Contributor Author

@difin difin Oct 2, 2025

This is the library that allows running Docker containers in tests. GenericContainer comes from Testcontainers:

  private void startGravitinoContainer() {
    gravitinoContainer = new GenericContainer<>(GRAVITINO_IMAGE)
        .withExposedPorts(9001)
        // Update entrypoint to create the warehouse directory before starting the server
        .withCreateContainerCmdModifier(cmd -> cmd.withEntrypoint("bash", "-c",
            String.format("mkdir -p %s && exec %s", warehouseDir.toString(), GRAVITINO_STARTUP_SCRIPT)))

* limitations under the License.
*/

package org.apache.hadoop.hive.cli;
Contributor

Contributor Author

Done


private static final CliAdapter adapter = new CliConfigs.TestIcebergRESTCatalogGravitinoLlapLocalCliDriver().getCliAdapter();
private static final DockerImageName GRAVITINO_IMAGE =
DockerImageName.parse("apache/gravitino-iceberg-rest:1.0.0-rc3");
Contributor

Let's use 1.0.0

Contributor Author

v1.0.0 was released just 2 days ago. Done.

gravitino.iceberg-rest.httpPort = 9001

# --- Iceberg REST Catalog Backend (set to JDBC) ---
gravitino.iceberg-rest.catalog-backend = jdbc
Contributor

Does the memory mode still not work?

Contributor Author

@difin difin Oct 2, 2025

Tested in memory mode and confirmed it does work correctly now with the manual sync between the container and host warehouse directories.

That said, I’m not sure memory mode should be the preferred option here. One of the goals of this PR is to demonstrate how HiveRESTCatalogClient integrates with an external RESTCatalog server like Gravitino. For that reason, I think it makes more sense to highlight jdbc mode in the examples/tests, since it better reflects real-world usage where the catalog state is persisted beyond the lifetime of a single container.

@difin
Contributor Author

difin commented Oct 2, 2025

Port 9001 is not the correct one to use on the host. It is only open inside the Gravitino container and is mapped to a random host port in the setup() method:

    String host = gravitinoContainer.getHost();
    Integer port = gravitinoContainer.getMappedPort(9001);
    String restCatalogPrefix = String.format("%s%s.", CatalogUtils.CATALOG_CONFIG_PREFIX, CATALOG_NAME);
    String restCatalogUri = String.format("http://%s:%d/iceberg", host, port);

gravitinoContainer.getMappedPort(9001) returns the host port that container port 9001 is mapped to.
This is needed because port 9001 might be unavailable on the host; using a random available port is standard practice.
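
To illustrate the idea without Docker, here is a minimal JDK-only sketch. It assumes nothing about how Testcontainers or Docker choose the host port internally; it only shows the general pattern of asking the OS for a free port and building the REST catalog URI from it. The class MappedPortSketch and its methods findFreePort and restCatalogUri are hypothetical names, not part of this PR.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical sketch of the "random available host port" pattern.
final class MappedPortSketch {

  // Port the server listens on inside the container (from the config in this PR).
  static final int CONTAINER_PORT = 9001;

  /** Asks the OS for a currently free port by binding to port 0, then releases it. */
  static int findFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  /** Builds the catalog URI the same way the snippet above does, from host and mapped port. */
  static String restCatalogUri(String host, int mappedPort) {
    return String.format("http://%s:%d/iceberg", host, mappedPort);
  }

  public static void main(String[] args) throws IOException {
    int hostPort = findFreePort();
    System.out.println("container port " + CONTAINER_PORT + " -> " + restCatalogUri("localhost", hostPort));
  }
}
```

The point is that clients on the host must use the mapped port in the URI, never the container-internal one, unless the two were explicitly bound to the same number.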


sonarqubecloud bot commented Oct 2, 2025
