From 5020d9a5621a6e316c23b31ca172d09e9f55ec05 Mon Sep 17 00:00:00 2001
From: FANNG
Date: Fri, 2 Aug 2024 22:58:58 +0800
Subject: [PATCH] [#4098] doc(iceberg-rest-server): update Iceberg REST server documents (#4113)

### What changes were proposed in this pull request?
after split Iceberg REST service, add corresponding context.

### Why are the changes needed?
Fix: #4098

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
just document
---
 docs/how-to-build.md         |   8 ++
 docs/how-to-install.md       |  42 +++++++----
 docs/iceberg-rest-service.md | 142 +++++++++++++++++++++--------------
 3 files changed, 123 insertions(+), 69 deletions(-)

diff --git a/docs/how-to-build.md b/docs/how-to-build.md
index 4a1f12b87d9..3bd51427071 100644
--- a/docs/how-to-build.md
+++ b/docs/how-to-build.md
@@ -165,6 +165,14 @@ license: "This software is licensed under the Apache License version 2."
 `gravitino-trino-connector-{version}.tar.gz.sha256` under the `distribution` directory. You can uncompress and deploy it to Trino to use the Gravitino Trino connector.

+6. Assemble the Gravitino Iceberg REST server package
+
+   ```shell
+   ./gradlew assembleIcebergRESTServer
+   ```
+
+   This creates `gravitino-iceberg-rest-server-{version}.tar.gz` and `gravitino-iceberg-rest-server-{version}.tar.gz.sha256` under the `distribution` directory. You can uncompress and deploy it to use the Gravitino Iceberg REST server.
+
 ## How to Build Apache Gravitino on Windows (Using WSL)

 ### Download WSL (Ubuntu)
diff --git a/docs/how-to-install.md b/docs/how-to-install.md
index 5d9d8222805..66d655e8f94 100644
--- a/docs/how-to-install.md
+++ b/docs/how-to-install.md
@@ -12,6 +12,8 @@ Apache Gravitino supports running on Java 8, 11, and 17. Make sure you have Java
 `${JAVA_HOME}/bin/java -version` command.
 :::

+Gravitino package comprises both the Gravitino server and the Gravitino Iceberg REST server. 
You have the option to manage these servers independently or run them concurrently on a single server.
+
 ### Get the Apache Gravitino binary distribution package

 Before installing Gravitino, make sure you have the Gravitino binary distribution package. You can
@@ -30,20 +32,28 @@ The Gravitino binary distribution package contains the following files:
 ```text
 |── ...
 └── distribution/package
-    |── bin/gravitino.sh        # Gravitino server Launching scripts.
+    |── bin/
+    |   ├── gravitino.sh                       # Gravitino server launch script.
+    |   └── gravitino-iceberg-rest-server.sh   # Gravitino Iceberg REST server launch script.
     |── catalogs
-    |   └── hive/               # Hive catalog dependencies and configurations.
-    |   └── lakehouse-iceberg/  # Apache Iceberg catalog dependencies and configurations.
-    |   └── jdbc-mysql/         # JDBC MySQL catalog dependencies and configurations.
-    |   └── jdbc-postgresql/    # JDBC PostgreSQL catalog dependencies and configurations.
-    |── conf/                   # All configurations for Gravitino.
-    |   ├── gravitino.conf      # Gravitino server configuration.
-    |   ├── gravitino-env.sh    # Environment variables, etc., JAVA_HOME, GRAVITINO_HOME, and more.
-    |   └── log4j2.properties   # log4j configuration for the Gravitino server.
-    |── libs/                   # Gravitino server dependencies libraries.
-    |── logs/                   # Gravitino server logs. Automatically created after the Gravitino server starts.
-    |── data/                   # Default directory for the Gravitino server to store data.
-    └── scripts/                # Extra scripts for Gravitino.
+    |   ├── hadoop/                            # Apache Hadoop catalog dependencies and configurations.
+    |   ├── hive/                              # Apache Hive catalog dependencies and configurations.
+    |   ├── jdbc-doris/                        # JDBC Doris catalog dependencies and configurations.
+    |   ├── jdbc-mysql/                        # JDBC MySQL catalog dependencies and configurations.
+    |   ├── jdbc-postgresql/                   # JDBC PostgreSQL catalog dependencies and configurations.
+    |   ├── kafka/                             # Apache Kafka catalog dependencies and configurations.
+    |   ├── lakehouse-iceberg/                 # Apache Iceberg catalog dependencies and configurations.
+    |   └── lakehouse-paimon/                  # Apache Paimon catalog dependencies and configurations.
+    |── conf/                                  # All configurations for Gravitino.
+    |   ├── gravitino.conf                     # Gravitino server and Gravitino Iceberg REST server configuration.
+    |   ├── gravitino-iceberg-rest-server.conf # Standalone Gravitino Iceberg REST server configuration.
+    |   ├── gravitino-env.sh                   # Environment variables, e.g., JAVA_HOME, GRAVITINO_HOME, and more.
+    |   └── log4j2.properties                  # log4j configuration for the Gravitino server and Gravitino Iceberg REST server.
+    |── libs/                                  # Gravitino server dependency libraries.
+    |── logs/                                  # Gravitino server and Gravitino Iceberg REST server logs. Automatically created after the server starts.
+    |── data/                                  # Default directory for the Gravitino server to store data.
+    |── iceberg-rest-server/                   # Gravitino Iceberg REST server package and dependency libraries.
+    └── scripts/                               # Extra scripts for Gravitino.
 ```

 #### Initialize the RDBMS (Optional)
@@ -125,6 +135,12 @@ variable in the `conf/gravitino-env.sh` file.
 Then create a `Remote JVM Debug` configuration in `IntelliJ IDEA` and debug `gravitino.server.main`.
 :::

+#### Manage Gravitino Iceberg REST server in Gravitino package
+
+You can run the Iceberg REST server either as a standalone server or as an auxiliary service embedded in the Gravitino server. To start it as a standalone server, use the command `./bin/gravitino-iceberg-rest-server.sh start` with configurations specified in `./conf/gravitino-iceberg-rest-server.conf`. Alternatively, use `./bin/gravitino.sh start` to launch a Gravitino server that integrates both the Iceberg REST service and the Gravitino service, with all configurations centralized in `conf/gravitino.conf`.
+
+For more detailed information about the Gravitino Iceberg REST server, see the [Iceberg REST server document](./iceberg-rest-service.md).
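
As a minimal sketch of the embedded mode, `conf/gravitino.conf` could carry entries like the ones below. The keys and example values are taken from the configuration tables in the Iceberg REST server document; the host and port lines simply restate the documented defaults, and the default in-memory catalog backend is assumed, so no backend is configured here.

```text
# conf/gravitino.conf (sketch; assumes the default in-memory catalog backend)
# Enable the Iceberg REST service as an auxiliary service of the Gravitino server.
gravitino.auxService.names = iceberg-rest
# Directories holding the Iceberg REST server jars and configuration.
gravitino.iceberg-rest.classpath = iceberg-rest-server/libs, iceberg-rest-server/conf
# Optional; these restate the documented defaults.
gravitino.iceberg-rest.host = 0.0.0.0
gravitino.iceberg-rest.httpPort = 9001
```

For a standalone server the first two entries are not needed; the remaining `gravitino.iceberg-rest.*` items go in `conf/gravitino-iceberg-rest-server.conf` instead.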
+
 ## Install Apache Gravitino using Docker

 ### Get the Apache Gravitino Docker image
diff --git a/docs/iceberg-rest-service.md b/docs/iceberg-rest-service.md
index c027bbd354f..47cd6e71951 100644
--- a/docs/iceberg-rest-service.md
+++ b/docs/iceberg-rest-service.md
@@ -12,62 +12,86 @@ The Apache Gravitino Iceberg REST Server follows the [Apache Iceberg REST API sp

 ### Capabilities

-- Supports the Apache Iceberg REST API defined in Iceberg 1.3.1, and supports all namespace and table interfaces. `Token`, and `Config` interfaces aren't supported yet.
+- Supports the Apache Iceberg REST API defined in Iceberg 1.5, and supports all namespace and table interfaces. The following interfaces are not implemented yet:
+  - token
+  - view
+  - multi-table transaction
+  - pagination
 - Works as a catalog proxy, supporting `Hive` and `JDBC` as catalog backend.
 - Provides a pluggable metrics store interface to store and delete Iceberg metrics.
-- When writing to HDFS, the Gravitino Iceberg REST catalog service can only operate as the specified HDFS user and
-  doesn't support proxying to other HDFS users. See [How to access Apache Hadoop](gravitino-server-config.md#how-to-access-apache-hadoop) for more details.
+- Supports HDFS and S3 storage.

 :::info
-Builds with Apache Iceberg `1.3.1`. The Apache Iceberg table format version is `1` by default.
 Builds with Hadoop 2.10.x. There may be compatibility issues when accessing Hadoop 3.x clusters.
 :::

-## Apache Gravitino Iceberg REST catalog service configuration
+## Server management
+
+There are three deployment scenarios for the Gravitino Iceberg REST server:
+- A standalone server with a standalone Gravitino Iceberg REST server package.
+- A standalone server in the Gravitino server package.
+- An auxiliary service embedded in the Gravitino server.
+
+For detailed instructions on how to build and install the Gravitino server package, please refer to [How to build](./how-to-build.md) and [How to install](./how-to-install.md). 
To build the Gravitino Iceberg REST server package, use the command `./gradlew compileIcebergRESTServer -x test`. Alternatively, to create the corresponding compressed package in the distribution directory, use `./gradlew assembleIcebergRESTServer -x test`. The Gravitino Iceberg REST server package includes the following files:
+
+```text
+|── ...
+└── distribution/gravitino-iceberg-rest-server
+    |── bin/
+    |   └── gravitino-iceberg-rest-server.sh    # Gravitino Iceberg REST server launch script.
+    |── conf/                                   # All configurations for Gravitino Iceberg REST server.
+    |   ├── gravitino-iceberg-rest-server.conf  # Gravitino Iceberg REST server configuration.
+    |   ├── gravitino-env.sh                    # Environment variables, e.g., JAVA_HOME, GRAVITINO_HOME, and more.
+    |   ├── log4j2.properties                   # log4j configuration for the Gravitino Iceberg REST server.
+    |   └── hdfs-site.xml & core-site.xml       # HDFS configuration files.
+    |── libs/                                   # Gravitino Iceberg REST server dependency libraries.
+    └── logs/                                   # Gravitino Iceberg REST server logs. Automatically created after the server starts.
+```

-Assuming the Gravitino server is deployed in the `GRAVITINO_HOME` directory, you can locate the configuration options in [`$GRAVITINO_HOME/conf/gravitino.conf`](gravitino-server-config.md). There are four configuration properties for the Iceberg REST catalog service:
+## Apache Gravitino Iceberg REST catalog server configuration

-1. [**REST Catalog Server Configuration**](#rest-catalog-server-configuration): you can specify the HTTP server properties like host and port.
+The standalone and auxiliary servers use distinct configuration files: `gravitino-iceberg-rest-server.conf` for the standalone server and `gravitino.conf` for the auxiliary server. Although the configuration files differ, the configuration items remain the same.

-2. 
[**Gravitino Iceberg metrics store Configuration**](#iceberg-metrics-store-configuration): you could implement a custom Iceberg metrics store and set corresponding configuration.
+Starting with version `0.6.0`, the prefix `gravitino.auxService.iceberg-rest.` for auxiliary server configurations has been deprecated. If both `gravitino.auxService.iceberg-rest.key` and `gravitino.iceberg-rest.key` are present, the latter takes precedence. The configurations listed below use the `gravitino.iceberg-rest.` prefix.

-3. [**Gravitino Iceberg Catalog backend Configuration**](#gravitino-iceberg-catalog-backend-configuration): you have the option to set the specified catalog-backend to either `jdbc` or `hive`.
+### Configuration to enable Iceberg REST service in Gravitino server

-4. [**Other Iceberg Catalog Properties Defined by Apache Iceberg**](#other-apache-iceberg-catalog-properties): allows you to configure additional properties defined by Apache Iceberg.
+| Configuration item | Description | Default value | Required | Since Version |
+|------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
+| `gravitino.auxService.names` | The auxiliary service name of the Gravitino Iceberg REST catalog service. Use **`iceberg-rest`**. | (none) | Yes | 0.2.0 |
+| `gravitino.iceberg-rest.classpath` | The classpath of the Gravitino Iceberg REST catalog service; includes the directory containing jars and configuration. It supports both absolute and relative paths, for example, `iceberg-rest-server/libs, iceberg-rest-server/conf` | (none) | Yes | 0.2.0 |

-Please refer to the following sections for details.
+Please note that these configurations take effect only in `gravitino.conf`; you don't need to specify them when starting a standalone server.

 ### REST catalog server configuration

-| Configuration item | Description | Default value | Required | Since Version |
-|-------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|----------|---------------|
-| `gravitino.auxService.names` | The auxiliary service name of the Gravitino Iceberg REST catalog service. Use **`iceberg-rest`**. | (none) | Yes | 0.2.0 |
-| `gravitino.auxService.iceberg-rest.classpath` | The classpath of the Gravitino Iceberg REST catalog service; includes the directory containing jars and configuration. It supports both absolute and relative paths, for example, `catalogs/lakehouse-iceberg/libs, catalogs/lakehouse-iceberg/conf` | (none) | Yes | 0.2.0 |
-| `gravitino.auxService.iceberg-rest.host` | The host of the Gravitino Iceberg REST catalog service. | `0.0.0.0` | No | 0.2.0 |
-| `gravitino.auxService.iceberg-rest.httpPort` | The port of the Gravitino Iceberg REST catalog service. | `9001` | No | 0.2.0 |
-| `gravitino.auxService.iceberg-rest.minThreads` | The minimum number of threads in the thread pool used by the Jetty web server. `minThreads` is 8 if the value is less than 8. | `Math.max(Math.min(Runtime.getRuntime().availableProcessors() * 2, 100), 8)` | No | 0.2.0 |
-| `gravitino.auxService.iceberg-rest.maxThreads` | The maximum number of threads in the thread pool used by the Jetty web server. `maxThreads` is 8 if the value is less than 8, and `maxThreads` must be greater than or equal to `minThreads`. 
| `Math.max(Runtime.getRuntime().availableProcessors() * 4, 400)` | No | 0.2.0 | -| `gravitino.auxService.iceberg-rest.threadPoolWorkQueueSize` | The size of the queue in the thread pool used by Gravitino Iceberg REST catalog service. | `100` | No | 0.2.0 | -| `gravitino.auxService.iceberg-rest.stopTimeout` | The amount of time in ms for the Gravitino Iceberg REST catalog service to stop gracefully. For more information, see `org.eclipse.jetty.server.Server#setStopTimeout`. | `30000` | No | 0.2.0 | -| `gravitino.auxService.iceberg-rest.idleTimeout` | The timeout in ms of idle connections. | `30000` | No | 0.2.0 | -| `gravitino.auxService.iceberg-rest.requestHeaderSize` | The maximum size of an HTTP request. | `131072` | No | 0.2.0 | -| `gravitino.auxService.iceberg-rest.responseHeaderSize` | The maximum size of an HTTP response. | `131072` | No | 0.2.0 | -| `gravitino.auxService.iceberg-rest.customFilters` | Comma-separated list of filter class names to apply to the APIs. | (none) | No | 0.4.0 | +| Configuration item | Description | Default value | Required | Since Version | +|--------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|----------|---------------| +| `gravitino.iceberg-rest.host` | The host of the Gravitino Iceberg REST catalog service. | `0.0.0.0` | No | 0.2.0 | +| `gravitino.iceberg-rest.httpPort` | The port of the Gravitino Iceberg REST catalog service. | `9001` | No | 0.2.0 | +| `gravitino.iceberg-rest.minThreads` | The minimum number of threads in the thread pool used by the Jetty web server. `minThreads` is 8 if the value is less than 8. 
| `Math.max(Math.min(Runtime.getRuntime().availableProcessors() * 2, 100), 8)` | No | 0.2.0 |
+| `gravitino.iceberg-rest.maxThreads` | The maximum number of threads in the thread pool used by the Jetty web server. `maxThreads` is 8 if the value is less than 8, and `maxThreads` must be greater than or equal to `minThreads`. | `Math.max(Runtime.getRuntime().availableProcessors() * 4, 400)` | No | 0.2.0 |
+| `gravitino.iceberg-rest.threadPoolWorkQueueSize` | The size of the queue in the thread pool used by Gravitino Iceberg REST catalog service. | `100` | No | 0.2.0 |
+| `gravitino.iceberg-rest.stopTimeout` | The amount of time in ms for the Gravitino Iceberg REST catalog service to stop gracefully. For more information, see `org.eclipse.jetty.server.Server#setStopTimeout`. | `30000` | No | 0.2.0 |
+| `gravitino.iceberg-rest.idleTimeout` | The timeout in ms of idle connections. | `30000` | No | 0.2.0 |
+| `gravitino.iceberg-rest.requestHeaderSize` | The maximum size of an HTTP request. | `131072` | No | 0.2.0 |
+| `gravitino.iceberg-rest.responseHeaderSize` | The maximum size of an HTTP response. | `131072` | No | 0.2.0 |
+| `gravitino.iceberg-rest.customFilters` | Comma-separated list of filter class names to apply to the APIs. | (none) | No | 0.4.0 |

 The filter in `customFilters` should be a standard javax servlet filter.
-You can also specify filter parameters by setting configuration entries in the style `gravitino.auxService.iceberg-rest.{filter class name}.param.{param name}={value}`.
+You can also specify filter parameters by setting configuration entries in the style `gravitino.iceberg-rest.{filter class name}.param.{param name}={value}`.

 ### Apache Iceberg metrics store configuration

 Gravitino provides a pluggable metrics store interface to store and delete Iceberg metrics. You can develop a class that implements `org.apache.gravitino.catalog.lakehouse.iceberg.web.metrics` and add the corresponding jar file to the Iceberg REST service classpath directory.
-| Configuration item | Description | Default value | Required | Since Version |
-|------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
-| `gravitino.auxService.iceberg-rest.metricsStore` | The Iceberg metrics storage class name. | (none) | No | 0.4.0 |
-| `gravitino.auxService.iceberg-rest.metricsStoreRetainDays` | The days to retain Iceberg metrics in store, the value not greater than 0 means retain forever. | -1 | No | 0.4.0 |
-| `gravitino.auxService.iceberg-rest.metricsQueueCapacity` | The size of queue to store metrics temporally before storing to the persistent storage. Metrics will be dropped when queue is full. | 1000 | No | 0.4.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|-------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
+| `gravitino.iceberg-rest.metricsStore` | The Iceberg metrics storage class name. | (none) | No | 0.4.0 |
+| `gravitino.iceberg-rest.metricsStoreRetainDays` | The number of days to retain Iceberg metrics in the store. A value not greater than 0 means retain forever. | -1 | No | 0.4.0 |
+| `gravitino.iceberg-rest.metricsQueueCapacity` | The size of the queue that holds metrics temporarily before they are written to persistent storage. Metrics are dropped when the queue is full. | 1000 | No | 0.4.0 |

 ### Apache Gravitino Iceberg catalog backend configuration

@@ -79,31 +103,31 @@ specify a Hive or JDBC catalog backend for production environment.
#### Apache Hive backend configuration -| Configuration item | Description | Default value | Required | Since Version | -|----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------|----------|---------------| -| `gravitino.auxService.iceberg-rest.catalog-backend` | The Catalog backend of the Gravitino Iceberg REST catalog service. Use the value **`hive`** for a Hive catalog. | `memory` | Yes | 0.2.0 | -| `gravitino.auxService.iceberg-rest.uri` | The Hive metadata address, such as `thrift://127.0.0.1:9083`. | (none) | Yes | 0.2.0 | -| `gravitino.auxService.iceberg-rest.warehouse` | The warehouse directory of the Hive catalog, such as `/user/hive/warehouse-hive/`. | (none) | Yes | 0.2.0 | -| `gravitino.auxService.iceberg-rest.catalog-backend-name` | The catalog backend name passed to underlying Iceberg catalog backend. Catalog name in JDBC backend is used to isolate namespace and tables. | `hive` for Hive backend, `jdbc` for JDBC backend, `memory` for memory backend | No | 0.5.2 | +| Configuration item | Description | Default value | Required | Since Version | +|-----------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------|----------|---------------| +| `gravitino.iceberg-rest.catalog-backend` | The Catalog backend of the Gravitino Iceberg REST catalog service. Use the value **`hive`** for a Hive catalog. | `memory` | Yes | 0.2.0 | +| `gravitino.iceberg-rest.uri` | The Hive metadata address, such as `thrift://127.0.0.1:9083`. 
| (none) | Yes | 0.2.0 | +| `gravitino.iceberg-rest.warehouse` | The warehouse directory of the Hive catalog, such as `/user/hive/warehouse-hive/`. | (none) | Yes | 0.2.0 | +| `gravitino.iceberg-rest.catalog-backend-name` | The catalog backend name passed to underlying Iceberg catalog backend. Catalog name in JDBC backend is used to isolate namespace and tables. | `hive` for Hive backend, `jdbc` for JDBC backend, `memory` for memory backend | No | 0.5.2 | #### JDBC backend configuration -| Configuration item | Description | Default value | Required | Since Version | -|----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|-------------------------|----------|---------------| -| `gravitino.auxService.iceberg-rest.catalog-backend` | The Catalog backend of the Gravitino Iceberg REST catalog service. Use the value **`jdbc`** for a JDBC catalog. | `memory` | Yes | 0.2.0 | -| `gravitino.auxService.iceberg-rest.uri` | The JDBC connection address, such as `jdbc:postgresql://127.0.0.1:5432` for Postgres, or `jdbc:mysql://127.0.0.1:3306/` for mysql. | (none) | Yes | 0.2.0 | -| `gravitino.auxService.iceberg-rest.warehouse ` | The warehouse directory of JDBC catalog. Set the HDFS prefix if using HDFS, such as `hdfs://127.0.0.1:9000/user/hive/warehouse-jdbc` | (none) | Yes | 0.2.0 | -| `gravitino.auxService.iceberg-rest.catalog-backend-name` | The catalog name passed to underlying Iceberg catalog backend. Catalog name in JDBC backend is used to isolate namespace and tables. | `jdbc` for JDBC backend | No | 0.5.2 | -| `gravitino.auxService.iceberg-rest.jdbc.user` | The username of the JDBC connection. | (none) | Yes | 0.2.0 | -| `gravitino.auxService.iceberg-rest.jdbc.password` | The password of the JDBC connection. 
| (none) | Yes | 0.2.0 |
-| `gravitino.auxService.iceberg-rest.jdbc-initialize` | Whether to initialize the meta tables when creating the JDBC catalog. | `true` | No | 0.2.0 |
-| `gravitino.auxService.iceberg-rest.jdbc-driver` | `com.mysql.jdbc.Driver` or `com.mysql.cj.jdbc.Driver` for MySQL, `org.postgresql.Driver` for PostgreSQL. | (none) | Yes | 0.3.0 |
+| Configuration item | Description | Default value | Required | Since Version |
+|-----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|-------------------------|----------|---------------|
+| `gravitino.iceberg-rest.catalog-backend` | The Catalog backend of the Gravitino Iceberg REST catalog service. Use the value **`jdbc`** for a JDBC catalog. | `memory` | Yes | 0.2.0 |
+| `gravitino.iceberg-rest.uri` | The JDBC connection address, such as `jdbc:postgresql://127.0.0.1:5432` for PostgreSQL, or `jdbc:mysql://127.0.0.1:3306/` for MySQL. | (none) | Yes | 0.2.0 |
+| `gravitino.iceberg-rest.warehouse` | The warehouse directory of JDBC catalog. Set the HDFS prefix if using HDFS, such as `hdfs://127.0.0.1:9000/user/hive/warehouse-jdbc` | (none) | Yes | 0.2.0 |
+| `gravitino.iceberg-rest.catalog-backend-name` | The catalog name passed to underlying Iceberg catalog backend. Catalog name in JDBC backend is used to isolate namespace and tables. | `jdbc` for JDBC backend | No | 0.5.2 |
+| `gravitino.iceberg-rest.jdbc.user` | The username of the JDBC connection. | (none) | Yes | 0.2.0 |
+| `gravitino.iceberg-rest.jdbc.password` | The password of the JDBC connection. | (none) | Yes | 0.2.0 |
+| `gravitino.iceberg-rest.jdbc-initialize` | Whether to initialize the meta tables when creating the JDBC catalog. | `true` | No | 0.2.0 |
+| `gravitino.iceberg-rest.jdbc-driver` | `com.mysql.jdbc.Driver` or `com.mysql.cj.jdbc.Driver` for MySQL, `org.postgresql.Driver` for PostgreSQL. 
| (none) | Yes | 0.3.0 |

 If you have a JDBC Iceberg catalog prior, you must set `catalog-backend-name` to keep consistent with your Jdbc Iceberg catalog name to operate the prior namespace and tables.

 :::caution
-You must download the corresponding JDBC driver to the `catalogs/lakehouse-iceberg/libs` directory.
+You must download the corresponding JDBC driver to the `iceberg-rest-server/libs` directory.
 :::

 ### Other Apache Iceberg catalog properties
@@ -111,9 +135,9 @@ You must download the corresponding JDBC driver to the `catalogs/lakehouse-icebe
 You can add other properties defined in [Iceberg catalog properties](https://iceberg.apache.org/docs/1.5.2/configuration/#catalog-properties).
 The `clients` property for example:

-| Configuration item | Description | Default value | Required |
-|---------------------------------------------|--------------------------------------|---------------|----------|
-| `gravitino.auxService.iceberg-rest.clients` | The client pool size of the catalog. | `2` | No |
+| Configuration item | Description | Default value | Required |
+|----------------------------------|--------------------------------------|---------------|----------|
+| `gravitino.iceberg-rest.clients` | The client pool size of the catalog. | `2` | No |

 :::info
 `catalog-impl` has no effect.
@@ -139,14 +163,20 @@ Please set `gravitino.iceberg-rest.warehouse` to `s3://{bucket_name}/${prefix_na

 ### HDFS configuration

-The Gravitino Iceberg REST catalog service adds the HDFS configuration files `core-site.xml` and `hdfs-site.xml` from the directory defined by `gravitino.auxService.iceberg-rest.classpath`, for example, `catalogs/lakehouse-iceberg/conf`, to the classpath.
+Place the HDFS configuration files on the classpath of the Iceberg REST server: `iceberg-rest-server/conf` for the Gravitino server package, or `conf` for the standalone Gravitino Iceberg REST server package. 
When writing to HDFS, the Gravitino Iceberg REST catalog service can only operate as the specified HDFS user and doesn't support proxying to other HDFS users. See [How to access Apache Hadoop](gravitino-server-config.md#how-to-access-apache-hadoop) for more details.
+
+## Starting the Iceberg REST server
+
+To start it as an auxiliary service with the Gravitino server:

-## Starting the Apache Gravitino Iceberg REST catalog service
+```shell
+./bin/gravitino.sh start
+```

-To start the service:
+To start a standalone Gravitino Iceberg REST catalog server:

 ```shell
-./bin/gravitino.sh start
+./bin/gravitino-iceberg-rest-server.sh start
 ```

 To verify whether the service has started:
@@ -157,7 +187,7 @@ curl http://127.0.0.1:9001/iceberg/v1/config

 Normally you will see the output like `{"defaults":{},"overrides":{}}%`.

-## Exploring the Apache Gravitino and Apache Iceberg REST catalog service with Apache Spark
+## Exploring the Apache Gravitino Iceberg REST catalog service with Apache Spark

 ### Deploying Apache Spark with Apache Iceberg support