exclude iceberg from duckdb s3-compatibility test
jorritsandbrink committed Dec 1, 2024
1 parent 049c008 commit a0fc017
Showing 3 changed files with 11 additions and 4 deletions.
5 changes: 3 additions & 2 deletions dlt/destinations/impl/filesystem/sql_client.py
@@ -169,8 +169,9 @@ def create_authentication(self, persistent: bool = False, secret_name: str = Non
          # native google storage implementation is not supported..
          elif self.fs_client.config.protocol in ["gs", "gcs"]:
              logger.warn(
-                 "For gs/gcs access via duckdb please use the gs/gcs s3 compatibility layer. Falling"
-                 " back to fsspec."
+                 "For gs/gcs access via duckdb please use the gs/gcs s3 compatibility layer if"
+                 " possible (not supported when using `iceberg` table format). Falling back to"
+                 " fsspec."
              )
              self._conn.register_filesystem(self.fs_client.fs_client)

7 changes: 6 additions & 1 deletion docs/website/docs/dlt-ecosystem/destinations/filesystem.md
@@ -108,7 +108,8 @@ You need to create an S3 bucket and a user who can access that bucket. dlt does

#### Using S3 compatible storage

- To use an S3 compatible storage other than AWS S3, such as [MinIO](https://min.io/) or [Cloudflare R2](https://www.cloudflare.com/en-ca/developer-platform/r2/), you may supply an `endpoint_url` in the config. This should be set along with AWS credentials:
+ To use an S3 compatible storage other than AWS S3, such as [MinIO](https://min.io/), [Cloudflare R2](https://www.cloudflare.com/en-ca/developer-platform/r2/) or [Google Cloud Storage](https://cloud.google.com/storage/docs/interoperability), you may supply an `endpoint_url` in the config. This should be set along with AWS credentials:

```toml
[destination.filesystem]
@@ -732,6 +733,10 @@ Note that not all authentication methods are supported when using table formats
| [OAuth](../destinations/bigquery.md#oauth-20-authentication) |||
| [Application Default Credentials](bigquery.md#using-default-credentials) |||

+ :::note
+ The [S3-compatible](#using-s3-compatible-storage) interface for Google Cloud Storage is not supported when using `iceberg`.
+ :::
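For context, the S3-compatible GCS setup the note refers to looks roughly like this (a hedged sketch: the credential key names follow dlt's AWS-style layout, the bucket name and HMAC values are placeholders):

```toml
[destination.filesystem]
bucket_url = "s3://my-gcs-bucket"  # GCS bucket addressed through the S3 interface

[destination.filesystem.credentials]
# GCS HMAC interoperability keys (placeholders)
aws_access_key_id = "..."
aws_secret_access_key = "..."
endpoint_url = "https://storage.googleapis.com"
```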

#### Iceberg Azure scheme
The `az` [scheme](#supported-schemes) is not supported when using the `iceberg` table format. Please use the `abfss` scheme. This is because `pyiceberg`, which `dlt` uses under the hood, currently does not support `az`.

3 changes: 2 additions & 1 deletion tests/load/filesystem/test_sql_client.py
@@ -303,8 +303,9 @@ def test_table_formats(
      # in case of gcs we use the s3 compat layer for reading
      # for writing we still need to use the gc authentication, as delta_rs seems to use
      # methods on the s3 interface that are not implemented by gcs
+     # s3 compat layer does not work with `iceberg` table format
      access_pipeline = pipeline
-     if destination_config.bucket_url == GCS_BUCKET:
+     if destination_config.bucket_url == GCS_BUCKET and destination_config.table_format != "iceberg":
          gcp_bucket = filesystem(
              GCS_BUCKET.replace("gs://", "s3://"), destination_name="filesystem_s3_gcs_comp"
          )
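The routing decision in the test above boils down to a scheme swap that is skipped for `iceberg`. A minimal standalone sketch (helper name hypothetical, not part of the repo):

```python
def use_s3_compat(bucket_url: str, table_format: str) -> str:
    # Hypothetical helper mirroring the test's logic: route gs:// buckets
    # through the S3 compatibility layer unless the table format is iceberg,
    # which must keep native gs:// access.
    if bucket_url.startswith("gs://") and table_format != "iceberg":
        return bucket_url.replace("gs://", "s3://", 1)
    return bucket_url

print(use_s3_compat("gs://dlt-ci-bucket", "delta"))    # s3://dlt-ci-bucket
print(use_s3_compat("gs://dlt-ci-bucket", "iceberg"))  # gs://dlt-ci-bucket
```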
