[FEATURE REQUEST] On-Premise S3 & Remote Signing #32

c-thiel · 2024-07-31T08:04:22Z

Is your feature request related to a problem? Please describe.
Currently Polaris only works for AWS S3. It would be great to get support for on-prem deployments as well!

Describe the solution you'd like
Add an additional storage profile similar to the AWS one which allows custom Endpoint configuration. Test with MinIO or Ceph. Grant access via remote signing.

Describe alternatives you've considered
I don't think there is any?

Additional context
Remote signing spec: https://github.com/apache/iceberg/blob/main/aws/src/main/resources/s3-signer-open-api.yaml


mjf-89 · 2024-08-01T09:32:13Z

It would be great to know whether such a feature is already on the roadmap. I would personally be interested in contributing here because I currently work for a company where everything is on-premise. We use Iceberg with a legacy Hive Standalone Metastore and are looking for a REST alternative. Polaris is really promising, but the lack of support for on-premise S3 providers will hinder its adoption.

chris922 · 2024-08-01T14:11:15Z

When I watched the demo on the Snowflake site, this was the first thing I noticed: where do you configure the S3 endpoint etc.?

I can also help here, maybe with development but also with testing against some S3 alternatives. I've got access to Dell ECS, NetApp StorageGRID, and MinIO.

guitcastro · 2024-08-01T18:09:48Z

The main point is that only a few S3-compatible services support the Security Token Service (STS). MinIO does support it.

mjf-89 · 2024-08-03T21:14:03Z

@guitcastro It might be possible to implement remote signing for on-prem S3 implementations other than MinIO. But that would also mean implementing the remote signing OpenAPI spec, and that is probably outside the scope of Polaris?

guitcastro · 2024-08-05T14:11:56Z

@mjf-89 I don't know more than you. I am not a maintainer; my comment is just based on how S3 auth is implemented.

dimas-b · 2024-08-05T14:34:48Z

@mjf-89 : could you provide more details (perhaps a link) about the remote signing open API spec that you mentioned above?

snazy · 2024-08-05T15:01:31Z

> The main point is that only a few S3-compatible services support the Security Token Service (STS). MinIO does support it.

STS is needed for credential vending, as currently implemented.

The actual request signing doesn't interact w/ any remote service. The client (in this case Iceberg) asks the resource (Polaris) to return a signed URL for every particular S3 request.
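To make the "signing is a local computation" point concrete, here is a minimal sketch of the AWS Signature Version 4 key derivation and final signature, using only the Python standard library. The function names and sample inputs are illustrative assumptions, not Polaris or Iceberg code, and the construction of the string to sign (canonical request, scope) is deliberately left out.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step, as used throughout SigV4."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "s3") -> bytes:
    """Derive the SigV4 signing key from a secret key.

    date_stamp is YYYYMMDD. No network call is involved: whoever holds the
    secret key (here that would be the catalog) can compute this locally.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(secret_key: str, date_stamp: str, region: str,
         string_to_sign: str) -> str:
    """Return the hex signature for an already-built string to sign."""
    key = derive_signing_key(secret_key, date_stamp, region)
    return hmac.new(key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

With remote signing, only the resulting signature (inside the signed headers) travels back to the client; the secret key never leaves the signer.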

mjf-89 · 2024-08-05T23:30:27Z

@dimas-b sure, in the openapi spec of the rest catalog you can see that there are currently two supported delegated access mechanisms, vended credentials and remote signing:

https://github.com/apache/iceberg/blob/e9364faabcc67eef6c61af2ecdf7bcf9a3fef602/open-api/rest-catalog-open-api.yaml#L1488

And here you can find the openapi spec for the remote signing service:

https://github.com/apache/iceberg/blob/e9364faabcc67eef6c61af2ecdf7bcf9a3fef602/aws/src/main/resources/s3-signer-open-api.yaml

I don't know of any open source implementation of that OpenAPI spec; however, I think that Tabular is based on such a thing. Or at least that is what I guessed from reading their blog posts:

https://tabular.io/blog/securing-the-data-lake-part-1/

Where I have interpreted the "authorized file access request" as a presigned url that the remote signing service is giving back to the engine to access the data files.
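As a rough illustration of what a client would send to such a signing service, the sketch below builds a request body in Python. The field names (`region`, `method`, `uri`, `headers`) reflect my reading of the `S3SignRequest` schema in the linked spec and should be checked against the YAML; the helper name and the sample values are hypothetical.

```python
import json
from typing import Dict, List


def build_sign_request(method: str, uri: str, region: str,
                       headers: Dict[str, List[str]]) -> str:
    """Serialize a remote-signing request body.

    Assumption: the signer expects region, method, uri, and a map of
    header name -> list of values, as in the linked s3-signer spec.
    """
    body = {
        "region": region,
        "method": method,
        "uri": uri,
        # Copy each value list so the caller's data is not aliased.
        "headers": {name: list(values) for name, values in headers.items()},
    }
    return json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    print(build_sign_request(
        "GET",
        "https://s3.example.com/my-bucket/table/data/file.parquet",
        "us-east-1",
        {"Host": ["s3.example.com"]},
    ))
```

The response, as far as I understand the spec, carries the URI and the signed headers that the client then attaches to its own S3 request.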

c-thiel · 2024-08-07T07:14:59Z

I agree that STS is the better solution if available, but not all S3 services support it. It would be nice to add remote signing as a fallback for on-premise deployments.
@mjf-89 there are currently two open-source catalogs that support it, Project Nessie and the TIP Iceberg Catalog - links go roughly to the corresponding code sections.

mjf-89 · 2024-08-07T15:40:14Z

@c-thiel thank you very much. The last time I checked Nessie, the Iceberg REST API was not yet implemented and S3 remote signing was definitely not there. TIP was completely off my radar, but it seems really promising, especially for the customization freedom on the authz side. Happy to see that the landscape of Iceberg REST catalogs is evolving so rapidly.

As for Polaris, I hope that remote signing can be implemented as a fallback for those S3 implementations that do not have an STS endpoint, as you said.

dimas-b · 2024-08-14T22:06:58Z

Remote Signing can be a useful feature. I'd support adding it to Polaris.

- The catalog does not expose any long- or mid-term credentials to the client (this reduces the risk of credential leaks and makes access revocation immediate, if/when it happens).
- Client session runtime is not limited by STS session restrictions (extremely long client sessions are possible at the expense of slightly slower storage I/O calls).
- The catalog can (hypothetically) make finer-grained access decisions that are not expressible in terms of STS policies.

dimas-b · 2024-08-14T22:10:11Z

@mjf-89 :

> I don't know of any open source implementation of that OpenAPI spec

Just FYI: Nessie supports that.

mjf-89 · 2024-08-16T09:34:31Z

@dimas-b thank you, as @c-thiel already mentioned both Nessie and TIP actually support that feature.

One question that I have concerns the performance implications of remote signing. I feel like it could add quite a bit of latency to queries; of course, much depends on the implementation of both the runtime and the catalog.

Another thing to note is that not all runtimes actually support this feature. As an example, I think that Trino currently lacks such support: trinodb/trino#21189

heartblast · 2024-10-17T07:12:34Z

Since the temporary credential feature provided by S3 differs from that of MinIO, adjustments are required to support MinIO. Specifically, the Polaris Catalog must obtain a security token from an OAuth service such as Keycloak to utilize MinIO's temporary credential feature, and the configuration must allow for STS (Security Token Service) endpoint settings to enable this integration.

mmgaggle · 2024-10-17T20:48:01Z

Ceph supports IAM/STS, both AssumeRole and AssumeRoleWithWebIdentity. I can test to make sure this works as expected in that context.

mmgaggle · 2024-10-17T21:01:30Z

@mjf-89 It would be more powerful if the engine sent the GetObject requests to the catalog, and the catalog signed them using its own credential, or using a credential it generates (i.e. you might want it to AssumeRole before signing for tracking purposes). If the catalog returned a pre-signed URL for an object, then an attacker snooping on the engine-to-catalog traffic would have access to the object. A signed request has protection against replay attacks.

lefebsy · 2024-10-18T17:39:33Z

> Since the temporary credential feature provided by S3 differs from that of MinIO, adjustments are required to support MinIO. Specifically, the Polaris Catalog must obtain a security token from an OAuth service such as Keycloak to utilize MinIO's temporary credential feature, and the configuration must allow for STS (Security Token Service) endpoint settings to enable this integration.

Hello,
In fact, MinIO supports the STS AssumeRole API out of the box. The documentation explains that you can set up an external IdP like Keycloak, but it also works with MinIO alone. I did a quick test after forking Polaris.

lefebsy · 2024-10-18T18:02:36Z

Hello,

I can try to contribute to this feature request.
I have written a Polaris core storage implementation. It is a copy of the AWS one, but without the required roleArn parameter, which is replaced by endpoint, path-style, and credential settings.
You can choose a strategy for the clients calling the catalog:

- "vending token": the default, using STS AssumeRole like AWS. It works with MinIO, and should also work with Dell ECS, NetApp StorageGRID, etc.
- "vending keys":
  - if you do not care much about security, the catalog can send its own keys to the client;
  - you can also store keys dedicated to clients in the catalog. You can then revoke them externally without breaking the catalog.

You can also choose a strategy when creating the catalog (CLI, curl API): pass the catalog and/or client keys either as direct values or as environment variable names.
If you pass variable names, you have to make sure that Polaris can find those variables in its running environment, of course. A Kubernetes production deployment should be well suited to this configuration (envFromSecrets deployed to the Polaris pods).

Let me know your opinion about this design. The code is ready, I still have to read the "PR guide" and check requirements.
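The "environment variable name" option described above could look roughly like the following sketch. It is only an illustration of the idea: `resolve_secret` is a hypothetical helper, not a function from the branch, and how the real implementation tells a direct value from a variable name is not specified here.

```python
import os
from typing import Mapping, Optional


def resolve_secret(var_name: str,
                   env: Optional[Mapping[str, str]] = None) -> str:
    """Look up a credential by the *name* of an environment variable.

    The catalog configuration stores only var_name; the secret itself
    lives in the server's environment (for example, injected from a
    Kubernetes secret). Raises KeyError if the variable is missing or blank.
    """
    env = os.environ if env is None else env
    value = env.get(var_name, "")
    if not value.strip():
        raise KeyError(
            f"environment variable {var_name!r} is not set or is blank")
    return value


if __name__ == "__main__":
    # Hypothetical variable name and value, for demonstration only.
    demo_env = {"S3_ACCESS_KEY_VAR_NAME": "demo-access-key"}
    print(resolve_secret("S3_ACCESS_KEY_VAR_NAME", demo_env))
```

Keeping only the variable name in the catalog means the stored catalog definition never contains the secret itself.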

mmgaggle · 2024-10-29T15:30:12Z

@lefebsy If you can point me to a branch, I can try it against Ceph

lefebsy · 2024-11-05T18:29:52Z

> @lefebsy If you can point me to a branch, I can try it against Ceph

Thank you! I have no access to a RADOS Gateway, so it would be a good test.
https://github.com/lefebsy/polaris/tree/refs/heads/s3compatible
The PR is under review #389

@mmgaggle If you need specific modifications to be compliant with Ceph RGW, tell me :)
(Out of curiosity: is it a Ceph cluster supporting NVMe over Fabrics? I've heard the performance is awesome...)

ang6300 · 2024-12-11T02:46:16Z

@lefebsy I tried with this
https://github.com/lefebsy/polaris/tree/refs/heads/s3compatible

I am testing it against on-premise S3 storage but could not find which option specifies a custom S3 and STS endpoint. I tried adding these to spark-defaults.conf, but it did not work:

spark.hadoop.fs.s3a.endpoint
spark.hadoop.fs.s3a.assumed.role.sts.endpoint
spark.hadoop.fs.s3a.aws.credentials.provider
spark.hadoop.fs.s3a.assumed.role.credentials.provider
spark.hadoop.fs.s3a.access.key
spark.hadoop.fs.s3a.secret.key
spark.hadoop.fs.s3a.path.style.access

Further testing shows that spark_sql loads AWS credentials from environment variables. After setting all the variables, I was able to write metadata to the on-premise S3 bucket, but writing the Parquet file failed:
s3://polaris-sg/sg_db1/table1/data/00000-0-49bf02f5-8632-45b1-be33-95e1658e3a33-0-00001.parquet

24/12/11 23:35:52 ERROR DataWritingSparkTask: Aborting commit for partition 0 (task 0, attempt 0, stage 0.0)
24/12/11 23:35:53 WARN S3FileIO: Encountered failure when deleting batch
software.amazon.awssdk.services.s3.model.S3Exception: The AWS Access Key Id you provided does not exist in our records. (Service: S3, Status Code: 403, Request ID: P25DMBEE3XBK20XP, Extended Request ID: iXnMbkU52j2tRqGjmIARh2tttxdcNBl0RhlTrLSXapP/250/Mlms0dH2K0OMoawTldRdt37pHAfZWRSXoFm/fw==)

I believe it is expecting to get temporary access key/STS token from AWS STS endpoint using the assumerole ARN provided to run_spark_sql.sh.

It does not use this environment variable:
export AWS_ENDPOINT_URL_STS="https://sts.example.com"

Can you update your code to allow custom STS endpoint?

Appreciate any help. Thank you.

lefebsy · 2024-12-12T00:56:54Z

Hello @ang6300 ,

Parameters are described here :
https://github.com/lefebsy/polaris/blob/refs/heads/s3compatible/spec/polaris-management-service.yml#L906

You can have a look at the non-regression test script for an example of how to create a catalog and configure spark-sql:
https://github.com/lefebsy/polaris/blob/refs/heads/s3compatible/regtests/run_spark_sql_s3compatible.sh

You tried to set up S3 in the Spark confs, but that has no effect: the S3 endpoint and S3 keys are defined in the catalog inside Polaris.
Spark will retrieve the endpoint from Polaris catalog response, alongside the STS, during vended-credential negotiation with the catalog.
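As a sketch of what "retrieving the endpoint from the catalog response" means on the client side, the snippet below picks the S3-related entries out of the `config` map of a load-table response. The property names are the standard Iceberg S3FileIO ones; whether this branch returns exactly this set is an assumption, and the helper name is hypothetical.

```python
from typing import Any, Dict

# Iceberg S3FileIO property names that a catalog may vend to the client.
S3_KEYS = (
    "s3.endpoint",
    "s3.path-style-access",
    "s3.access-key-id",
    "s3.secret-access-key",
    "s3.session-token",
)


def s3_settings_from_load_table(response: Dict[str, Any]) -> Dict[str, str]:
    """Return only the S3 client settings found in response['config'].

    Anything set locally in Spark for S3A is irrelevant here: the
    FileIO is configured from what the catalog returns.
    """
    config = response.get("config", {})
    return {key: config[key] for key in S3_KEYS if key in config}


if __name__ == "__main__":
    sample = {
        "metadata-location": "s3://my-bucket/db/t/metadata/v1.metadata.json",
        "config": {
            "s3.endpoint": "https://localhost:9000",
            "s3.path-style-access": "true",
            "s3.session-token": "temporary-token",
        },
    }
    print(s3_settings_from_load_table(sample))
```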

TL;DR :

Create the catalog :

curl -s -i -X POST -H "Authorization: Bearer ${POLARIS_BEARER_TOKEN}" \
      -H 'Accept: application/json' \
      -H 'Content-Type: application/json' \
      http://${POLARIS_HOST:-localhost}:8181/api/management/v1/catalogs \
      -d "{
            \"name\": \"my-minio-wh\",
            \"id\": 100,
            \"type\": \"INTERNAL\",
            \"readOnly\": false,
            \"properties\": {
              \"default-base-location\": \"${S3_LOCATION}\"
            },
            \"storageConfigInfo\": {
              \"storageType\": \"S3_COMPATIBLE\",
              \"allowedLocations\": [\"${S3_LOCATION}/\"],
              \"s3.endpoint\": \"https://localhost:9000\",
              \"s3.path-style-access\": true,
              \"s3.credentials.catalog.access-key-id\": \"S3_ACCESS_KEY_VAR_NAME\",
              \"s3.credentials.catalog.secret-access-key\": \"S3_SECRET_KEY_VAR_NAME\"
            }
          }"
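Hand-escaped JSON inside a shell string is easy to get subtly wrong (a single missing comma makes the request fail), so one option is to generate the body with a JSON library instead. The sketch below mirrors the fields of the request above; `catalog_payload` is a hypothetical helper and the field names are taken from this thread, not verified against the final API.

```python
import json


def catalog_payload(name: str, s3_location: str, endpoint: str,
                    access_key_var: str, secret_key_var: str) -> str:
    """Build the catalog-creation body shown above as valid JSON."""
    body = {
        "name": name,
        "id": 100,
        "type": "INTERNAL",
        "readOnly": False,
        "properties": {"default-base-location": s3_location},
        "storageConfigInfo": {
            "storageType": "S3_COMPATIBLE",
            "allowedLocations": [s3_location + "/"],
            "s3.endpoint": endpoint,
            "s3.path-style-access": True,
            # These hold environment variable NAMES, not the secrets.
            "s3.credentials.catalog.access-key-id": access_key_var,
            "s3.credentials.catalog.secret-access-key": secret_key_var,
        },
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    print(catalog_payload(
        "my-minio-wh",
        "s3://my-bucket/warehouse",
        "https://localhost:9000",
        "S3_ACCESS_KEY_VAR_NAME",
        "S3_SECRET_KEY_VAR_NAME",
    ))
```

The output can be written to a file and passed to curl with `-d @file.json`.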

Use the catalog :

${SPARK_HOME}/bin/spark-sql \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.polaris.type=rest \
  --conf spark.sql.catalog.polaris.uri=http://${POLARIS_HOST:-localhost}:8181/api/catalog \
  --conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=vended-credentials \
  --conf spark.sql.catalog.polaris.token="${POLARIS_BEARER_TOKEN}" \
  --conf spark.sql.catalog.polaris.warehouse=my-minio-wh \
  --conf spark.sql.defaultCatalog=polaris

Et voilà :)

ang6300 · 2024-12-12T01:26:09Z

Hello @lefebsy

Thank you for all the details.
If I read it correctly, run_spark_sql_s3compatible.sh does not use a role ARN.

Is it possible to use a role ARN but with a custom STS endpoint?

[trino-user@trino-m regtests]$ grep role run_spark_sql.sh

./run_spark_sql.sh [S3-location AWS-IAM-role]

- [AWS-IAM-role] - The AWS IAM role for catalog to assume when accessing the S3 location.

./run_spark_sql.sh s3://my-bucket/path arn:aws:iam::123456789001:role/my-role

echo "Usage: ./run_spark_sql.sh [S3-location AWS-IAM-role]"

Currently I am able to use run_spark_sql.sh with S3-location AWS-IAM-role with on premise s3 compatible storage.
It creates the metadata object but not the parquet.

aws s3 ls --recursive s3://polaris-sg/sg_db1
2024-12-11 23:34:03 1072 sg_db1/table1/metadata/00000-e77a2d71-ee13-4bd7-89a8-2450650d5249.metadata.json

It failed to write the Parquet file:
s3://polaris-sg/sg_db1/table1/data/00000-0-49bf02f5-8632-45b1-be33-95e1658e3a33-0-00001.parquet

The polaris-sg bucket is on s3 compatible storage, not on AWS S3.

lefebsy · 2024-12-12T19:24:48Z

Correct, the RoleArn is not used/implemented.
The STS token will still be valid only for the resource queried by Spark from the catalog (a dedicated policy is built for each query).

Could you explain what's the purpose of roleArn in your use case outside AWS ?
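To picture the per-query policy mentioned above, here is a minimal sketch of a session policy scoped to a single table location, of the kind that could be passed to an STS AssumeRole call. The exact actions and conditions used by the implementation are not stated in this thread, so the ones below are assumptions, and `scoped_policy` is a hypothetical helper.

```python
import json
from urllib.parse import urlparse


def scoped_policy(table_location: str, allow_write: bool) -> dict:
    """Build an IAM-style policy limited to one table's prefix.

    table_location is an s3://bucket/prefix URI. Read access is always
    granted; write and delete only when allow_write is True.
    """
    parsed = urlparse(table_location)
    bucket = parsed.netloc
    prefix = parsed.path.strip("/")

    actions = ["s3:GetObject"]
    if allow_write:
        actions += ["s3:PutObject", "s3:DeleteObject"]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Listing is a bucket-level action, restricted by prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(
        scoped_policy("s3://polaris-sg/sg_db1/table1", allow_write=True),
        indent=2))
```

A token issued with such a policy cannot reach other tables in the same bucket, with or without a role ARN on top.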

ang6300 · 2024-12-12T22:15:12Z

Thank you lefebsy.
The end user wants to use roleArn for security reasons.

lefebsy · 2024-12-13T07:56:40Z

Hello @ang6300,

Roles are always for security reasons ;)
I am asking in order to understand which tests would be adequate to check the expected overall security behavior, in case I add roleArn to this implementation.
Adding this parameter is quite easy. Testing whether it is done correctly is harder.

You can try again.

I have updated the code with roleArn. Be careful: I have also reformatted the parameters in CamelCase :

ang6300 · 2024-12-14T02:50:04Z

Hello @lefebsy,
Thank you very much for making the changes.
I will redo the testing next week.

c-thiel added the enhancement New feature or request label Jul 31, 2024

snazy mentioned this issue Aug 1, 2024

[FEATURE REQUEST] Support for S3 compatible services #60

Closed

annafil added this to Basic Kanban Board Aug 1, 2024

annafil moved this to Triage in Basic Kanban Board Aug 1, 2024

dimas-b mentioned this issue Aug 12, 2024

[BUG] Incorrect usage of the X-Iceberg-Access-Delegation header #146

Closed

1 task

lefebsy mentioned this issue Dec 16, 2024

[FEATURE REQUEST #32] On-Premises S3 / S3 Compatible... #389

Open

10 tasks
