[FEATURE REQUEST #32] On-Premises S3 / S3 Compatible... #389

Open
wants to merge 12 commits into
base: main

@lefebsy lefebsy commented Oct 21, 2024

Description (edited):

This is a proposed Polaris core storage implementation: a copy of the AWS one plus new parameters (endpoint, path style, ...).

  • By default it tries to follow the same credential behavior as AWS (IAM STS). The same dynamic policy is applied, limiting the scope to the data queried. This is tested and working with MinIO, and should also work with Dell ECS, NetApp StorageGRID, etc.

  • Otherwise, if STS is not available, setting 'Skip_Credential_Subscoping_Indirection' = true disables Polaris "SubScoping" of the credentials.

Let me know your opinion about this design proposal.
Thank you

Included Changes:

  • New type of storage "S3_COMPATIBLE".
  • Tested against MinIO with self-signed certificate
  • regtests/run_spark_sql_s3compatible.sh

Type of change:

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue

@lefebsy lefebsy changed the title add s3 compatible storage - first commit [FEATURE REQUEST] On-Premise S3... #32 Oct 21, 2024
@lefebsy lefebsy changed the title [FEATURE REQUEST] On-Premise S3... #32 [FEATURE REQUEST #32] On-Premise S3... Oct 21, 2024
@lefebsy lefebsy changed the title [FEATURE REQUEST #32] On-Premise S3... [FEATURE REQUEST #32] On-Premises S3... Oct 21, 2024
@@ -901,6 +903,58 @@ components:
required:
- roleArn

S3StorageConfigInfo:
Contributor

Is this specific to s3compat, or is it also meant to be used for s3 itself?

Author

@lefebsy lefebsy Oct 23, 2024

This implementation focuses on on-premises S3 because the AWS class already exists.

However, as a next step I can try to make it work seamlessly with AWS too:

  • I think roleArn is not mandatory for AWS S3, so that scenario can be left to the existing implementation
  • Using an access key and secret key should work with AWS S3 too
  • I have overridden the AWS STS endpoint with the S3 endpoint. I could add a modification, maybe an STS endpoint property, something like: if the property is empty, the STS client calls the default AWS STS endpoint; otherwise the STS client calls the configured endpoint. Alternatively, a boolean with a clear and explicit description
  • Region (a little more thought may be needed to avoid conflicts)
    • Add a region property
    • I have removed the cross-region tweak of the AWS FileIOClientFactory; it can be kept to ensure full compatibility
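
The "empty property means default AWS STS endpoint" rule proposed above could look roughly like the sketch below. It is only an illustration: the class name StsEndpointResolver and its resolve method are hypothetical and do not exist in this PR or in Polaris.

```java
import java.net.URI;
import java.util.Optional;

/** Hypothetical helper; the class and method names are illustrative, not part of the PR. */
final class StsEndpointResolver {

  private StsEndpointResolver() {}

  /**
   * Returns the endpoint to override on the STS client, or an empty Optional when
   * the property is unset, meaning the client keeps the SDK's default AWS STS endpoint.
   */
  static Optional<URI> resolve(String stsEndpointProperty) {
    if (stsEndpointProperty == null || stsEndpointProperty.trim().isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(URI.create(stsEndpointProperty.trim()));
  }
}
```

With the AWS SDK v2 builder used in this PR, the result could then be applied with something like resolve(value).ifPresent(stsBuilder::endpointOverride), so that no override is set at all in the plain AWS case.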

Contributor

If we can unify these, I think that would be ideal. But I don't know enough about how S3 vs S3Compatible are similar/different to say how possible that is.

Author

@lefebsy lefebsy Nov 5, 2024

I think it will not be too hard to unify as a next step. I lack AWS access for testing, but from what I know or have seen:

  • The STS endpoint is AWS-specific; in S3-compatible solutions, when STS is available, it is served from the S3 endpoint.
  • Region is available in S3-compatible solutions, but it is rarely used, or is implemented mostly for compliance with AWS SDK clients.

MinIO claims that its API is 100% compatible with the AWS S3 API. Many alternatives make almost the same claim.

  • The S3 Compatible implementation could easily offer an optional "roleArn" parameter, like the mandatory one in the existing AWS class, with a less strict regexp pattern to allow flexibility for implementations where "aws" inside the string is replaced by the product name (for example "ecs" for Dell ECS). It could help with a smooth transition.
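
As a rough sketch of what a less strict, optional role check might look like: the pattern below is an assumption made up for illustration (including accepting both "arn" and "urn" prefixes and any lowercase product token in place of "aws"), and RelaxedRoleArn is a hypothetical class, not the validation used by the existing AWS class.

```java
import java.util.regex.Pattern;

/** Hypothetical validator; the pattern is an illustrative assumption, not the PR's regexp. */
final class RelaxedRoleArn {

  // Accepts any lowercase product token (aws, ecs, minio, ...) where the AWS class expects "aws".
  private static final Pattern ROLE =
      Pattern.compile("^(arn|urn):[a-z0-9-]+:iam::[^:]*:role/.+$");

  private RelaxedRoleArn() {}

  /** The parameter is optional, so null or empty is accepted; otherwise it must match. */
  static boolean isAcceptable(String roleArn) {
    if (roleArn == null || roleArn.isEmpty()) {
      return true;
    }
    return ROLE.matcher(roleArn).matches();
  }
}
```

The sample strings used to exercise it, such as "arn:aws:iam::123456789012:role/polaris" or "urn:ecs:iam::ns1:role/polaris", are invented; the real format should be checked against each product's documentation.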

enum:
- TOKEN_WITH_ASSUME_ROLE
- KEYS_SAME_AS_CATALOG
- KEYS_DEDICATED_TO_CLIENT
Contributor

How does this work? What identifies a client?

Author

@lefebsy lefebsy Oct 23, 2024

Here a client is anything trying to obtain keys (or a security token) from this catalog (Spark, Trino, ...). There is no particular distinction of identity.

Is this not the right term to use in the context of Polaris?

Contributor

I think the term is correct, I was just stuck trying to understand how the service will track which keys are dedicated to which client.

Author

@lefebsy lefebsy Nov 3, 2024

Ok.

It's simply one key for the catalog itself, and another single key shared by all clients, whoever they are. I leave client distinction to the principal/role/privilege level. I think it is hard, at the storage/credential class level, to bind a pair of keys to each different client.

It is a basic way, when SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION is true and there is no temporary token, to avoid divulging the internal catalog key and to serve a key that can be deactivated or rotated for security reasons without breaking the catalog itself.

After discussing with MonkeyCanCode (see "Prod Deployment credentials"), the advantage of this proposal is that you do not have to rely on the main credentials provided at the global Polaris service level.

Today, if you revoke the Polaris service credentials for AWS, all catalogs with AWS storage instantly break.

In this implementation each catalog is independent. The same idea applies to client keys: the catalog does not break when client keys are revoked or rotated for security reasons.

Contributor

you do not have to rely on the main credentials provided at the global Polaris service level.

I think this is the key point here. I agree that the experience you describe is bad, but I'm not sure that fixing it should be a blocker for s3compat support (or that this is the right fix).

Would you be okay saving this for later, or carving it out into a different PR? In my view relying on the global credentials in production is universally a bad idea, regardless of what STORAGE_TYPE you're using.

@lefebsy lefebsy changed the title [FEATURE REQUEST #32] On-Premises S3... [FEATURE REQUEST #32] On-Premises S3 / S3 Compatible... Nov 5, 2024
Contributor

@collado-mike collado-mike left a comment

I guess I don't understand what isn't supported today with catalog-properties. E.g., in https://github.com/apache/polaris/blob/main/polaris-service/src/test/java/org/apache/polaris/service/catalog/PolarisSparkIntegrationTest.java , we use S3MockContainer as an S3 endpoint, which requires the same path-style access and custom endpoint configuration as what's included here. Can we not follow the same pattern for minio?

As a rule, I think vending static credentials is not a good idea. Some customization for how the STS client is instantiated, possibly with support for custom profiles for different catalogs could make sense. But I think, ultimately, the credentials returned should always be a temporary session token. Even if we just call GetSessionToken without requiring an IAM role, it would be vastly more secure than sending raw credentials.

Comment on lines +84 to +80
propertiesMap.put(PolarisCredentialProperty.AWS_ENDPOINT, storageConfig.getS3Endpoint());
propertiesMap.put(
PolarisCredentialProperty.AWS_PATH_STYLE_ACCESS,
storageConfig.getS3PathStyleAccess().toString());
Contributor

These are catalog properties, not credential-vending properties. These should be set at the catalog-level when it is created. Those properties would then be passed into the FileIO when it is constructed.

Author

@lefebsy lefebsy Nov 6, 2024

To be refactored to satisfy the requested change: the boolean "SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION".
I will try to find a way to move it into the catalog properties. But catalog properties are not forwarded to "S3CompatibleCredentialsStorageIntegration.java"; by default only storage properties are.

Author

@lefebsy lefebsy Dec 11, 2024

I failed to use catalog properties; they are not forwarded to this class.

Comment on lines 50 to 53
public void createStsClient(S3CompatibleStorageConfigurationInfo s3storageConfig) {

LOGGER.debug("S3Compatible - createStsClient()");
StsClientBuilder stsBuilder = software.amazon.awssdk.services.sts.StsClient.builder();
Contributor

Why is this constructed here rather than being passed in as a constructor parameter?

Author

@lefebsy lefebsy Nov 6, 2024

Like the AWS class?
I find it odd to build something tied to a storage type outside the class and pass it in through the constructor. The Azure class seems to keep it inside the class too.
No?

import software.amazon.awssdk.services.sts.model.AssumeRoleRequest;
import software.amazon.awssdk.services.sts.model.AssumeRoleResponse;

/** Credential vendor that supports generating */
Contributor

comment seems to just ...

Author

That was the unchanged comment copied from AWS class. Removed.

Comment on lines 115 to 116
String cli = System.getenv(storageConfig.getS3CredentialsClientAccessKeyId());
String cls = System.getenv(storageConfig.getS3CredentialsClientSecretAccessKey());
Contributor

Seems you could rely on the DefaultCredentialsProvider and maybe allow profiles to be specified? This would allow for env variables, but also file configuration or other means of retrieving credentials.

Author

I do not see how to reconcile this with the createCatalog() REST API... And I am not happy with the idea of leaving the catalog's credentials fully and exclusively at the Polaris service level.
A bad compromise?

Author

lefebsy commented Nov 6, 2024

I guess I don't understand what isn't supported today with catalog-properties. E.g., in https://github.com/apache/polaris/blob/main/polaris-service/src/test/java/org/apache/polaris/service/catalog/PolarisSparkIntegrationTest.java , we use S3MockContainer as an S3 endpoint, which requires the same path-style access and custom endpoint configuration as what's included here. Can we not follow the same pattern for minio?

I have not found how the endpoint or other properties can be set through the REST API when creating or updating a catalog. Today only roleArn is accepted, as a mandatory parameter, in the AWS storage type.

As a rule, I think vending static credentials is not a good idea. Some customization for how the STS client is instantiated, possibly with support for custom profiles for different catalogs could make sense. But I think, ultimately, the credentials returned should always be a temporary session token. Even if we just call GetSessionToken without requiring an IAM role, it would be vastly more secure than sending raw credentials.

I agree; the default is now STS "AssumeRole" without any "role". Raw credentials are a poor fallback for when STS is not available. There are enterprise contexts where STS, AssumeRole, etc. are not allowed and only a pair of keys is available. For example, Dell ECS requires an additional policy to enable STS AssumeRole; it is not activated out of the box. I tried to be explicit about this degraded security pattern.

"GetSessionToken" is not part of the S3 API; it is an IAM API. It is not available in MinIO, which returns: Unsupported action GetSessionToken (Service: Sts, Status Code: 400...). I am not sure about other products. So far "AssumeRole" is the only IAM STS API method I have found to be commonly implemented.

@jean-humann

Hello everyone, this PR seems to have been blocked for a month now. Is there anything we can do to get it over the line? 🙏

Author

lefebsy commented Dec 11, 2024

Hello everyone, this PR seems to have been blocked for a month now. Is there anything we can do to get it over the line? 🙏

Sorry for the one-month break. I tried the approaches proposed in the comments.
Refactoring performed ;)

Author

lefebsy commented Dec 16, 2024

Refactored after many comments:

  • SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION replaces the initial 'Strategy options', as requested by @eric-maynard

  • RoleARN is back as an optional parameter, as requested in the issue [FEATURE REQUEST] On-Premise S3 & Remote Signing #32

  • Added 'Region for client' to align with the latest changes to the AWS implementation

  • Removed passing credentials 'by value'; only environment variable names are accepted. If the parameters are empty, it falls back to the default AWS variables

  • @collado-mike's comments:

    • about 'at least a session token':
      this seems not to be available in the S3-compatible software I've tested, or I missed the information; only AssumeRole (with or without a role) is commonly implemented
    • about SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION belonging in the catalog properties rather than the storage properties:
      Polaris does not forward the catalog properties to where they are needed. Maybe this can be addressed later in another PR?
    • about the STS client being created inside the core/storage classes rather than passed via the constructor:
      This mirrors the Azure implementation, not the AWS one. I have avoided modifying the 'Service' code in this PR; modifications are limited to the REST specs, core storage and regtests.
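
To make the "environment variable names only, with fallback to the default AWS variables" behavior above concrete, here is a minimal sketch. EnvCredentialLookup and its read method are hypothetical names for illustration; the only real identifiers are the standard AWS variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The environment is passed in as a map so the logic can be exercised without touching the real process environment.

```java
import java.util.Map;

/** Hypothetical helper; it sketches the fallback rule and is not code from the PR. */
final class EnvCredentialLookup {

  // Standard AWS SDK environment variable names, used when the catalog names none.
  static final String DEFAULT_ACCESS_KEY_VAR = "AWS_ACCESS_KEY_ID";
  static final String DEFAULT_SECRET_KEY_VAR = "AWS_SECRET_ACCESS_KEY";

  private EnvCredentialLookup() {}

  /**
   * Reads a credential from the given environment. The storage configuration holds
   * only a variable name, never the secret itself; an empty name selects the default.
   * Returns null when the chosen variable is not set.
   */
  static String read(Map<String, String> env, String configuredVarName, String defaultVarName) {
    boolean unset = configuredVarName == null || configuredVarName.trim().isEmpty();
    String varName = unset ? defaultVarName : configuredVarName.trim();
    return env.get(varName);
  }
}
```

In the storage integration this might be called as read(System.getenv(), storageConfig.getS3CredentialsClientAccessKeyId(), EnvCredentialLookup.DEFAULT_ACCESS_KEY_VAR), reusing the getter shown in the reviewed snippet.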

Thank you

5 participants