Disable override client-side settings on Athena workgroups #5703
Replies: 2 comments
-
Wording of the docs on disabling Override client-side settings:
-
I was thinking about what @gwionap said yesterday, specifically that if we go with the alternative and enable override client-side settings on primary, users will be able to access each other's query results because data will be saved to the same location. I remembered that I did some testing with awswrangler, which saves results to the default
I got an alpha user to try to access the file and he couldn't access it, even though he has get/put access to aws-athena-query-results-*:
So maybe Athena is somehow able to ensure that only the right people can access query results? It would require further testing.
-
Note
Using the proposed RFC template
Need (Context / Problem)
Athena Workgroups
Athena uses workgroups to separate workloads, control team access, enforce configuration, and track query metrics and control costs. Primary is the default workgroup. We have created custom workgroups for dbt and Airflow to make it easier to track costs and debug issues.
The primary and dbt Athena workgroups have override client-side settings turned off. This means that Athena uses the client's settings for all queries that run in the workgroup, including the settings for expected bucket owner, encryption, control of objects written to the query results bucket, and query results location.
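To make the effect of the override flag concrete, here is a minimal sketch of how the effective settings for a query could be modelled. `ResultSettings` and `effective_settings` are hypothetical names used only for illustration, and falling back to the workgroup value field by field when the client sets nothing is a simplifying assumption, not a statement of Athena's exact behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResultSettings:
    """Hypothetical subset of Athena result settings discussed in this RFC."""
    output_location: Optional[str] = None
    encryption_option: Optional[str] = None  # e.g. "SSE_S3" or "SSE_KMS"


def effective_settings(
    workgroup: ResultSettings,
    client: ResultSettings,
    override_client_side: bool,
) -> ResultSettings:
    """Return the settings a query would run with under this simplified model."""
    if override_client_side:
        # Override on: workgroup settings win and client settings are ignored.
        return workgroup
    # Override off: client settings win; assumed fallback to the workgroup
    # value for any field the client leaves unset.
    return ResultSettings(
        output_location=client.output_location or workgroup.output_location,
        encryption_option=client.encryption_option or workgroup.encryption_option,
    )
```

With the override off, a client that sets only an output location keeps that location, which is what allows the per-principal prefixes described in the next section.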
Query Result Location
Athena automatically stores query results to a query result location that you can specify in Amazon S3.
We customise the Athena query output location to
s3://mojap-athena-query-dump/{principal-id}
in pydbtools. For example, a query's results get written to
s3://mojap-athena-query-dump/AROAYUIXP4BWSZH6CWPHR:airflow_dev_example_role
when run from Airflow using the airflow_dev_example_role role (see use_kubernetes_athena.py) and the default primary workgroup.
All alpha users automatically get access to their respective mojap-athena-query-dump folder in iam-builder.
This is inconsistent with the custom Airflow Athena workgroups.
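As a rough illustration of the per-principal location, the sketch below builds the keyword arguments for boto3's Athena `start_query_execution` call. It is not pydbtools' actual implementation: the helper names are hypothetical, and it assumes that {principal-id} is the `UserId` value returned by STS `get_caller_identity`, which has the `ROLEID:session-name` shape seen in the example path above.

```python
BUCKET = "mojap-athena-query-dump"


def query_output_location(user_id: str, bucket: str = BUCKET) -> str:
    """Build the per-principal query results prefix (hypothetical helper)."""
    return f"s3://{bucket}/{user_id}"


def start_query_params(sql: str, user_id: str, workgroup: str = "primary") -> dict:
    """Build keyword arguments for athena.start_query_execution.

    The client-side OutputLocation is only honoured when the workgroup
    does not override client-side settings.
    """
    return {
        "QueryString": sql,
        "WorkGroup": workgroup,
        "ResultConfiguration": {
            "OutputLocation": query_output_location(user_id),
        },
    }
```

In practice the result would be passed on as `boto3.client("athena").start_query_execution(**params)`, with `user_id` taken from `boto3.client("sts").get_caller_identity()["UserId"]`.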
Airflow Athena Workgroups
The custom Athena workgroups for Airflow currently override client-side settings. The query output location is set to
s3://mojap-athena-query-dump/
and we have manually added access to
s3://mojap-athena-query-dump/
to all Airflow roles which use a custom workgroup. This is quite permissive, but since it's the Airflow role we were less worried about granting it access to the entire bucket.
We tested disabling the override (see PR), so that we could carry on writing to
s3://mojap-athena-query-dump/{principal-id}
However, Trivy, an open source security scanner, raised this error:
However, I think it's impossible to disable encryption settings.
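For reference, the override toggle maps to the `EnforceWorkGroupConfiguration` field in the Athena API. The sketch below builds the arguments for boto3's `update_work_group` call; it is an illustration only and not how the PR makes the change, and the helper name and the workgroup name in the usage note are hypothetical.

```python
def override_update_params(workgroup: str, enforce: bool) -> dict:
    """Build keyword arguments for athena.update_work_group.

    enforce=True  -> workgroup settings override client-side settings.
    enforce=False -> client-side settings (location, encryption) are used.
    """
    return {
        "WorkGroup": workgroup,
        "ConfigurationUpdates": {
            "EnforceWorkGroupConfiguration": enforce,
        },
    }
```

For example, `boto3.client("athena").update_work_group(**override_update_params("airflow-example", False))` would disable the override on a workgroup named `airflow-example`.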
Disabling encryption settings
According to https://docs.aws.amazon.com/athena/latest/ug/workgroups-settings-override.html, if Override client-side settings is not selected, workgroup settings are not enforced at the client level, including encryption.
This means that when I run this query, I should be saving the result without encryption:
However, the result is still saved with the default SSE-S3 encryption (S3 managed keys).
I have tested to make sure that the encryption parameter is working as expected by setting encryption type to SSE_KMS:
My suspicion is that since Amazon S3 now applies server-side encryption with SSE-S3 as the base level of encryption, it's now impossible to disable encryption settings. I have raised an AWS support ticket to confirm this.
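The check described above could be repeated along the following lines. This is a sketch under assumptions rather than the queries actually run: the helper names are hypothetical, `result_configuration` builds the `ResultConfiguration` argument for `start_query_execution`, and `object_encryption` interprets the `ServerSideEncryption` field of an S3 `head_object` response for the result file.

```python
from typing import Optional

# Values S3 reports in the ServerSideEncryption field of head_object.
_SSE_LABELS = {"AES256": "SSE-S3", "aws:kms": "SSE-KMS"}


def result_configuration(
    output_location: str,
    encryption_option: Optional[str] = None,
    kms_key: Optional[str] = None,
) -> dict:
    """Build a ResultConfiguration, omitting encryption when none is requested."""
    config: dict = {"OutputLocation": output_location}
    if encryption_option is not None:
        encryption = {"EncryptionOption": encryption_option}
        if kms_key is not None:
            encryption["KmsKey"] = kms_key
        config["EncryptionConfiguration"] = encryption
    return config


def object_encryption(head_object_response: dict) -> str:
    """Describe how a result object is encrypted, from a head_object response."""
    value = head_object_response.get("ServerSideEncryption")
    if value is None:
        return "unencrypted"
    return _SSE_LABELS.get(value, value)
```

If the suspicion above is right, a query submitted with no `EncryptionConfiguration` would still produce an object that `object_encryption` reports as SSE-S3.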
Approach
Evaluation
Pros
s3://mojap-athena-query-dump/{principal-id}
which makes it easier to find and assign permissions
Cons
Alternatives
1. Enable override client-side settings
Enable the override client-side settings on all Athena workgroups, including primary. This means that query results will have to be saved to a single location per workgroup, for example
s3://mojap-athena-query-dump/{workgroup}
Pros
Cons
2. Stay AS-IS
Keep the as-is solution: enable override client-side settings on the Airflow Athena workgroups only, and save query results to
s3://mojap-athena-query-dump/
Pros
Cons
s3://mojap-athena-query-dump/
bucket