aws-glue-alpha: Creating database uses account id at synthesis time #32720

Open
1 task
axelson opened this issue Jan 3, 2025 · 4 comments
Labels
@aws-cdk/aws-glue Related to AWS Glue bug This issue is a bug. effort/small Small work item – less than a day of effort p2

Comments

@axelson

axelson commented Jan 3, 2025

Describe the bug

I am using CDK (via the aws-cdk.aws-glue-alpha Python package) to publish a template that users will run. In that template I'm instantiating an aws_glue_alpha.Database. When I synthesize the template with CDK, the account id ends up hard-coded in the synthesized template. I'm relatively sure that this comes from the cdk.Stack.of(this).account here:

catalogId: cdk.Stack.of(this).account,

Regression Issue

  • Select this option if this issue appears to be a regression.

Last Known Working CDK Version

none

Expected Behavior

I expect the catalog to be created in the account of the user who creates the stack from the template, not the account of the user who published the template.

Current Behavior

Since the database is created against a catalog in the wrong AWS account, I get an error like:

Resource handler returned message: "User: arn:aws:iam::550533133XYZ:root is not authorized to perform: glue:CreateDatabase on resource: arn:aws:glue:us-west-2:692859912XYZ:catalog because no resource-based policy allows the glue:CreateDatabase action (Service: Glue, Status Code: 400, Request ID: deae901b-79c4-4f19-843e-4a40b30ebed5)" (RequestToken: 08d0eab1-4651-0c55-d8c9-3aa6c38a87cb, HandlerErrorCode: AccessDenied)

I've tested that Aws.ACCOUNT_ID resolves the error by dropping down to the level 1 construct CfnDatabase. If I instantiate that with catalog_id=aws_cdk.Stack.of(self).account then I get the error, but if I instantiate it with catalog_id=Aws.ACCOUNT_ID then when I deploy the published stack I get a new catalog, which is what I want.
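To illustrate why the two values behave differently, here is a toy model in plain Python. It is not CDK or CloudFormation code: the `resolve` helper and both account IDs are made up for illustration. The idea is that a literal account ID is frozen into the template at synthesis time, whereas a `{"Ref": "AWS::AccountId"}` entry is only resolved by CloudFormation in whichever account deploys the template.

```python
# Toy model only: mimics how a deployment treats a literal value versus a
# Ref to the AWS::AccountId pseudo parameter. Not real CDK/CloudFormation code.

PUBLISHER_ACCOUNT = "111111111111"  # hypothetical account that runs `cdk synth`
DEPLOYER_ACCOUNT = "222222222222"   # hypothetical account that creates the stack


def resolve(value, deploying_account):
    """Resolve a template property value the way a deployment would (simplified)."""
    if isinstance(value, dict) and value.get("Ref") == "AWS::AccountId":
        return deploying_account
    return value  # literals are used exactly as written in the template


# Roughly what Stack.of(self).account becomes when the stack's env pins an account
hard_coded = PUBLISHER_ACCOUNT
# Roughly what Aws.ACCOUNT_ID becomes in the synthesized template
pseudo_param = {"Ref": "AWS::AccountId"}

print(resolve(hard_coded, DEPLOYER_ACCOUNT))    # → 111111111111 (publisher's catalog)
print(resolve(pseudo_param, DEPLOYER_ACCOUNT))  # → 222222222222 (deployer's catalog)
```

In the first case the deployer ends up targeting the publisher's catalog, which matches the AccessDenied error above.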

Reproduction Steps

Here's the python CDK stack:

import aws_cdk
import constructs
from aws_cdk import (
    Aws,
    aws_glue,
    aws_glue_alpha,
)

class FakeStack(aws_cdk.NestedStack):
    """Fake stack to reproduce the error quicker"""

    construct_id: str

    def __init__(
        self,
        scope: constructs.Construct,
        construct_id: str,
        **kwargs,
    ) -> None:
        self.construct_id = construct_id

        super().__init__(
            scope,
            construct_id,
            description=f"{construct_id} nested fake pipeline stack",
            **kwargs,
        )

        # This bakes in the publishing account id
        aws_glue_alpha.Database(
            self,
            f"{self.construct_id}-database",
            database_name=f"{self.construct_id}-nested-database".replace("-", "_"),
        )

        # This uses the deploying account id
        # aws_glue.CfnDatabase(
        #     self,
        #     f"{self.construct_id}-database2",
        #     # unless we use this
        #     # catalog_id=aws_cdk.Stack.of(self).account,
        #     catalog_id=Aws.ACCOUNT_ID,
        #     database_input=aws_glue.CfnDatabase.DatabaseInputProperty(
        #         name=f"{self.construct_id}-nested-database".replace("-", "_")
        #     ),
        # )

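if __name__ == "__main__":
    app = aws_cdk.App()

    # A NestedStack needs a parent Stack as its scope; both ids are placeholders
    parent = aws_cdk.Stack(app, "parent-stack")
    FakeStack(parent, "fake")

    app.synth()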

It can be used with:

cdk synth
cdk-assets publish

Then, when logged in to a separate account, use the generated template to create a new stack via the CloudFormation UI (or with aws cloudformation create-stack, passing in the template).

Possible Solution

I think instead that the account for the catalog should be referenced via Aws.ACCOUNT_ID

Additional Information/Context

No response

CDK CLI Version

2.146.0 (build b368c78) (but I've tested v2.172.0 also)

Framework Version

No response

Node.js Version

v22.12.0

OS

GitHub actions ubuntu-latest (but also reproduced on macOS 15.1.1)

Language

Python

Language Version

Python (3.1.2)

Other information

No response

@axelson axelson added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 3, 2025
@github-actions github-actions bot added the @aws-cdk/aws-glue Related to AWS Glue label Jan 3, 2025
@khushail khushail added needs-reproduction This issue needs reproduction. p2 and removed needs-triage This issue or PR still needs to be triaged. labels Jan 3, 2025
@khushail khushail self-assigned this Jan 3, 2025
@ashishdhingra ashishdhingra added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-reproduction This issue needs reproduction. labels Jan 3, 2025
@ashishdhingra
Contributor

@axelson If you deploy using CDK, the catalog ID is set correctly, because the account that synthesizes the stack is also the account that deploys it. Since you are using CDK only to generate the CFN template and then creating the stack manually via the AWS CLI or the CloudFormation console, you get the undesirable result. However, I understand your scenario.


Reproducible using code below:

import * as cdk from 'aws-cdk-lib';
import * as glue from '@aws-cdk/aws-glue-alpha';

export class GlueTestStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new glue.Database(this, `TestGlue-database`, {
      databaseName: `TestGlue-nested-database`.replace('-', '_')
    });
  }
}

This synthesizes into the CFN template below:

Resources:
  TestGluedatabase24986573:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: "<<ACCOUNT-ID>>"
      DatabaseInput:
        Name: TestGlue_nested-database
    Metadata:
      aws:cdk:path: GlueTestStack/TestGlue-database/Resource
  CDKMetadata:
    Type: AWS::CDK::Metadata
    Properties:
      Analytics: v2:deflate64:H4sIAAAAAAAA/zXLTQqDMBBA4bNkn4w/DeiyYE9gD1DGONpoGsVJ6kK8e5Hq6i0eXw5ZcYNNC1xZmXZUzjawPQOaUeLKr95FgqrzDwzYIJOsOl8TT3ExtEut0M1vhFTcT58cPdB5LrdLP7UEAyffrIQ8BS0GtlYt0Qf7Iaj//QENPTP4jwAAAA==
    Metadata:
      aws:cdk:path: GlueTestStack/CDKMetadata/Default
Parameters:
  BootstrapVersion:
    Type: AWS::SSM::Parameter::Value<String>
    Default: /cdk-bootstrap/hnb659fds/version
    Description: Version of the CDK Bootstrap resources in this environment, automatically retrieved from SSM Parameter Store. [cdk:skip]

The account ID of the current account is hard-coded as the value of the CatalogId property.
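A quick way to spot this before publishing is to scan the synthesized template for Glue databases whose CatalogId is a literal 12-digit string. The sketch below uses only the standard library; the function name and the sample template are hypothetical, and it assumes the template has already been loaded into a dict (for example with `json.load` on a `*.template.json` file from `cdk.out`).

```python
import re

# An AWS account ID is a 12-digit number
ACCOUNT_ID = re.compile(r"\d{12}")


def hard_coded_catalog_ids(template):
    """Return logical IDs of Glue databases whose CatalogId is a literal account ID."""
    hits = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::Glue::Database":
            continue
        catalog_id = resource.get("Properties", {}).get("CatalogId")
        # A Ref shows up as a dict, so only plain strings can be hard-coded IDs
        if isinstance(catalog_id, str) and ACCOUNT_ID.fullmatch(catalog_id):
            hits.append(logical_id)
    return hits


# Hypothetical, trimmed-down template for demonstration
sample = {
    "Resources": {
        "BadDb": {
            "Type": "AWS::Glue::Database",
            "Properties": {"CatalogId": "123456789012", "DatabaseInput": {"Name": "a"}},
        },
        "GoodDb": {
            "Type": "AWS::Glue::Database",
            "Properties": {"CatalogId": {"Ref": "AWS::AccountId"}, "DatabaseInput": {"Name": "b"}},
        },
    }
}

print(hard_coded_catalog_ids(sample))  # → ['BadDb']
```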

Per AWS::Glue::Connection, CatalogId should be the AWS account ID. To specify the account ID, we can use the Ref intrinsic function with the AWS::AccountId pseudo parameter. For example: !Ref AWS::AccountId.

The fix could be to use cdk.Aws.ACCOUNT_ID (or new cdk.ScopedAws(this).accountId) when setting catalogId here. This would emit Ref: AWS::AccountId instead of a hard-coded account ID. For example, the following CDK code:

import * as cdk from 'aws-cdk-lib';
import * as glue from '@aws-cdk/aws-glue-alpha';

export class GlueTestStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new glue.Database(this, `TestGlue-database`, {
      databaseName: `TestGlue-${cdk.Aws.ACCOUNT_ID}-nested-database`.replace('-', '_')
    });
  }
}

generates the CFN template below when synthesized (the database name uses the Ref: AWS::AccountId pseudo parameter):

Resources:
  TestGluedatabase24986573:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: "<<ACCOUNT-ID>>"
      DatabaseInput:
        Name:
          Fn::Join:
            - ""
            - - TestGlue_
              - Ref: AWS::AccountId
              - -nested-database
    Metadata:
      aws:cdk:path: GlueTestStack/TestGlue-database/Resource
  CDKMetadata:
    Type: AWS::CDK::Metadata
    Properties:
      Analytics: v2:deflate64:H4sIAAAAAAAA/zXLTQqDMBBA4bNkn4w/DeiyYE9gD1DGONpoGsVJ6kK8e5Hq6i0eXw5ZcYNNC1xZmXZUzjawPQOaUeLKr95FgqrzDwzYIJOsOl8TT3ExtEut0M1vhFTcT58cPdB5LrdLP7UEAyffrIQ8BS0GtlYt0Qf7Iaj//QENPTP4jwAAAA==
    Metadata:
      aws:cdk:path: GlueTestStack/CDKMetadata/Default
Parameters:
  BootstrapVersion:
    Type: AWS::SSM::Parameter::Value<String>
    Default: /cdk-bootstrap/hnb659fds/version
    Description: Version of the CDK Bootstrap resources in this environment, automatically retrieved from SSM Parameter Store. [cdk:skip]

@ashishdhingra ashishdhingra added effort/small Small work item – less than a day of effort and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Jan 3, 2025
@ashishdhingra ashishdhingra removed their assignment Jan 3, 2025
@axelson
Author

axelson commented Jan 4, 2025

@ashishdhingra Thank you for taking a look! I'm not sure if I fully understand your second code snippet. If I understand correctly that would result in the database name including the correct AWS account id, but the catalog_id would still use the incorrect account id. Am I following correctly?

@ashishdhingra
Contributor

> @ashishdhingra Thank you for taking a look! I'm not sure if I fully understand your second code snippet. If I understand correctly that would result in the database name including the correct AWS account id, but the catalog_id would still use the incorrect account id. Am I following correctly?

@axelson The 2nd snippet just demonstrates, for reference, that using Aws.ACCOUNT_ID emits the Ref: AWS::AccountId pseudo parameter in the synthesized CFN template. (Currently, due to this issue, CatalogId will still use the hard-coded account ID in the output CFN template.)

So once the CDK team fixes the issue, if you take the template and deploy it using the AWS CLI or CloudFormation UI (as you are doing currently), it should use the account ID of the account that deploys the stack.
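Until a fix lands, one possible stop-gap (not something proposed in this thread, and untested against a real deployment) is to post-process the synthesized template before handing it to other accounts, swapping any literal CatalogId on a Glue database for the pseudo parameter. The helper name and sample template below are hypothetical, and the sketch does not deal with asset hashes or other places an account ID might appear.

```python
import copy


def use_deploying_account(template):
    """Return a copy of the template where literal Glue CatalogId values become a Ref."""
    patched = copy.deepcopy(template)  # leave the caller's template untouched
    for resource in patched.get("Resources", {}).values():
        if resource.get("Type") != "AWS::Glue::Database":
            continue
        props = resource.get("Properties", {})
        # Only rewrite plain strings; an existing Ref (a dict) is left as-is
        if isinstance(props.get("CatalogId"), str):
            props["CatalogId"] = {"Ref": "AWS::AccountId"}
    return patched


# Hypothetical, trimmed-down template for demonstration
original = {
    "Resources": {
        "Db": {
            "Type": "AWS::Glue::Database",
            "Properties": {
                "CatalogId": "123456789012",
                "DatabaseInput": {"Name": "testglue_nested_database"},
            },
        }
    }
}

patched = use_deploying_account(original)
print(patched["Resources"]["Db"]["Properties"]["CatalogId"])
```

Dropping down to CfnDatabase with catalog_id=Aws.ACCOUNT_ID, as described in the issue, avoids the post-processing step entirely.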

@axelson
Author

axelson commented Jan 5, 2025

Ah, I see. Thank you!
