Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Api plus Aurora Serverless v2 #6250

Merged

Conversation

max-webster
Copy link
Collaborator

@max-webster max-webster commented Mar 15, 2024

This pull request updates the Aurora lending library example for Python, the Aurora item tracker example for Python, and the resources/cdk/aurora_serverless_app CloudFormation and CDK setup code. All with changes
to get everything to work with Aurora Serverless v2 and PostgreSQL instead of MySQL.

The Postgres engine switch is needed because currently, Data API for Serverless v2 only supports Postgres.

The PHP and Java examples that use this same resources/cdk/aurora_serverless_app resource setup should continue to work because I already added PostgreSQL-specific setup info into the READMEs. The actual SQL in those examples is generic and should work equally in PostgreSQL or MySQL.

Addresses #5722.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions github-actions bot added C++ This issue relates to the AWS SDK for C++ DotNet-v3 This issue relates to the AWS SDK for .NET V3 Go-v1 This issue relates to the AWS SDK for Go V1 Go-v2 This issue relates to the AWS SDK for Go V2 Java-v1 This issue relates to the AWS SDK for Java V1 Javascript-v2 This issue relates to the AWS SDK for Javascript V2 Javascript-v3 This issue relates to the AWS SDK for Javascript V3 CLI Relates to the AWS CLI GitHub settings This affects GitHub settings Tools This issue relates to a custom tooling to streamline development tasks labels Mar 15, 2024
@max-webster
Copy link
Collaborator Author

max-webster commented Mar 15, 2024

This branch has conflicts that must be resolved

Hmm I did pull down main on March 15 and did all the conflict resolution with my files in the submitted branch. That was e9aa5e2. I expect some strangeness with python/cross_service/aurora_rest_lending_library/rds_tools/aurora_tools.py because it was deleted in a recent commit but I put it back. The other files that are listed are the ones that I conflict-resolved as part of my git rebase on the branch.

@max-webster
Copy link
Collaborator Author

I had to do a lot more in aurora_serverless_app/setup.ts than I wanted - retrieve the default VPC, get the AWS account ID and AWS Region out of environment variables, assert that it was OK to have public subnets in the VPC. And still there was a warning about the egress rules in the security group. It's possible that the CDK part could be simplified a bit and made more consistent with the setup.yaml file that I modified independently.

@max-webster
Copy link
Collaborator Author

The commit that caused the change in all the conflicting files is:

dac6a42 Python - SecretsManager : Remove snippets from Code Library & delete unused files (#5797)

which also touches a number of files outside the directories I'm concerned with.

Copy link
Contributor

@ford-at-aws ford-at-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to not change 5,000+ files. Just the files you modified.

@DavidSouther
Copy link
Contributor

Hi, Max - not sure how this got in this state, but this should untangle you:

git remote add aws-doc-sdk-examples [email protected]:awsdocs/aws-doc-sdk-examples.git
git switch master
git fetch aws-doc-sdk-examples main
git reset --hard aws-doc-sdk-examples/main
git switch -c tmp
git cherry-pick e9aa5e2
git cherry-pick 28bcd16
git cherry-pick 1456bb1
git switch data_api_plus_serverless_v2_final_final
git reset --hard tmp
git push --force

That should end up with a branch with just the three commits from Friday. If there were more that I missed in the log, of course add their cherry-picks in order. Here's the copy I made: https://github.com/awsdocs/aws-doc-sdk-examples/compare/main...DavidSouther:aws-doc-sdk-examples:tmp_max_webster?expand=1

@max-webster
Copy link
Collaborator Author

I don't think those 3 cherry-picks have everything. This bottom one should include about 10 files:

commit 2c122793bcd7a47e41535afea5570195fb946e90
Author: John Russell <[email protected]>
Date:   Fri Mar 15 13:35:08 2024 -0700

    Bring back bits of code needed for the Serverless v2 mods to work
    
    Some setup code and helper functions were deleted in an interim commit
    while these changes were in progress.

:100644 100644 dc3465636 ba411ffd6 M    python/cross_service/aurora_rest_lending_library/library_demo.py

Maybe those changes got diverted elsewhere in the history when I did the conflict resolution with

dac6a4224... Python - SecretsManager : Remove snippets from Code Library & delete unused files (#5797)

@DavidSouther
Copy link
Contributor

Gotcha. In that case, you'll need to find them in your local clone's history? They're likely still findable in your reflog, though they may be hiding. Let me know if you'd like to schedule some pair programming time to see if we can find them?

@max-webster max-webster force-pushed the data_api_plus_serverless_v2_final_final branch from 1456bb1 to a78fff7 Compare March 19, 2024 18:25
@github-actions github-actions bot removed C++ This issue relates to the AWS SDK for C++ DotNet-v3 This issue relates to the AWS SDK for .NET V3 Go-v1 This issue relates to the AWS SDK for Go V1 Go-v2 This issue relates to the AWS SDK for Go V2 Java-v1 This issue relates to the AWS SDK for Java V1 Javascript-v2 This issue relates to the AWS SDK for Javascript V2 Javascript-v3 This issue relates to the AWS SDK for Javascript V3 CLI Relates to the AWS CLI GitHub settings This affects GitHub settings Tools This issue relates to a custom tooling to streamline development tasks labels Mar 19, 2024
@github-actions github-actions bot added Python This issue relates to the AWS SDK for Python (boto3) CDK Relates to the AWS Cloud Development Kit (CDK) labels Mar 19, 2024
…lity with Aurora Serverless v2

Because both of these examples make use of Data API, and Data API for Serverless v2
currently supports Aurora PostgreSQL but not Aurora MySQL, modernizing to Serverless v2
also means switching database engines to a recent major version of Aurora PostgreSQL.

Adapt Lending Library for using Serverless v2 in the demo:

* Update wording in README to focus less on Serverless cluster notion.
* Enhance the code that implements "wait for cluster to be created"
  Make it also wait for the associated DB instance.
* Create a PostgreSQL wrapper for the library app equivalent to the MySQL one.
  Using PostgreSQL data types, SQL idioms, and in particular the RETURNING
  clause for INSERT statements as the way to get back the auto-generated ID
  value(s) from the new rows.

Substantive changes are mainly around auto-increment columns.
Instead of adding the AUTO_INCREMENT keyword into the column definition,
the autoincrement attribute in the Python source just indicates which
column(s) to skip in INSERT statements. It's the responsibility of the
caller to specify one of the PostgreSQL auto-increment types like
SERIAL or BIGSERIAL.

* Add debugging output to see what's being executed for batch SQL statements.
* Add more debugging code around interpretation of results from batch SQL statement.
* Make the insert() operation use a RETURNING * clause.
  Since Data API for PostgreSQL doesn't include data in the generatedFields
  pieces in the result set.
* Make the INSERT statement for the Authors table work from a single big string.
  Supply all the VALUES data as part of the original SQL string and submitted to
  ExecuteStatement, not an array of parameters used with BatchExecuteStatement.

If the VALUES string is constructed with single-quoted values,
e.g. ('firstname','lastname'), then it's vulnerable to a syntax
error for names like O'Quinn that contain a single quote.
So I'm making the delimiters be $$first$$ and $$last$$ to
avoid any possibility of collisions.

* Add some more debugging output around submitting SQL to lend or return books.
  Also exception/debug/tracing code to verify
  exactly which operations fail and what the parameters
  and return values are around that vicinity.

* Change from IS NULL to IS NOT DISTINCT FROM NULL in get_borrowed_books() and return_book().
  Because the substitution at the point of 'null' isn't allowed
  in Postgres with JDBC protocol. Even though it is allowed in MySQL.

* Be more flexible in date/time-related types that get the DATE type hint.
  Don't cast today's date to a string, so it's recognized as a date/time type.

Set up CDK setup path for the cross-service resources to use Serverless v2 instead of Serverless v1:

* Create DatabaseCluster instead of ServerlessCluster.
* Include the 'enable Data API' / 'enable HTTP endpoint' flag.
  Added recently to DatabaseCluster in this pull request: aws/aws-cdk#29338
* Updated the CDK version to 2.132.1, a recent enough version that includes that ^^^ pull request.
* Switch from instanceProps to writer as the attribute of the DatabaseCluster. Based on deprecation
  of instanceProps that happened in Nov 2023.
* Changes to VPC and subnets to get example to work with existing network resources.
  Had to boost VPC and VPC subnets attributes out of the instance and up to the cluster level.
* Switched to Serverless v2 instance for the writer. Now serverless vs. provisioned is a
  different method call instead of asking for a different instance class.

Reformat Python code with 'black' linter.

For the aurora_serverless_app CloudFormation stack:

* Make the cluster and associated instance be Aurora PostgreSQL version 15
* Update CFN stack to refer to the Serverless v2 scaling config attribute.
* Add MinCapacity and MaxCapacity fields with numeric values.
* Take out Autopause attribute.
* See: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-rds-dbcluster-serverlessv2scalingconfiguration.html
* Take out engine mode entirely for cluster. (Serverless v2 uses provisioned engine mode, which is the default.)
* Add RDSDBInstance section. In Serverless v2, the Aurora cluster does have DB instances.
  It's the DB instances that are marked as serverless via the 'db.serverless' instance class.
* Add a DependsOn attribute so instance creation waits for the cluster to be created first.

In the Python item tracker code:

* Apply same Serverless v2 + Data API change to INSERT statement as in PHP and Java examples, which
  were updated in earlier pull requests.
* Turn the DDL statement into a query. Get the auto-generated ID value back from
  "records" instead of "generatedFields".
@max-webster max-webster force-pushed the data_api_plus_serverless_v2_final_final branch from a78fff7 to 508a9c8 Compare March 19, 2024 20:04
@ford-at-aws
Copy link
Contributor

ford-at-aws commented Mar 19, 2024

Question - what happened with this PR that you said would resolve the need for using the SDK to create resources that can be created by the CDK?

I don't want to add scripts for creating resources unless we absolutely have to in order to showcase the feature you want to show.

@max-webster
Copy link
Collaborator Author

I used the extra DatabaseCluster attribute in resources/cdk/aurora_serverless_app/setup.ts to illustrate how to set up an Aurora Serverless v2 cluster + instance. In python/cross_service/aurora_rest_lending_library, I left the original Python setup code so that that example had the same constituent pieces as before. With Serverless v2, there is very little that's different on the application code side - the big difference is in the setup and teardown logic. So between the 2 examples (Lending Library with API calls and Item Tracker with CFN and CDK), I wanted to keep demonstrations of all the places you would take out the old keywords, function calls, and assumptions from Serverless v1 and what you would change them to in Serverless v2. Otherwise the crucial facts that people need to learn are abstracted and glossed over.

@max-webster
Copy link
Collaborator Author

I don't mind adding in an alternative setup mechanism as its own discrete change. I didn't delete the config.yml file from the Lending Library example. But let's work out the CDK best practices first with the Item Tracker updates, because that project was already using CDK. I had to make more changes in resources/cdk/aurora_serverless_app/setup.ts than I really wanted to related to VPCs and subnets. Maybe the result could be improved to handle all edge cases. (What if the user doesn't have environment variables declared for default account and region, what if the user doesn't have any VPCs or security groups set up already, what if the user is already at their account/region limit of 5 VPCs.)

@ford-at-aws ford-at-aws added On Call Review needed This work needs an on-call review Task A general update to the code base for language clarification, missing actions, tests, etc. labels Mar 20, 2024
@rlhagerm rlhagerm added On Call Review complete On call review complete and removed On Call Review needed This work needs an on-call review labels Mar 21, 2024
@rlhagerm rlhagerm merged commit 9672b6e into awsdocs:main Mar 21, 2024
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Relates to the AWS Cloud Development Kit (CDK) On Call Review complete On call review complete Python This issue relates to the AWS SDK for Python (boto3) Task A general update to the code base for language clarification, missing actions, tests, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants