Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: composer 3 serverless upgrade #182

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

eeaton
Copy link
Contributor

@eeaton eeaton commented Dec 19, 2024

Composer 3 is now GA, which effectively treats it as serverless (all Composer resources are in a producer project) so that we have fewer networking resources to manage and fewer potential GKE-related vulnerabilities flagged by SHA scanners n our own project.

Using Composer 3 will reduce some flaky CI failures we get in Composer 2, where Composer 2 randomly assigns IP with a race condition that occasionally conflicts with IP assigned elsewhere in terraform. Because Composer 3 is in a separate producer project, there are fewer significantly IP resources and possibility for clash on our own project.

In addition to the Composer 3 upgrade, this led to some incidental networking fixes / cleanup to remove networking components and TF variables that were used in Composer 2 but not Composer 3

@eeaton eeaton marked this pull request as draft December 19, 2024 20:10
…SC changes are poorly documented and lead to too much scope creep for this PR. Instead, just add an explicit firewall rule allowing egress to the PSA range
…DB, I can't find where the egress range to AlloyDB is defined.. Fixing this properly might require converting PSA to PSC.
@eeaton eeaton marked this pull request as ready for review December 20, 2024 17:58
@eeaton
Copy link
Contributor Author

eeaton commented Dec 20, 2024

For in-place upgrades: note that I've encountered some odd state behavior in Terraform.

│ on ../../components/dpu-workflow/terraform/main.tf line 78, in resource "google_composer_environment" "composer_env":
│ 78: enable_private_environment = true

│ An argument named "enable_private_environment" is not expected here.

This is a quirk that Composer2 and Composer3 support different sets of config arguments, but use the same TF resource. It is a valid config, but if Terraform state still sees Composer2 in the current environment it will reject the config for Composer3.

I tried a few things, not sure which did it but I eventually got it resolved:

  • terraform taint module.dpu_workflow.google_composer_environment.composer_env
  • manually delete composer and reapply
  • terraform init -upgrade
  • manually remove the .terraform folder

Copy link
Contributor

@eyalbenivri eyalbenivri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good - just a small question about leftover comments - if we're gonna leave commented code in, we should probably document why, otherwise we should probably get rid of it.

I also see alot of networking configuration removed - we don't need it? if not, great - I just want to make sure that this is not missing by accident.

components/dpu-workflow/terraform/variables.tf Outdated Show resolved Hide resolved
components/common-infra/terraform/vpc.tf Show resolved Hide resolved
components/common-infra/terraform/vpc.tf Show resolved Hide resolved
@eeaton
Copy link
Contributor Author

eeaton commented Jan 22, 2025

Thanks for the review Eyal, I've removed a few stray comments you flagged, and am waiting on the results of a triggered workflow both in my own environment and the automated CI environment (which now does the full functional workflow after #184 !) to verify that the removed firewall/DNS resources can be removed safely for Composer 3.

Once that is all green I'll nudge you again for the approval.

…Composer (these files were revived by merge) and 2: remove hardcoding in dpu-subnet module
…ently from Composer 2 and require directpath ranges in firewall rules 34.126.0.0/18
…en when the restricted VIP is configured... Until I can better troubleshoot that behavior as part of VPCSC implementation, I'll leave this with private vip
…wn firewall issues can be triggered by background processes of Composer, but some might be triggered only during the workflow. The second test after workflow will help identify network-related failures in the previous stage
@eeaton
Copy link
Contributor Author

eeaton commented Jan 24, 2025

@eyalbenivri I've now confirmed that everything passes with Composer 3 after removing networking resources. I validated that none of the Composer 2 requirements for firewall and DNS apply to composer 3, but I did identify an undocumented network path requirement addressed by adding allow-google-apis-directpath and raised a bug to improve Composer documentation. Also added functional tests to validate that Composer (or other services) are not triggering the default-deny firewall rules added in a recent PR and merged into this one.

Anyway, it's green now, PTAL

@eeaton eeaton requested a review from eyalbenivri January 24, 2025 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants