@mtulio mtulio commented Jun 2, 2025

Update cloud-provider-aws and the OpenShift clients to pick up the NLB+SG feature, and enable the configuration that provisions security groups (SGs) for all NLBs through the sync transformer.

Ref: openshift/cloud-provider-aws#117

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2025

openshift-ci bot commented Jun 2, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@mtulio mtulio changed the title from "tmp/DNM: validating NLB+SG config" to "DNM/SPLAT-2253: tmp validation of NLB+SG setup" Jun 2, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 2, 2025

openshift-ci-robot commented Jun 2, 2025

@mtulio: This pull request references SPLAT-2253 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

In response to this:

Bumping cloud-provider-aws is crashing, so this focuses on the change for now to be able to validate with cluster-bot.

This PR is created to be used with cluster-bot:

Ref: openshift/cloud-provider-aws#108

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

mtulio commented Jul 24, 2025

/test all

@mtulio mtulio changed the title from "DNM/SPLAT-2253: tmp validation of NLB+SG setup" to "DNM/SPLAT-2253: CCM-AWS config enforce to provision Service NLB with SG under gate" Jul 24, 2025

mtulio commented Sep 10, 2025

PR rebased onto the upstream updates, including the CCCMO feature gate (FG) support added by #400.

mtulio commented Sep 10, 2025

Next step: create a CI job to exercise this scenario.

mtulio commented Sep 10, 2025

/test ?

openshift-ci bot commented Sep 10, 2025

@mtulio: The following commands are available to trigger required jobs:

/test e2e-aws-ovn
/test e2e-aws-ovn-upgrade
/test fmt
/test images
/test lint
/test okd-scos-images
/test security
/test unit
/test vendor
/test verify-deps
/test vet

The following commands are available to trigger optional jobs:

/test e2e-azure-manual-oidc
/test e2e-azure-ovn
/test e2e-azure-ovn-upgrade
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-ibmcloud-ovn
/test e2e-nutanix-ovn
/test e2e-openstack-ovn
/test e2e-vsphere-ovn
/test level0-clusterinfra-azure-ipi-proxy-tests
/test okd-scos-e2e-aws-ovn
/test regression-clusterinfra-vsphere-ipi-ccm

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-aws-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-aws-ovn-upgrade
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-azure-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-azure-ovn-upgrade
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-gcp-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-gcp-ovn-upgrade
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-openstack-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-vsphere-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-fmt
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-images
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-level0-clusterinfra-azure-ipi-proxy-tests
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-lint
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-okd-scos-e2e-aws-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-okd-scos-images
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-regression-clusterinfra-vsphere-ipi-ccm
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-security
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-unit
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-vendor
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-verify-deps
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-vet

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci-robot commented Sep 11, 2025

@mtulio: This pull request references SPLAT-2253 which is a valid jira issue.

In response to this:

Bumping cloud-provider-aws is crashing, so this focuses on the change for now to be able to validate with cluster-bot.

This PR is created to be used with cluster-bot:

Ref: openshift/cloud-provider-aws#117

@mtulio mtulio changed the title from "DNM/SPLAT-2253: CCM-AWS config enforce to provision Service NLB with SG under gate" to "SPLAT-2253/WIP: CCM-AWS config enforce to provision Service NLB with SG under gate" Sep 17, 2025
@openshift-ci-robot openshift-ci-robot removed the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 17, 2025
@openshift-ci-robot

@mtulio: No Jira issue is referenced in the title of this pull request.
To reference a jira issue, add 'XYZ-NNN:' to the title of this pull request and request another refresh with /jira refresh.

mtulio commented Sep 17, 2025

/payload-job ?

openshift-ci bot commented Sep 17, 2025

@mtulio: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

mtulio commented Sep 17, 2025

/testwith openshift/cluster-cloud-controller-manager-operator/main/e2e-aws-ovn openshift/origin#30235 openshift/cloud-provider-aws#117

@deepsm007

/testwith openshift/cluster-cloud-controller-manager-operator/main/e2e-aws-ovn openshift/cloud-provider-aws#117

@mtulio mtulio changed the title from "SPLAT-2253/WIP: CCM-AWS config enforce to provision Service NLB with SG under gate" to "SPLAT-2253: CCM-AWS config enforce to provision Service NLB with SG under gate" Oct 29, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 29, 2025

openshift-ci-robot commented Oct 29, 2025

@mtulio: This pull request references SPLAT-2253 which is a valid jira issue.

In response to this:

Bumping cloud-provider-aws to gather the NLB+SG feature, enabling the configuration to provision SGs for all NLBs through the sync transformer.

Ref: openshift/cloud-provider-aws#117

Update kubernetes/cloud-provider-aws lib to use latest support of Service
type-loadBalancer NLB with support of Security Groups.

Also update the openshift clients with support of kube 1.34.
Update kubernetes/cloud-provider-aws lib to use latest support of Service
type-loadBalancer NLB with support of Security Groups.

mtulio commented Oct 29, 2025

/test all

mtulio commented Oct 29, 2025

/testwith openshift/origin/main/e2e-aws-ovn openshift/cloud-provider-aws#117

openshift-ci bot commented Oct 29, 2025

@mtulio, testwith: could not generate prow job. ERROR:

No ref for requested test included in command. The org, repo, and branch containing the requested test need to be targeted by at least one of the included PRs.
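
The rule in the error above can be illustrated with a small Go sketch. It is not Prow's implementation: the helper names `repoOf` and `testRepoCovered` are hypothetical, the branch is ignored for brevity, and it assumes that the PR where the command is posted counts as one of the included PRs.

```go
package main

import (
	"fmt"
	"strings"
)

// repoOf returns "org/repo" from a test ref ("org/repo/branch/test")
// or from a PR ref ("org/repo#123"). Hypothetical helper.
func repoOf(ref string) string {
	if i := strings.Index(ref, "#"); i >= 0 {
		return ref[:i]
	}
	parts := strings.Split(ref, "/")
	if len(parts) < 2 {
		return ref
	}
	return parts[0] + "/" + parts[1]
}

// testRepoCovered reports whether the repo that owns the requested test is
// either the originating PR's repo (assumption) or the repo of an extra PR.
// The branch check is intentionally left out of this sketch.
func testRepoCovered(originRepo, testRef string, prRefs []string) bool {
	want := repoOf(testRef)
	if want == originRepo {
		return true
	}
	for _, pr := range prRefs {
		if repoOf(pr) == want {
			return true
		}
	}
	return false
}

func main() {
	origin := "openshift/cluster-cloud-controller-manager-operator"
	test := "openshift/origin/main/e2e-aws-ovn"

	// The command from this thread: no openshift/origin PR is included.
	fmt.Println(testRepoCovered(origin, test, []string{"openshift/cloud-provider-aws#117"}))

	// Including an openshift/origin PR satisfies the rule.
	fmt.Println(testRepoCovered(origin, test, []string{"openshift/cloud-provider-aws#117", "openshift/origin#30235"}))
}
```

Under these assumptions, the failing command would need either a test owned by this repository or an additional openshift/origin PR in the list.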

mtulio commented Oct 29, 2025

/test ?

openshift-ci bot commented Oct 29, 2025

@mtulio: The following commands are available to trigger required jobs:

/test e2e-aws-ovn
/test e2e-aws-ovn-upgrade
/test fmt
/test images
/test lint
/test okd-scos-images
/test unit
/test vendor
/test verify-deps
/test vet

The following commands are available to trigger optional jobs:

/test e2e-aws-ovn-techpreview
/test e2e-azure-manual-oidc
/test e2e-azure-ovn
/test e2e-azure-ovn-upgrade
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-ibmcloud-ovn
/test e2e-nutanix-ovn
/test e2e-openstack-ovn
/test e2e-vsphere-ovn
/test level0-clusterinfra-azure-ipi-proxy-tests
/test okd-scos-e2e-aws-ovn
/test regression-clusterinfra-vsphere-ipi-ccm

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-aws-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-e2e-aws-ovn-upgrade
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-fmt
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-images
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-level0-clusterinfra-azure-ipi-proxy-tests
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-lint
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-okd-scos-e2e-aws-ovn
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-okd-scos-images
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-regression-clusterinfra-vsphere-ipi-ccm
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-unit
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-vendor
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-verify-deps
pull-ci-openshift-cluster-cloud-controller-manager-operator-main-vet

mtulio commented Oct 29, 2025

Checking the TechPreview (TP) job while investigating the unit test failure:
/test e2e-aws-ovn-techpreview

Enforce CCM to manage Security Group by default for
security compliance and best practices on Service type-loadBalancer
when using Network Load Balancer (NLB).

mtulio commented Oct 29, 2025

Still investigating how to fix the unit tests that fail because of the feature gate (FG), but the TP job e2e-aws-ovn-techpreview reports the NLB with the SG attached, meaning the cloud-config is correctly enforced and CCM is managing the SG as expected:

{
  "DNSName": "addfb39f8b35849e5b48c04c5a0b5519-cb4f4d6aa976dd4b.elb.us-east-1.amazonaws.com",
  "CreatedTime": "2025-10-29T18:26:11.543000+00:00",
  "LoadBalancerName": "addfb39f8b35849e5b48c04c5a0b5519",
  "State": {
    "Code": "active"
  },
  "Type": "network",
  "AvailabilityZones": [
    {
      "ZoneName": "us-east-1b",
      "SubnetId": "subnet-044987033e04b1f39",
      "LoadBalancerAddresses": []
    },
    {
      "ZoneName": "us-east-1f",
      "SubnetId": "subnet-0a188e6925f2d9c80",
      "LoadBalancerAddresses": []
    }
  ],
  "SecurityGroups": [
    "sg-068e4c01c3183b2d5"
  ],
  "IpAddressType": "ipv4",
  "Scheme": "internet-facing"
}
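
A check like the one done by eye on the output above could be scripted. The following Go sketch decodes one load balancer object and reports whether it is an NLB with at least one security group attached. The struct and the `hasSecurityGroups` function are illustrative names, not part of CCM or of any AWS SDK, and only the fields visible in the output above are modeled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// loadBalancer mirrors only the fields of the JSON output that this check needs.
type loadBalancer struct {
	LoadBalancerName string   `json:"LoadBalancerName"`
	Type             string   `json:"Type"`
	SecurityGroups   []string `json:"SecurityGroups"`
}

// hasSecurityGroups reports whether raw describes a load balancer of type
// "network" with at least one security group attached.
func hasSecurityGroups(raw []byte) (bool, error) {
	var lb loadBalancer
	if err := json.Unmarshal(raw, &lb); err != nil {
		return false, fmt.Errorf("decoding load balancer: %w", err)
	}
	return lb.Type == "network" && len(lb.SecurityGroups) > 0, nil
}

func main() {
	// Trimmed copy of the object reported by the TP job.
	raw := []byte(`{
	  "LoadBalancerName": "addfb39f8b35849e5b48c04c5a0b5519",
	  "Type": "network",
	  "SecurityGroups": ["sg-068e4c01c3183b2d5"]
	}`)

	ok, err := hasSecurityGroups(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("NLB has security groups attached:", ok)
}
```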

mtulio commented Oct 29, 2025

Unit tests pass locally:

/test unit

mtulio commented Oct 29, 2025

verify and vendor are also green. Converting to ready for review and triggering the optional TP job again, followed by OTE (test with CCM and Origin, next comment):

/test e2e-aws-ovn-techpreview

@mtulio mtulio marked this pull request as ready for review October 29, 2025 20:49
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 29, 2025
@openshift-ci openshift-ci bot requested review from chrischdi and racheljpg October 29, 2025 20:49

mtulio commented Oct 29, 2025

/testwith openshift/cluster-cloud-controller-manager-operator/main/e2e-aws-ovn-techpreview openshift/cloud-provider-aws#117 openshift/origin#30235

openshift-ci bot commented Oct 30, 2025

@mtulio: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/verify-deps | 31e89ae | link | true | /test verify-deps |
| ci/prow/vendor | 31e89ae | link | true | /test vendor |
| ci/prow/unit | 31e89ae | link | true | /test unit |

Full PR test history. Your PR dashboard.

openshift-ci-robot commented Oct 30, 2025

@mtulio: This pull request references SPLAT-2253 which is a valid jira issue.

mtulio commented Oct 30, 2025

Upgrading from k8s 1.33 to 1.34 changed the JSON marshaling behavior, and it looks like updating the OpenShift clients causes the resourceapply unit tests to fail when calculating the hash of an object.

Since this is unrelated to the changes introduced in this PR, I will open a separate thread to discuss the correct approach. For now, my view is that this blocks the PR, because it requires updating cloud-provider-aws to 1.34 (which in turn requires both the OpenShift (o) and Kubernetes (k) libraries at 1.34).

> The Problem
The spec-hash annotation is used in production to detect if a resource's spec has changed. Looking at the code:
- Change Detection: The hash is compared to determine if an update is needed
- Backward Compatibility: If the hash calculation changes between library versions, existing resources with old hashes will be incorrectly detected as "changed" and unnecessarily updated

> The Risk
When upgrading from 1.33 to 1.34:
- Existing resources in production will have hashes computed with the old JSON marshaling
- New code will compute different hashes for the same spec due to JSON marshaling changes
- This will cause unnecessary updates to all existing resources on first deployment after upgrade
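
The mechanism described above can be reproduced in isolation. The Go sketch below is not the resourceapply code: `specHash`, `needsUpdate`, and the two struct types are hypothetical, and the differing `omitempty` tag only stands in for whatever marshaling change happened between 1.33 and 1.34. It shows that when the encoded bytes of an unchanged spec differ, a hash stored by the old code no longer matches and an update is triggered.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// specOld and specNew carry the same logical data. Only the struct tag differs,
// as a stand-in for a serialization change between library versions.
type specOld struct {
	Replicas int               `json:"replicas"`
	Labels   map[string]string `json:"labels"`
}

type specNew struct {
	Replicas int               `json:"replicas"`
	Labels   map[string]string `json:"labels,omitempty"`
}

// specHash hashes the JSON encoding of v, so any change in the encoded bytes
// changes the hash even if the logical content is the same.
func specHash(v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(b)), nil
}

// needsUpdate compares a previously stored hash with a freshly computed one.
func needsUpdate(stored string, v any) (bool, error) {
	h, err := specHash(v)
	if err != nil {
		return false, err
	}
	return h != stored, nil
}

func main() {
	// Hash stored on the resource by the "old" code.
	stored, err := specHash(specOld{Replicas: 2})
	if err != nil {
		panic(err)
	}

	// The "new" code encodes the same logical spec differently.
	changed, err := needsUpdate(stored, specNew{Replicas: 2})
	if err != nil {
		panic(err)
	}
	fmt.Println("update triggered without a real spec change:", changed)
}
```

One design option this suggests, offered only as a possibility for the follow-up discussion, is to hash a canonical representation that is pinned independently of the library's default encoder.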

cc @rvanderp3 @damdo

mtulio commented Nov 7, 2025

This PR is blocked by #428, which will provide the bump as well as fixes for the issues found in the unit tests.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 7, 2025