-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFE-1133: Watch Infrastructure and update AWS tags #137
base: main
Are you sure you want to change the base?
Conversation
@chiragkyal: This pull request references CFE-1133 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
3d5641f
to
f413d74
Compare
/test images |
/retest-required |
/test all |
afad1c2
to
deb501d
Compare
@chiragkyal: This pull request references CFE-1133 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
c177a9c
to
48c649a
Compare
/retest-required |
- Added watch on Infrastructure object to detect changes in PlatformStatus.AWS.ResourceTags. - Merged tags from AWSLoadBalancerController.Spec.AdditionalResourceTags and PlatformStatus.AWS.ResourceTags, prioritizing operator spec tags. Signed-off-by: chiragkyal <[email protected]>
/retest |
/assign @alebedev87 |
/retest |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest-required |
54edf38
to
650dd1b
Compare
/retest-required |
/retest-required |
1 similar comment
/retest-required |
Signed-off-by: chiragkyal <[email protected]>
/retest-required |
managed, err := isManagedServiceCluster(context.TODO(), kubeClientSet) | ||
if err != nil { | ||
t.Fatalf("failed to check managed cluster %v", err) | ||
} | ||
if managed { | ||
t.Skip("Infrastructure status cannot be directly updated on managed cluster, skipping...") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Infrastructure status cannot be directly updated on ROSA clusters. Hence skipping the test.
func isManagedServiceCluster(ctx context.Context, adminClient kubernetes.Interface) (bool, error) { | ||
_, err := adminClient.CoreV1().Namespaces().Get(ctx, "openshift-backplane", v1.GetOptions{}) | ||
if err == nil { | ||
return true, nil | ||
} | ||
|
||
if !errors.IsNotFound(err) { | ||
return false, err | ||
} | ||
|
||
return false, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is inspired by https://github.com/openshift/origin/pull/29216/files#diff-35a89a7a7362642eebb559fb8564e857b00d6f7dd6322c3adabaf1adbd609d35R2267-R2278 implementation.
xRef: Slack discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add this link as a comment so that it would be easier to track this later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, added the PR link as code comment
} | ||
|
||
func desiredContainerArgs(controller *albo.AWSLoadBalancerController, clusterName, vpcID string) []string { | ||
func desiredContainerArgs(controller *albo.AWSLoadBalancerController, platformStatus *configv1.PlatformStatus, clusterName, vpcID string) ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cluster name and vpcID are still more fundamental inputs for the flags than the resource tags:
func desiredContainerArgs(controller *albo.AWSLoadBalancerController, platformStatus *configv1.PlatformStatus, clusterName, vpcID string) ([]string, error) { | |
func desiredContainerArgs(controller *albo.AWSLoadBalancerController, clusterName, vpcID string, platformStatus *configv1.PlatformStatus) ([]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Updated
platformStatus := infraConfig.Status.PlatformStatus | ||
if platformStatus == nil { | ||
return ctrl.Result{}, fmt.Errorf("failed to determine infrastructure platform status: status.platformStatus is nil") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the platform status ever be nil
? If it can, I don't think we have to block the reconciliation - we just don't add platform tags in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
platform status should not be nil
for clusters deployed on AWS. We need the AWS region as well from the platform status for the operator to function correctly.
aws-load-balancer-operator/main.go
Lines 230 to 233 in 59f0616
if infra.Status.PlatformStatus == nil || infra.Status.PlatformStatus.AWS == nil || infra.Status.PlatformStatus.AWS.Region == "" { | |
err = fmt.Errorf("could not get AWS region from Infrastructure %q status", clusterInfrastructureName) | |
return | |
} |
So, I think we should block the reconciliation if it is nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the AWS region as well from the platform status for the operator to function correctly.
Right, that's what we ensure at the startup of the operator. The region is fundamental for the operator, without it the operator cannot function. Unlike resource tags (platform or additional).
I don't see why a cluster may not have any platform resource tags. So, I consider platformStatus==nil
as the same use case as "customer didn't set any resource tags during installation".
// `--default-tags` arg allows a maximum of 24 user tags, but the combination of | ||
// controller.Spec.AdditionalResourceTags and platformStatus.AWS.ResourceTags can result in more. | ||
// Ensure a maximum of 24 merged tags are added. | ||
if len(tags) > 24 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be a constant? And can you please add some links which confirm this number. AWS limitation is 50 tags per resource so I don't quite get where 24
is coming from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
24
is coming from the validation added for the AdditionalResourceTags
field.
aws-load-balancer-operator/api/v1/awsloadbalancercontroller_types.go
Lines 61 to 77 in 59f0616
// additionalResourceTags are the AWS tags that will be applied to all AWS resources managed by this | |
// controller. The managed AWS resources don't include the cluster subnets which are tagged by the operator. | |
// The addition of new tags as well as the update or removal of any existing tags | |
// will be propagated to the AWS resources. The controller owns all the tags of the managed AWS resources, | |
// unsolicited tags are removed. The controller doesn't watch for changes on AWS, so the removal of the unsolicited | |
// tags can only be triggered by an event coming from OpenShift. AWS supports a maximum of 50 tags per resource. | |
// AWSLoadBalancerController reserves 3 tags for its use, the rest is split between the tag annotation | |
// which can be set on the ingress and this field: 23 and 24, respectively. Each tag key must be unique. | |
// | |
// +kubebuilder:validation:MaxItems=24 | |
// +kubebuilder:validation:Optional | |
// +optional | |
// +listType=map | |
// +listMapKey=key | |
// +patchMergeKey=key | |
// +patchStrategy=merge | |
AdditionalResourceTags []AWSResourceTag `json:"additionalResourceTags,omitempty" patchStrategy:"merge" patchMergeKey:"key"` |
Since we are now merging these tags with the Infrastructure tags, isn't this number still valid?
// Add tags from platformStatus.AWS.ResourceTags to the map, only if the key doesn't exist | ||
if platformStatus.AWS != nil && len(platformStatus.AWS.ResourceTags) > 0 { | ||
for _, t := range platformStatus.AWS.ResourceTags { | ||
if len(t.Key) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
platformStatus.AWS.ResourceTags.Key
has OpenAPI schema validations which exclude empty value:
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=128
// +kubebuilder:validation:Pattern=`^[0-9A-Za-z_.:/=+-@]+$`
So, this is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch. Removed the if block.
if tc.platformStatus == nil { | ||
tc.platformStatus = &configv1.PlatformStatus{} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should handle the nil
case inside the function. That is, if platformStatus
is nil
we just don't add platform tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to #137 (comment) reason, I think having nil
platformStatus
won't even start the operator.
@@ -854,11 +962,15 @@ func TestEnsureDeploymentEnvVars(t *testing.T) { | |||
VPCID: "test-vpc", | |||
AWSRegion: testAWSRegion, | |||
} | |||
_, err := r.ensureDeployment(context.Background(), tc.serviceAccount, "test-credentials", "test-serving", tc.controller, nil) | |||
_, err := r.ensureDeployment(context.Background(), tc.serviceAccount, "test-credentials", "test-serving", tc.controller, &configv1.PlatformStatus{}, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about nil
value being handled inside the function as other args of this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the same reason, I think we won't even reach upto ensureDeployment
if platformStatus
is nil
.
test/e2e/operator_test.go
Outdated
alb.Spec.AdditionalResourceTags = []albo.AWSResourceTag{ | ||
{Key: "op-key1", Value: "op-value1"}, | ||
{Key: "conflict-key1", Value: "op-value2"}, | ||
{Key: "conflict-key2", Value: "op-value3"}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add this to albcBuilder
? Something like withTags(map[string]string)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just now observed that there is already a similar function: withResourceTags(map[string]string)
. I've updated it to use this function instead. Thanks for the suggestion.
|
||
// Check `--default-tags` arg for tags present in alb instance and infra status (initialInfraTags) | ||
expectedTagValue := "conflict-key1=op-value2,conflict-key2=op-value3,op-key1=op-value1,plat-key1=plat-value1" | ||
assertContainerArgFromDeployment(t, dep, awsLoadBalancerControllerContainerName, "--default-tags", expectedTagValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is not quite different from the unit tests for desired deployment. We already assure the deployment gets the right flag values. I think the e2e test should check the the result of ALBC's job - load balancers. You can use https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/elasticloadbalancingv2 to check the infrastructure tags are propagated to a load balancer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking differently. The task of the ALBO is to set the flags correctly, next it's the job of ALBC to consume those flags and take necessary actions (adding tags in this case). If we can validate that the --default-tags
flag is getting populated correctly, I hope we can trust ALBC to propagate them to a load balancer.
I would expect ALBC's e2e to query the tags using https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/elasticloadbalancingv2
In case of CIO also, we did a similar check. We're checking whether the operator has correctly set the annotations or not. https://github.com/openshift/cluster-ingress-operator/pull/1148/files#diff-cf4b4e5424070a666a364d1d1b04011478d888d6d3d65c943ca7783333b7a6e4R1356-R1358
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The task of the ALBO is to set the flags correctly, next it's the job of ALBC to consume those flags and take necessary actions (adding tags in this case). If we can validate that the --default-tags flag is getting populated correctly, I hope we can trust ALBC to propagate them to a load balancer.
We don't have much control over what's validated upstream. The e2e tests for our addon operators verify use cases we think are nominal for our customers. And the idea here is to go from one end (API) to the other end (load balancers). That's why all the other e2e tests send real traffic to check the load balancers are in the shape we expect them to be. The validation of the expected values for --default-tags
flag is what a unit test can do even without envtest
, a fake Kube client would be enough.
I would expect ALBC's e2e to query the tags using https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/elasticloadbalancingv2
I would expect too. But I don't think we know this for sure.
In case of CIO also, we did a similar check. We're checking whether the operator has correctly set the annotations or not. https://github.com/openshift/cluster-ingress-operator/pull/1148/files#diff-cf4b4e5424070a666a364d1d1b04011478d888d6d3d65c943ca7783333b7a6e4R1356-R1358
C-I-O case is different. The creation of load balancers for Service
s is out of control of C-I-O, it's not done by anything C-I-O manages or being responsible for.
Frankly speaking, I would be ok if this feature would come with a set of unit tests only. However since you already into the e2e stage, we should be consistent with the rest of the tests and go to very end (AWS).
func isManagedServiceCluster(ctx context.Context, adminClient kubernetes.Interface) (bool, error) { | ||
_, err := adminClient.CoreV1().Namespaces().Get(ctx, "openshift-backplane", v1.GetOptions{}) | ||
if err == nil { | ||
return true, nil | ||
} | ||
|
||
if !errors.IsNotFound(err) { | ||
return false, err | ||
} | ||
|
||
return false, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add this link as a comment so that it would be easier to track this later?
Signed-off-by: chiragkyal <[email protected]>
@chiragkyal: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This PR adds support for AWS user tags specified in the platform status. Tags can be managed centrally in the
PlatformStatus
and have them automatically applied to resources managed by the ALBC.Infrastructure
object to detect changes in theplatformStatus.AWS.ResourceTags
field.AWSLoadBalancerController.Spec.AdditionalResourceTags
and thePlatformStatus.AWS.ResourceTags
into a single list, prioritizing tags from the operator spec.--default-tags
argument.Implements CFE-1133