HOSTEDCP-2035: Use Client Cert Auth for ARO HCP deployments #156

Open
wants to merge 1 commit into
base: master
from

Conversation

bryan-cox
Member

@bryan-cox bryan-cox commented Oct 7, 2024

Use Client Certificate Authentication for ARO HCP deployments. HyperShift will pass the needed environment variables for this authentication method: ARO_HCP_MI_CLIENT_ID, ARO_HCP_TENANT_ID, and ARO_HCP_CLIENT_CERTIFICATE_PATH.
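
As a rough illustration of the description above, the following sketch reads the three environment variables and reports any that are missing. The `clientCertConfig` type and `loadClientCertConfig` function are hypothetical names, not code from this PR. In the Azure SDK for Go, values like these would typically be passed on to `azidentity.NewClientCertificateCredential`; that wiring is left out here and is an assumption about how the PR uses them.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// clientCertConfig is a hypothetical holder for the three values that
// HyperShift passes through the environment.
type clientCertConfig struct {
	ClientID string
	TenantID string
	CertPath string
}

// loadClientCertConfig reads the variables through getenv (os.Getenv in
// production) and returns an error naming every variable that is unset.
func loadClientCertConfig(getenv func(string) string) (clientCertConfig, error) {
	cfg := clientCertConfig{
		ClientID: getenv("ARO_HCP_MI_CLIENT_ID"),
		TenantID: getenv("ARO_HCP_TENANT_ID"),
		CertPath: getenv("ARO_HCP_CLIENT_CERTIFICATE_PATH"),
	}
	required := []struct{ name, value string }{
		{"ARO_HCP_MI_CLIENT_ID", cfg.ClientID},
		{"ARO_HCP_TENANT_ID", cfg.TenantID},
		{"ARO_HCP_CLIENT_CERTIFICATE_PATH", cfg.CertPath},
	}
	var missing []string
	for _, v := range required {
		if v.value == "" {
			missing = append(missing, v.name)
		}
	}
	if len(missing) > 0 {
		return clientCertConfig{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := loadClientCertConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("client %s in tenant %s, certificate at %s\n", cfg.ClientID, cfg.TenantID, cfg.CertPath)
}
```

Passing `getenv` as a parameter keeps the lookup testable without touching the real process environment.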

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 7, 2024
Contributor

openshift-ci bot commented Oct 7, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@bryan-cox bryan-cox force-pushed the HOSTEDCP-1994 branch 2 times, most recently from c8f4099 to dbc3e29 Compare October 9, 2024 18:35
@bryan-cox bryan-cox marked this pull request as ready for review October 9, 2024 18:35
@bryan-cox bryan-cox changed the title Refactor to use Azure SDK default cred chain HOSTEDCP-1994: Refactor to use Azure SDK default cred chain Oct 9, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 9, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 9, 2024
@openshift-ci-robot

openshift-ci-robot commented Oct 9, 2024

@bryan-cox: This pull request references HOSTEDCP-1994 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target only the "4.18.0" version, but multiple target versions were set.

In response to this:

Refactor to use the Azure SDK for Go's default credential chain function, NewDefaultAzureCredential.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@bryan-cox
Member Author

/retest

@bryan-cox bryan-cox marked this pull request as draft October 18, 2024 20:23
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 18, 2024
@bryan-cox bryan-cox changed the title HOSTEDCP-1994: Refactor to use Azure SDK default cred chain HOSTEDCP-2035: Refactor to use Azure SDK default cred chain Oct 19, 2024
@openshift-ci-robot

openshift-ci-robot commented Oct 19, 2024

@bryan-cox: This pull request references HOSTEDCP-2035 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target only the "4.18.0" version, but multiple target versions were set.

In response to this:

Refactor to use the Azure SDK for Go's default credential chain function, NewDefaultAzureCredential.

@bryan-cox bryan-cox changed the title HOSTEDCP-2035: Refactor to use Azure SDK default cred chain HOSTEDCP-2035: Use Client Cert Auth for ARO HCP deployments Oct 19, 2024
@openshift-ci-robot

openshift-ci-robot commented Oct 19, 2024

@bryan-cox: This pull request references HOSTEDCP-2035 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target only the "4.18.0" version, but multiple target versions were set.

In response to this:

Use Client Certificate Authentication for ARO HCP deployments. HyperShift will pass the needed environment variables for this authentication method: ARO_HCP_MI_CLIENT_ID, ARO_HCP_TENANT_ID, and ARO_HCP_CLIENT_CERTIFICATE_PATH.

@bryan-cox bryan-cox marked this pull request as ready for review October 19, 2024 19:45
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 19, 2024
@bryan-cox
Member Author

/retest

3 similar comments
@bryan-cox
Member Author

/retest

@bryan-cox
Member Author

/retest

@bryan-cox
Member Author

/retest

@bryan-cox
Member Author

/test e2e-azure-ovn

@bryan-cox bryan-cox force-pushed the HOSTEDCP-1994 branch 2 times, most recently from 63df6d1 to bb903ac Compare October 24, 2024 11:45
@bryan-cox
Member Author

/test e2e-gcp-ovn

pkg/cloudprovider/azure.go (outdated, resolved)
pkg/cloudprovider/azure.go (outdated, resolved)
pkg/cloudprovider/azure.go (outdated, resolved)
pkg/cloudprovider/azure.go (resolved)
pkg/cloudprovider/azure.go (outdated, resolved)
pkg/cloudprovider/azure.go (outdated, resolved)
pkg/cloudprovider/azure.go (outdated, resolved)
pkg/cloudprovider/azure.go (outdated, resolved)
pkg/cloudprovider/azure.go (outdated, resolved)
@bryan-cox
Member Author

/test unit

@bryan-cox bryan-cox force-pushed the HOSTEDCP-1994 branch 4 times, most recently from 1eb3c6b to 63e9561 Compare November 1, 2024 23:45
@bryan-cox
Member Author

/retest

Contributor

@kyrtapz kyrtapz left a comment

I am not confident in the new approach. It causes a hard stop and has a potentially significant delay.

var fileContents []byte
var err error

watchCertificateFileOnce.Do(func() {
Contributor

What is the purpose of watchCertificateFileOnce? Is it just to keep initialFileHash intact?
If so, it should certainly be documented, and an error should be returned if the function is called more than once.

Member Author

I can add documentation.

I'm not sure how we can return on a second run of watchCertificateFileOnce.Do. A second run shouldn't be possible, since preventing it is the point of sync.Once.
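
One way to get the behavior the reviewer asks for, sketched here with hypothetical names: `sync.Once` does not tell the caller whether it actually ran the function, so an explicit guard is needed to turn a repeated call into an error. The `onceGuard` type below uses an atomic flag for that. It is an illustration of the idea, not the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errAlreadyStarted is returned to every caller after the first one.
var errAlreadyStarted = errors.New("file watcher already started")

// onceGuard is a hypothetical helper: unlike sync.Once, it reports
// repeated calls to the caller instead of silently skipping them.
type onceGuard struct {
	started atomic.Bool
}

// run executes start only for the first caller. The compare-and-swap
// flips the flag exactly once, even when calls race with each other.
func (g *onceGuard) run(start func() error) error {
	if !g.started.CompareAndSwap(false, true) {
		return errAlreadyStarted
	}
	return start()
}

func main() {
	var g onceGuard
	start := func() error { return nil } // stand-in for starting the watcher

	fmt.Println("first call:", g.run(start))
	fmt.Println("second call:", g.run(start))
}
```

The trade-off compared with `sync.Once` is that later callers do not block until the first call finishes; they fail immediately.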

pkg/filewatcher/filewatcher.go (outdated, resolved)
return
}

initialFileHash = hashSimple(fileContents)
Contributor

What was the reason for going with a periodic hash check instead of a file watch?
With the current implementation, there is a chance we will keep using the old cert file for ~30 minutes, which seems unnecessary.

Member Author

We can certainly poll more often than every 30m for the file check. How quickly would you like it to poll?

Member Author

@bryan-cox bryan-cox Nov 4, 2024

The reason for the periodic hash check: the previous file watcher code was not catching the changes when I was testing this function against the CPO. I could exec into the pod and see that the file had changed, but the pod was not restarted, nor did I see any messages about the file changing.

The current change seemed just as simple and required less code. It was working when I tested the same function against the CPO in openshift/hypershift#4997.
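
For readers following this thread, here is a minimal sketch of a periodic hash check with a configurable interval. All names (`hashWatcher`, `check`, `run`) are hypothetical, SHA-256 stands in for whatever `hashSimple` does in the PR, and logging a failed read and retrying on the next tick is an assumption of this sketch rather than the PR's behavior.

```go
package main

import (
	"context"
	"crypto/sha256"
	"log"
	"os"
	"time"
)

// hashWatcher is a hypothetical poller; it is not the PR's filewatcher package.
type hashWatcher struct {
	read func() ([]byte, error) // loads the watched content
	last [sha256.Size]byte      // digest of the last successful read
}

// newHashWatcher records the digest of the initial content.
func newHashWatcher(read func() ([]byte, error)) (*hashWatcher, error) {
	data, err := read()
	if err != nil {
		return nil, err
	}
	return &hashWatcher{read: read, last: sha256.Sum256(data)}, nil
}

// check reads the content once and reports whether its digest differs
// from the last successfully read digest.
func (w *hashWatcher) check() (bool, error) {
	data, err := w.read()
	if err != nil {
		return false, err
	}
	sum := sha256.Sum256(data)
	if sum == w.last {
		return false, nil
	}
	w.last = sum
	return true, nil
}

// run calls check every interval until ctx is cancelled. A failed read is
// logged and retried on the next tick instead of ending the loop.
func (w *hashWatcher) run(ctx context.Context, interval time.Duration, onChange func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			changed, err := w.check()
			if err != nil {
				log.Printf("read failed, will retry: %v", err)
				continue
			}
			if changed {
				onChange()
			}
		}
	}
}

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: hashwatch <file>")
	}
	path := os.Args[1]
	w, err := newHashWatcher(func() ([]byte, error) { return os.ReadFile(path) })
	if err != nil {
		log.Fatal(err)
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()
	w.run(ctx, 5*time.Second, func() { log.Printf("%s changed", path) })
}
```

With polling, the worst-case delay before a change is noticed is roughly one interval, so the interval is the knob that answers the "how quickly" question above.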


done := make(chan bool)

go func() {
Contributor

@kyrtapz kyrtapz Nov 4, 2024

You run checkForFileChanges as a goroutine, but it starts another one inside, with the done channel waiting for it. Why?

Member Author

Yeah, I didn't need the second goroutine so I removed it.

Member Author

Apparently this is needed. The pod did not restart when I removed it.

klog.Infof("Checking file for changes, %s", fileToWatch)
fileContents, err := os.ReadFile(fileToWatch)
if err != nil {
klog.Error("failed to read the file: %v", err)
Contributor

When a file read fails, we will just stop watching the file, and if it changes later, the change will never be caught.

Member Author

The return should just return from the sync.Once function, and the error would be returned at the end of WatchFileForChanges.
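
The control flow described in this reply can be sketched as follows, using hypothetical names (`watcher`, `watch`, `errReadFailed`) rather than the PR's code. A `return` inside the `sync.Once` closure leaves only the closure, and the enclosing function then returns the captured error. Because `sync.Once` counts the closure as done even when it failed, a later call skips it entirely and returns nil.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errReadFailed stands in for an os.ReadFile error.
var errReadFailed = errors.New("failed to read the file")

// watcher is a hypothetical type shaped like the function under discussion.
type watcher struct {
	once sync.Once
}

// watch runs read inside once.Do and returns the error captured from it.
func (w *watcher) watch(read func() error) error {
	var err error
	w.once.Do(func() {
		if err = read(); err != nil {
			return // exits the closure only; watch still reaches its own return
		}
		// A real watcher would start its polling goroutine here.
	})
	return err
}

func main() {
	w := &watcher{}
	// First call: the read fails and the error reaches the caller.
	fmt.Println("first call:", w.watch(func() error { return errReadFailed }))
	// Second call: once.Do skips the closure, so no read happens and err stays nil.
	fmt.Println("second call:", w.watch(func() error { return nil }))
}
```

In this shape, a failed first read is reported once, but nothing is watching afterwards, and calling the function again does not restart the watch.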

@bryan-cox bryan-cox force-pushed the HOSTEDCP-1994 branch 2 times, most recently from 0c47480 to a637648 Compare November 4, 2024 15:45
Contributor

openshift-ci bot commented Nov 4, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bryan-cox
Once this PR has been reviewed and has the lgtm label, please ask for approval from kyrtapz. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bryan-cox bryan-cox force-pushed the HOSTEDCP-1994 branch 3 times, most recently from 5be01e8 to ca29e0b Compare November 6, 2024 16:21
Use Client Certificate Authentication for ARO HCP deployments.
HyperShift will pass the needed environment variables for this
authentication method: ARO_HCP_MI_CLIENT_ID, ARO_HCP_TENANT_ID, and
ARO_HCP_CLIENT_CERTIFICATE_PATH.

Signed-off-by: Bryan Cox <[email protected]>
Contributor

openshift-ci bot commented Nov 6, 2024

@bryan-cox: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/security | 6ba6f9b | link | false | /test security |
| ci/prow/e2e-openstack-ovn-serial-e2e-only | 6ba6f9b | link | false | /test e2e-openstack-ovn-serial-e2e-only |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

4 participants