Fix certificate refresh and add e2e tests #766

Merged
merged 4 commits into from
Nov 6, 2024

petrutlucian94
Contributor

A few issues affect the k8s refresh-certs command:

  • Auto-approving CSRs does not work. K8sd denies the requests because the CSR signature is missing.
  • On worker nodes, the specified certificate expiry date and additional Subject Alternative Names are ignored.

This PR addresses these issues and adds an e2e test that refreshes control-plane and worker node certificates, validating the resulting certificates.

@petrutlucian94 petrutlucian94 requested a review from a team as a code owner October 30, 2024 13:47
We're adding an integration test that refreshes the certificates
for control plane nodes as well as worker nodes. It then verifies
that the new certificates have the requested expiry date and
additional Subject Alternative Names.

For simplicity, we'll configure the cluster to automatically
approve CSR requests.

While at it, we're adding E231 and E226 to the flake8 ignore list
since the linter cannot properly handle format strings.
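
The certificate checks this commit describes (requested expiry date and additional Subject Alternative Names) could be sketched roughly as follows. This is not the test code from the PR: the function name, its parameters, and the tolerance value are hypothetical, and the sketch assumes the expiry timestamp and SAN list have already been extracted from the certificate by some other means.

```python
from datetime import datetime, timedelta, timezone


def check_refreshed_cert(not_after, sans, now, expected_ttl, extra_sans,
                         tolerance=timedelta(minutes=10)):
    """Return a list of problems; an empty list means the cert looks right.

    Hypothetical helper: `not_after` and `sans` are assumed to have been
    read from the refreshed certificate beforehand.
    """
    problems = []

    # The expiry should land close to "refresh time + requested TTL".
    expected_expiry = now + expected_ttl
    if abs(not_after - expected_expiry) > tolerance:
        problems.append(f"unexpected expiry: {not_after.isoformat()}")

    # Every requested extra SAN must be present in the certificate.
    missing = sorted(set(extra_sans) - set(sans))
    if missing:
        problems.append(f"missing SANs: {', '.join(missing)}")

    return problems
```

A tolerance is used because the refresh time and the time the test runs its check will never match exactly.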
The cluster can be configured to automatically approve certificate
signing requests that are issued when refreshing cluster certificates.

However, k8sd rejects the CSR requests since it expects them to
include a signature that is currently missing.

We'll address the problem by adding the missing CSR signature.

Note that the CSR signature is passed through k8s annotations and
thus needs to be base64 encoded. We're updating the unit tests
accordingly.
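
As an illustration of why the encoding is needed: annotation values are strings, while a signature is arbitrary binary data. A minimal sketch of the round trip follows. The annotation key and helper names are hypothetical, and the placeholder bytes stand in for a real signature; the actual k8sd code is written in Go and is not shown here.

```python
import base64

# Hypothetical annotation key, used for illustration only.
SIGNATURE_ANNOTATION = "example.com/csr-signature"


def encode_signature(signature: bytes) -> str:
    """Turn raw signature bytes into an annotation-safe ASCII string."""
    return base64.b64encode(signature).decode("ascii")


def decode_signature(annotations: dict) -> bytes:
    """Recover the raw signature bytes from the annotation value."""
    return base64.b64decode(annotations[SIGNATURE_ANNOTATION], validate=True)


# Placeholder bytes that are not valid UTF-8, as a real signature often is not.
raw_signature = bytes([0x00, 0xFF, 0x10, 0x80])
annotations = {SIGNATURE_ANNOTATION: encode_signature(raw_signature)}
```

Whoever verifies the request must decode the annotation before checking the signature, which is why tests that compare against raw bytes need updating.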
@petrutlucian94 petrutlucian94 force-pushed the cert_tests branch 2 times, most recently from 2cae419 to b3dd7ca Compare October 31, 2024 09:50
Contributor Author

petrutlucian94 commented Oct 31, 2024

The ubuntu 20.04 job fails as the test is unable to retrieve the certificate file:

______________________________ test_cert_refresh _______________________________
Traceback (most recent call last):
  File "/home/runner/work/k8s-snap/k8s-snap/tests/integration/tests/test_util/harness/lxd.py", line 209, in pull_file
    run(
  File "/home/runner/work/k8s-snap/k8s-snap/tests/integration/tests/test_util/util.py", line 37, in run
    return subprocess.run(command, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['lxc', 'file', 'pull', 'k8s-integration-8a6c48-19/etc/kubernetes/pki/kubelet.crt', '/tmp/tmpyxckz3ta']' returned non-zero exit status 1.

It fails consistently on 20.04. Is the lxc file pull command broken or unsupported there? FWIW, this new test is the only one that retrieves files from the harness instance.

It's either that or the strict mode. I'll take a closer look.

LE: looks like with the strict patches, the k8s certs are located in /var/snap/k8s, so
/etc/kubernetes/pki/kubelet.crt becomes /var/snap/k8s/common/etc/kubernetes/pki/kubelet.crt. I'll update the test to check both paths.

The k8s cert dir may be either /etc/kubernetes/pki or
/var/snap/k8s/common/etc/kubernetes/pki. We'll need to update our
integration test to check both locations.
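
Checking both locations could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the PR: `find_cert` and the `exists` callback are hypothetical, and the strict-mode directory is the one quoted in the comment above. In a real test, `exists` would probe the harness instance rather than the local filesystem.

```python
# Candidate certificate directories: classic layout first, then the
# strict-mode layout quoted in the PR comment.
CERT_DIRS = (
    "/etc/kubernetes/pki",
    "/var/snap/k8s/common/etc/kubernetes/pki",
)


def find_cert(name, exists, cert_dirs=CERT_DIRS):
    """Return the first candidate path for which `exists(path)` is true.

    `exists` is a hypothetical callback that reports whether a path is
    present on the instance under test.
    """
    for cert_dir in cert_dirs:
        candidate = f"{cert_dir}/{name}"
        if exists(candidate):
            return candidate
    raise FileNotFoundError(f"{name} not found in any of: {', '.join(cert_dirs)}")
```

Resolving the path first keeps the later file pull to a single attempt, so a failure there points at a real problem rather than a wrong directory.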
The requested expiry date and extra SANs are currently ignored
when refreshing worker node certificates. There's a TODO and
a hard-coded 10y expiry date.

This commit ensures that the specified expiry date and Subject
Alternative Names are properly passed and applied.
Contributor

@bschimke95 bschimke95 left a comment

LGTM

@bschimke95 bschimke95 merged commit 13875ac into canonical:main Nov 6, 2024
17 of 18 checks passed