[TLS] TLS for public endpoints terminated at a route #502

Merged: 3 commits into openstack-k8s-operators:main from the tls_public_endpoint branch on Oct 26, 2023

@stuggi stuggi commented Oct 2, 2023

Changes to openstackclient:

  • the CRD now allows passing in a CA secret
  • the passed-in CA secret is used to validate endpoint certs

Adds CRD parameters to configure TLS for public and internal endpoints.

  • by default, a self-signed root CA and issuer are created for public and internal certs
  • via the apiOverride.TLS of a service, a secret with cert, key and CA cert can be provided to use instead of the default self-signed cert
  • the user can provide a CA secret whose certs are added to the combined CA secret that the openstack-operator creates and passes to services / openstackclient
  • refactors the current route creation so that the TLS-E follow-up can create certs for each service endpoint
  • when TLS for the public endpoint is enabled, a cert for the route is automatically created and added to the route CR
  • the openstack-operator creates a full tls-ca-bundle.pem using the operator image ca-bundle as base and adds the public, internal and user-provided CAs to it. This allows mounting a full tls-ca-bundle.pem into the deployment pod without relying on kolla to run update-ca-trust, which requires the container to run as root.

TODO:

  • enhance envtest coverage; right now there is only very basic testing

Jira: OSP-26299

Depends-On: openstack-k8s-operators/lib-common#351
Depends-On: openstack-k8s-operators/keystone-operator#318
Depends-On: openstack-k8s-operators/tcib#82

stuggi commented Oct 2, 2023

/hold

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/0f1f5f8f3a1d49ffa13b44610d929572

openstack-k8s-operators-content-provider FAILURE in 13m 49s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi stuggi force-pushed the tls_public_endpoint branch 3 times, most recently from 7971a8c to 5504358 Compare October 4, 2023 07:10
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/38ea081cf999428fa0d44a59c372ac92

✔️ openstack-k8s-operators-content-provider SUCCESS in 58m 50s
podified-multinode-edpm-deployment-crc FAILURE in 41m 25s
cifmw-crc-podified-edpm-baremetal FAILURE in 45m 46s

stuggi commented Oct 4, 2023

@stuggi stuggi requested review from Deydra71 and removed request for lewisdenny October 4, 2023 08:39
@stuggi stuggi force-pushed the tls_public_endpoint branch from 5504358 to b9f2e38 Compare October 4, 2023 14:52
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/5b1fab80098242e6b537cfde00e82105

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 02m 02s
podified-multinode-edpm-deployment-crc FAILURE in 46m 44s
cifmw-crc-podified-edpm-baremetal FAILURE in 48m 36s

@stuggi stuggi requested a review from vakwetu October 5, 2023 08:27
@stuggi stuggi force-pushed the tls_public_endpoint branch from b9f2e38 to bb5f504 Compare October 5, 2023 10:22
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/99ca36e8b1af4de8971657a1566951cd

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 08m 03s
podified-multinode-edpm-deployment-crc FAILURE in 45m 19s
cifmw-crc-podified-edpm-baremetal FAILURE in 49m 01s

stuggi commented Oct 5, 2023

FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (15 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (14 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (13 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (12 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (11 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (10 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (9 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (8 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (7 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (6 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (5 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (4 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (3 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (2 retries left).
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 20, "changed": false, "elapsed": 0, "msg": "Status code was -1 and not [200, 300, 301, 302, 401, 402, 403]: Request failed: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1129)>", "redirected": false, "status": -1, "url": "https://keystone-public-openstack.apps-crc.testing"}

openstack-k8s-operators/ci-framework#639 ignores CA validation for that task, as it is only used to check whether keystone is up.

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/ec9d053e3b9640cd8d357aeada3c342e

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 05m 19s
podified-multinode-edpm-deployment-crc FAILURE in 45m 02s
cifmw-crc-podified-edpm-baremetal FAILURE in 51m 01s

@stuggi stuggi force-pushed the tls_public_endpoint branch from 437dd74 to f21d427 Compare October 6, 2023 05:02
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/108c66420f3649b386948d4056ae5d10

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 19m 10s
podified-multinode-edpm-deployment-crc FAILURE in 53m 45s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 04m 10s

stuggi commented Oct 6, 2023

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/087f2db8a8ba472eb844269b5a885fd3

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 18m 11s
podified-multinode-edpm-deployment-crc FAILURE in 1h 00m 41s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 59m 31s

@stuggi stuggi force-pushed the tls_public_endpoint branch from f21d427 to 35b6926 Compare October 6, 2023 15:55
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/7e15cec698524b28878c47b8a4957b13

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 23m 31s
podified-multinode-edpm-deployment-crc FAILURE in 1h 02m 35s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 05m 56s

@stuggi stuggi force-pushed the tls_public_endpoint branch from 35b6926 to 13efc5b Compare October 9, 2023 11:59

stuggi commented Oct 23, 2023

/test openstack-operator-build-deploy-kuttl

stuggi commented Oct 24, 2023

Enabling TLS for public endpoints currently fails cert validation in the kuttl test on these commands:

+ openstack token issue
Failed to discover available identity versions when contacting https://keystone-public-openstack.apps.oko-03.oooci.ccitredhat.com/. Attempting to parse version from URL.
Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. SSL exception connecting to https://keystone-public-openstack.apps.oko-03.oooci.ccitredhat.com/: HTTPSConnectionPool(host='keystone-public-openstack.apps.oko-03.oooci.ccitredhat.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)')))

submitted openshift/release#44757

// +kubebuilder:validation:Optional
// +operator-sdk:csv:customresourcedefinitions:type=spec
// Issuer - cert-manager issuer to be used for the endpoint type. If not specified a self signed will be created.
Issuer *string `json:"issuer,omitempty"`

@olliewalsh @vakwetu @Deydra71 should we change this to just a secret name that the user provides, instead of an issuer to use? I think that's what we discussed in one of our meetings.

As discussed with @olliewalsh and @Deydra71 on Slack, I'll change the PR so that the parameter is a named secret, which must contain the public/private key, instead of a named issuer.

stuggi commented Oct 24, 2023

This is now changed in the latest version. Per service, you can set apiOverride.tls.secretName to the secret that must hold tls.key, tls.crt and ca.crt. If the secret does not exist, or its data lacks those entries, the operator logs an error.

  keystone:
    apiOverride:
      route: {}
      tls:
        secretName: mycustomkeystonecerts

@stuggi stuggi force-pushed the tls_public_endpoint branch from 0617dd0 to cb87b6f Compare October 24, 2023 14:21
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/18fe0d4bbf254cf6abd16256967c08bf

openstack-k8s-operators-content-provider FAILURE in 5m 16s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

@stuggi stuggi force-pushed the tls_public_endpoint branch from cb87b6f to 5e86f4b Compare October 24, 2023 14:33
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/1128226fbf164775bf62b22c6ea9b11d

openstack-k8s-operators-content-provider FAILURE in 5m 03s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

stuggi commented Oct 24, 2023

recheck

@stuggi stuggi force-pushed the tls_public_endpoint branch from 5e86f4b to 23e9a8d Compare October 24, 2023 15:44
@Deydra71

It all looks good and clear to me. Let's wait for reviews from others.

@stuggi stuggi force-pushed the tls_public_endpoint branch 2 times, most recently from e3046f3 to c1aee56 Compare October 25, 2023 14:24
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/a751381415c045e5bb66f7bcb7370105

openstack-k8s-operators-content-provider FAILURE in 8m 19s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

Changes to openstackclient:
* the CRD now allows passing in a CA secret
* mounts the CA bundle under /etc/pki

Adds CRD parameters to configure TLS for public and internal endpoints.
* by default, a self-signed root CA and issuer are created for
  public and internal certs
* via the apiOverride.TLS of a service, a secret with cert, key and
  CA cert can be provided to use instead of the default self-signed cert
* the user can provide a CA secret whose certs are added to the combined
  CA secret that the openstack-operator creates and passes to services /
  openstackclient
* refactors the current route creation so that the TLS-E follow-up can
  create certs for each service endpoint
* when TLS for the public endpoint is enabled (default), a cert for the
  route is automatically created and added to the route CR
* the openstack-operator creates a full tls-ca-bundle.pem using
  the operator image ca-bundle as base and adds the public, internal
  and user-provided CAs to it. This allows mounting a full tls-ca-bundle.pem
  into the deployment pod without relying on kolla to run
  update-ca-trust, which requires the container to run as root.

Jira: OSP-26299

Depends-On: openstack-k8s-operators/lib-common#351
Depends-On: openstack-k8s-operators/keystone-operator#318
Depends-On: openstack-k8s-operators/tcib#82
@stuggi stuggi force-pushed the tls_public_endpoint branch from c1aee56 to 094241c Compare October 25, 2023 14:53

stuggi commented Oct 26, 2023

/unhold

TLS can be disabled using:

  tls:
    endpoint:
      internal:
        enabled: false
      public:
        enabled: false

openshift-ci bot commented Oct 26, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Deydra71, stuggi

@openshift-ci openshift-ci bot merged commit 757a626 into openstack-k8s-operators:main Oct 26, 2023
1 check passed
cjeanner added a commit to openstack-k8s-operators/ci-framework that referenced this pull request Oct 27, 2023
This reverts commit 00e8d24.

We've seen constant failures in a CI job, linked to certificate
validation:
https://review.rdoproject.org/zuul/builds?job_name=podified-multinode-edpm-e2e-nobuild-tagged-crc&project=openstack-k8s-operators/ci-framework

An example:
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 20, "changed": false, "elapsed": 0, "msg": "Status code was -1 and not [200, 300, 301, 302, 401, 402, 403]: Request failed: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1129)>", "redirected": false, "status": -1, "url": "https://keystone-public-openstack.apps-crc.testing"}

Reverting the CA verification toggle patch seems the safest way to get
back to a green CI.

Note: the new failure may be related to a recent patch:
openstack-k8s-operators/openstack-operator#502
cjeanner added a commit to openstack-k8s-operators/ci-framework that referenced this pull request Oct 27, 2023
This is a *partial* revert of 00e8d24.

We've seen constant failures in a CI job, linked to certificate
validation:
https://review.rdoproject.org/zuul/builds?job_name=podified-multinode-edpm-e2e-nobuild-tagged-crc&project=openstack-k8s-operators/ci-framework

An example:
FAILED - RETRYING: [localhost]: Wait for keystone endpoint to exist in DNS (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 20, "changed": false, "elapsed": 0, "msg": "Status code was -1 and not [200, 300, 301, 302, 401, 402, 403]: Request failed: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1129)>", "redirected": false, "status": -1, "url": "https://keystone-public-openstack.apps-crc.testing"}

Reverting the CA verification toggle patch seems the safest way to get
back to a green CI.
It's still supposed to fetch and install the CA at this point.

We're seeing tempest failures when we do a complete revert.

Note: the new failure may be related to a recent patch:
openstack-k8s-operators/openstack-operator#502
4 participants