Override Ceph version #2574

Merged

fmount
Contributor

@fmount fmount commented Nov 27, 2024

We have historically supported the N+1 server version with external Ceph, and this policy still applies.
For this reason we need a way to do early testing in CI.
This patch adds the ability to override the CentOS Ceph repository and
install the target cephadm release passed as input.

Jira: https://issues.redhat.com/browse/OSPRH-10666

@fmount fmount requested a review from tosky November 27, 2024 13:23

Thanks for the PR! ❤️
I'm marking it as a draft. Once you're happy with it merging and the PR is passing CI, click the "Ready for review" button below.

@github-actions github-actions bot marked this pull request as draft November 27, 2024 13:23
@fmount fmount force-pushed the test_squid branch 2 times, most recently from de0ac66 to 9d1f1cf Compare November 28, 2024 16:07

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0519f4329a914189ae7f5384c6390013

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 04m 20s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 19m 13s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 26m 26s
podified-multinode-hci-deployment-crc FAILURE in 58m 28s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 59s
✔️ cifmw-pod-pre-commit SUCCESS in 6m 58s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 34s
✔️ build-push-container-cifmw-client SUCCESS in 37m 18s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 4m 19s

@fmount fmount force-pushed the test_squid branch 2 times, most recently from d437a63 to a83a33e Compare November 28, 2024 21:04

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/51e82232d4f34df5a6eb74ec7a7e5c14

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 30m 34s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 17m 28s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 16m 18s
podified-multinode-hci-deployment-crc FAILURE in 58m 13s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 27s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 08s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 13s
✔️ build-push-container-cifmw-client SUCCESS in 21m 43s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 5m 07s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/68ae94ace71b4143bd4c78b6529f961d

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 30m 42s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 15m 28s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 18m 24s
podified-multinode-hci-deployment-crc FAILURE in 59m 57s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 17s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 17s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 27s
✔️ build-push-container-cifmw-client SUCCESS in 38m 15s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 5m 10s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4278fce779f24a239cc8073ca3980e52

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 43m 15s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 19m 28s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 31m 11s
podified-multinode-hci-deployment-crc FAILURE in 58m 23s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 39s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 40s
✔️ cifmw-pod-zuul-files SUCCESS in 3m 53s
✔️ build-push-container-cifmw-client SUCCESS in 36m 45s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 4m 26s

@fmount fmount force-pushed the test_squid branch 4 times, most recently from c1777aa to 8427cbc Compare November 29, 2024 15:50
@@ -55,7 +55,9 @@
{% if not cifmw_cephadm_default_container %}--image {{ cifmw_cephadm_container_ns + '/' + cifmw_cephadm_container_image + ':' + cifmw_cephadm_container_tag|string }} \{% endif %}
bootstrap \
--skip-firewalld \
{% if not cifmw_cephadm_prepare_host %}
Contributor Author

@fultonj it looks like from Squid onward we don't need to pass --skip-prepare-host, since everything is already in place (network, lvdevices, podman) by the time we reach this point [1].
I would try a testproject with this change to make sure we don't hit any side effects.

[1] https://github.com/ceph/ceph/blob/squid-release/src/cephadm/cephadm.py#L4494
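
The conditional in the hunk above can be sketched as a standalone task. This is only an illustration: the hunk is cut off before the guarded flag appears, so the sketch assumes the flag is --skip-prepare-host (the one named in the comment), and the task name and the debug wrapper are hypothetical rather than taken from the role.

```yaml
# Hypothetical task: renders the bootstrap flags without running cephadm.
- name: Show the bootstrap flags that would be passed to cephadm
  vars:
    # Assumption: false means the flag is added and host preparation is skipped.
    cifmw_cephadm_prepare_host: false
  ansible.builtin.debug:
    msg: >-
      cephadm bootstrap --skip-firewalld
      {% if not cifmw_cephadm_prepare_host %}--skip-prepare-host{% endif %}
```

With cifmw_cephadm_prepare_host set to true the flag is omitted, so cephadm runs its host preparation step during bootstrap.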

    - cifmw_cephadm_repository_override | bool
  become: true
  ansible.builtin.dnf:
    name: centos-release-ceph-{{ cifmw_cephadm_version }}
Contributor Author

DLRN embeds the Centos-Storage-Sig repos, so installing this package simply overrides the Ceph-related sections and bumps the version to a newer release.
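
As a rough sketch, the complete task around the fragment above might look like the following. The task name, the explicit state, and the example value squid for cifmw_cephadm_version are assumptions for illustration; only the when condition, become, and the package name pattern come from the diff.

```yaml
# Sketch of the override task; the task name is hypothetical.
- name: Override the Ceph repository with the requested release
  when:
    - cifmw_cephadm_repository_override | bool
  become: true
  ansible.builtin.dnf:
    # Assumption: with cifmw_cephadm_version set to "squid" this resolves to
    # centos-release-ceph-squid.
    name: "centos-release-ceph-{{ cifmw_cephadm_version }}"
    state: present  # assumption: made explicit here, it is the module default
```

When cifmw_cephadm_repository_override is false the task is skipped and the repositories embedded by DLRN stay untouched.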

@@ -343,6 +343,15 @@
dashboard_enabled: true
cephfs_enabled: true
ceph_nfs_enabled: true
# Override the Ceph container image and deploy Squid
cifmw_cephadm_container_ns: "quay.io/ceph"
Contributor Author

@fultonj @tosky I was thinking of switching this multinode-hci-edpm to Ceph Squid, so that we have a way to do early testing upstream.

  1. Does it make sense to do this here? It would help to perform a deployment sanity check, like we are going to do in devstack (although in that case I'm not sure we're going to backport Squid to Antelope).
  2. If you're positive about Squid, do we want to add a stable tag here (e.g. v19.2) to avoid testing the latest build?

I can simply remove this part here and only target something downstream. I wanted to get your opinion and feedback first because I'm potentially going to do the same thing in install_yamls [1] and target these new Ceph versions for tempest jobs with Glance, Cinder and Manila.

[1] openstack-k8s-operators/install_yamls#952
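
If a stable tag is preferred, the job-level override could be sketched as below. Only cifmw_cephadm_container_ns and its value appear in the diff; the image name ceph is an assumption, and v19.2 is simply the example tag from the question above, not a settled choice.

```yaml
# Job-level variables (sketch): pin the Ceph container instead of tracking the latest build.
cifmw_cephadm_container_ns: "quay.io/ceph"   # from the diff
cifmw_cephadm_container_image: "ceph"        # assumption: image name
cifmw_cephadm_container_tag: "v19.2"         # example tag from the discussion
```

With these values the bootstrap template from the first hunk would render --image quay.io/ceph/ceph:v19.2, provided cifmw_cephadm_default_container is false.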

@fmount fmount changed the title DNM - Test - Bump to Squid Bump Ceph to Squid Dec 2, 2024
@fmount fmount changed the title Bump Ceph to Squid Override Ceph version Dec 2, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/a7e7229850874053b83bcd2f133c10c7

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 56m 02s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 25m 34s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 39m 26s
podified-multinode-hci-deployment-crc RETRY_LIMIT in 6m 02s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 14s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 39s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 32s
✔️ build-push-container-cifmw-client SUCCESS in 21m 08s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 8m 04s

@fmount
Contributor Author

fmount commented Dec 2, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4fa7d899abfb4875a33621fcd72cd591

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 20m 35s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 23m 00s
cifmw-crc-podified-edpm-baremetal FAILURE in 47m 22s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 40m 40s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 57s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 22s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 08s
✔️ build-push-container-cifmw-client SUCCESS in 21m 34s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 5m 45s

@fultonj
Contributor

fultonj commented Dec 2, 2024

The failure in RDO is not related to this patch.

can't read /home/zuul/ci-framework-data/artifacts/manifests/openstack/nncp/cr/*_nncp.yaml: No such file or directory

https://softwarefactory-project.io/zuul/t/rdoproject.org/build/c4f534a1c131475082e25256078ef1d6

RDO is green otherwise, in particular podified-multinode-hci-deployment-crc which uses ceph is green.

Thus, I'm +2 to merge this.

Though the unrelated NNCP issue seems to be blocking us.

@fmount fmount marked this pull request as ready for review December 2, 2024 15:59
@fmount
Contributor Author

fmount commented Dec 2, 2024

recheck

@fmount
Contributor Author

fmount commented Dec 2, 2024

We should set up a testproject to make sure we don't hit any side effects. We shouldn't, but we can take some time to do some extra testing.

@fmount
Contributor Author

fmount commented Dec 4, 2024

@fultonj internal testing looks good, and we can land this patch if you're OK with it.

@@ -343,6 +343,15 @@
dashboard_enabled: true
cephfs_enabled: true
ceph_nfs_enabled: true
# Override the Ceph container image and deploy Squid
Collaborator

Do we really need the values here that match the defaults? I'm not really against it, but it means we need to touch a few places to bump the version.

Contributor Author

Hmm, I wanted a place where we can have a well-defined list of parameters that allow us to test the next version.
For this reason I can revert the defaults under defaults/main.yaml to match Reef, and leave here, at job level, the overrides to install Squid.
How does that sound?

Contributor Author

Done in the meantime, so you can check what I had in mind.
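
The layering described here could look roughly like this. The default values and the job name are hypothetical; only the two variable names come from the patch.

```yaml
# defaults/main.yaml (sketch): role defaults stay on the current release.
cifmw_cephadm_repository_override: false
cifmw_cephadm_version: "reef"          # assumption: default release name
---
# Zuul job definition (sketch): the job opts in to the next release.
- job:
    name: example-hci-ceph-next        # hypothetical job name
    vars:
      cifmw_cephadm_repository_override: true
      cifmw_cephadm_version: "squid"
```

Keeping the overrides at job level means a later version bump only touches the job definition, while jobs that do not set these variables keep the role defaults.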

We have historically supported the N+1 server version with external Ceph, and
this policy still applies. For this reason we need a way to do early testing in CI.
This patch adds the ability to override the CentOS Ceph repository and
install the target cephadm release passed as input.

Signed-off-by: Francesco Pantano <[email protected]>
Contributor

@fultonj fultonj left a comment

/approve
/lgtm

Contributor

openshift-ci bot commented Dec 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fultonj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Dec 4, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4f10fa0c37784564978af0478a85cd5e

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 11m 45s
podified-multinode-edpm-deployment-crc RETRY_LIMIT in 10m 17s
cifmw-crc-podified-edpm-baremetal RETRY_LIMIT in 26m 31s
podified-multinode-hci-deployment-crc POST_FAILURE in 1h 43m 25s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 19s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 22s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 20s
✔️ build-push-container-cifmw-client SUCCESS in 37m 58s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 4m 36s

@fmount
Contributor Author

fmount commented Dec 4, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0941d1757fa148afbe2856c86dcb075f

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 18m 26s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 19m 46s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 36m 05s
podified-multinode-hci-deployment-crc RETRY_LIMIT in 21m 33s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 04s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 25s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 12s
✔️ build-push-container-cifmw-client SUCCESS in 36m 35s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 4m 21s

@fmount
Contributor Author

fmount commented Dec 4, 2024

recheck

@openshift-merge-bot openshift-merge-bot bot merged commit 49034ad into openstack-k8s-operators:main Dec 4, 2024
4 checks passed
@fmount
Contributor Author

fmount commented Jan 9, 2025

/cherry-pick 18.0-fr1

@openshift-cherrypick-robot

@fmount: new pull request created: #2641

In response to this:

/cherry-pick 18.0-fr1

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
