OCPEDGE-2084: Add PacemakerStatus CRD for two-node fencing #2544

jaypoulz · 2025-10-21T18:19:04Z

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Not gated because it's only used by CEO when two-node has transitioned.

Works in conjunction with openshift/cluster-etcd-operator#1487

openshift-ci-robot · 2025-10-21T18:19:09Z

@jaypoulz: This pull request references OCPEDGE-2084 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Gated by DualReplica feature and managed by two-node-fencing component.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci · 2025-10-21T18:19:13Z

Hello @jaypoulz! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

saschagrunert · 2025-10-22T14:27:51Z

@jaypoulz thank you for the PR, do you mind making the CI happy?

jaypoulz · 2025-10-22T14:34:46Z

Hi @saschagrunert :) Working on it! :D
New to this repo so working through beginner challenges 😸

jaypoulz · 2025-10-22T14:42:21Z

A few open questions I have:

This is a config object of a sort. It's created by cluster-etcd-operator only when you have a two-node cluster, and only for the purpose of gathering information about the health of Pacemaker (our HA tool) from the nodes. I put it in etcd/tnf (two node fencing) because it seemed sensible. But I'm not sure if it needs to be in config.

That said, it doesn't work like a normal config - there's no spec and it shouldn't be created during bootstrap. The CRD just needs to be present when the CEO runs a cron job to post an update to it.

bash hack/update-protobuf.sh failed for me because it wanted the path to be installed in my Go path. That said, Cursor happily runs it and copies over the files without issue. I'm just skeptical of the zz_generated files, but I assume those are verified by CI?
For the non-boolean enum fields, should I be creating static string definitions that can be exported to CEO? How do I generate those?

saschagrunert · 2025-10-23T12:10:01Z

Yeah, I'll ignore the CI failures for now, running ./hack/update-codegen.sh locally also gives me a diff in openapi/generated_openapi/zz_generated.openapi.go. 🙃

> A few open questions I have:
>
> This is a config object of a sort. It's created by cluster-etcd-operator only when you have a two-node cluster and only for the purposes of gathering information about the health of pacemaker (our ha tool) from the nodes. I put it in etcd/tnf (two node fencing) because it seemed sensible. But I'm not sure if it needs to be in config.

I'm new to API review, but my gut feeling tells me that a dedicated etcd API group sounds fine for that purpose.

> That said, it doesn't work like a normal config - there's no spec and it shouldn't be created during bootstrap. The CRD just needs to be present when the CEO runs a cron job to post an update to it.
>
> bash hack/update-protobuf.sh failed for me because it wanted the path to be installed in my go path. That said, cursor happily runs it and copies over the files without issue. I'm just skeptical of the zz_generated files, but I assume those are verified by CI?

You can also try running it in a container with `make verify-with-container`.

> For the non-boolean enum fields, should I be creating static string definitions that can be exported to CEO? How do I generate those?

Do you mind elaborating on that? Do you mean generating the code for the unions?

API docs ref: https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#writing-a-union-in-go

@jaypoulz is there an OpenShift enhancement available for this change?

etcd/install.go

etcd/tnf/v1alpha1/tests/pacemakerstatuses.tnf.etcd.openshift.io/DualReplica.yaml

etcd/tnf/v1alpha1/types_pacemakerstatus.go

saschagrunert · 2025-10-28T08:52:32Z

/retest

etcd/v1alpha1/types_pacemakercluster.go

saschagrunert

LGTM from an API Shadow review perspective.

openshift-ci · 2025-10-29T07:11:08Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: saschagrunert
Once this PR has been reviewed and has the lgtm label, please assign joelspeed for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

saschagrunert · 2025-10-29T07:27:31Z

/retest

etcd/v1alpha1/types_pacemakercluster.go

JoelSpeed · 2025-10-29T10:15:29Z

Since @saschagrunert has said this is good from his side, I'll now take over the API review. Since it's shift week, I'm not expecting to pick this up until Monday

jaypoulz · 2025-10-29T13:08:18Z

Sounds good to me! :)

jaypoulz · 2025-11-04T19:34:27Z

/retest-required

jaypoulz · 2025-11-04T19:55:55Z

/retest-required

Introduces etcd.openshift.io/v1alpha1 API group with a PacemakerCluster custom resource. This provides visibility into Pacemaker cluster health for Two Node Fencing (TNF) etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. This API is not explicitly gated because it's only created by CEO once the transition to an ExternalEtcd has occurred. This means that it is naturally gated by the TNF topology.

openshift-ci · 2025-11-05T01:49:27Z

@jaypoulz: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-scos-e2e-aws-ovn | `1d41200` | link | false | `/test okd-scos-e2e-aws-ovn` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

jaypoulz · 2025-11-05T22:05:59Z

/retest-required

JoelSpeed · 2025-11-05T12:42:52Z

etcd/v1alpha1/tests/pacemakerclusters.etcd.openshift.io/AAA_ungated.yaml

+        kind: PacemakerCluster
+        metadata:
+          name: cluster
+        spec: {}


Generally we try to avoid allowing an empty spec to be valid. What would this object achieve if it has no spec?

It only exists to reflect status. It's not configuration, nor is it trying to modify behavior or configure the cluster in any way. This is one of the reasons I wasn't sure if this really belonged in API.

JoelSpeed · 2025-11-05T12:44:06Z

etcd/v1alpha1/doc.go

+// +k8s:openapi-gen=true
+// +openshift:featuregated-schema-gen=true
+
+// +kubebuilder:validation:Optional


Please don't do this (I know this exists on other APIs but it's not right)

This changes the default behaviour for optionality of a field and has bitten many people where they thought they were making fields required and weren't
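
As a minimal sketch of the alternative, each field can carry its own `+required` or `+optional` marker so that no package-level default is involved. `ExampleNodeStatus` and its fields are hypothetical and are not types from this PR; the markers mirror the ones already used in the diff snippets in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExampleNodeStatus is a hypothetical type used only for illustration;
// it is not one of the types in this PR.
type ExampleNodeStatus struct {
	// name is the node name. It must be between 1 and 256 characters long.
	// +kubebuilder:validation:MinLength=1
	// +kubebuilder:validation:MaxLength=256
	// +required
	Name string `json:"name,omitempty"`

	// ipAddress is the address last reported for the node.
	// When omitted, no address has been reported yet.
	// +kubebuilder:validation:MinLength=1
	// +optional
	IPAddress string `json:"ipAddress,omitempty"`
}

// Encode renders the status as JSON. Unset fields are dropped by omitempty.
func Encode(s ExampleNodeStatus) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

func main() {
	out, err := Encode(ExampleNodeStatus{Name: "master-0"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

With the marker on every field, a reader of the type does not need to know what the package default is to tell which fields the schema requires.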

JoelSpeed · 2025-11-05T12:45:00Z

etcd/v1alpha1/Makefile

+.PHONY: verify-with-container
+verify-with-container:
+	$(MAKE) -f ../../Makefile $@
+
+.PHONY: update-with-container
+update-with-container:
+	$(MAKE) -f ../../Makefile $@


Does this actually work? I wouldn't expect it to since we don't generally maintain the update-with-container targets, and some of them require context of the entire API

Also, we would usually have a test target at this level of makefile, can you please include that

JoelSpeed · 2025-11-05T12:47:42Z

etcd/v1alpha1/README.md

+
+### Feature Gate
+
+- **Feature Gate**: None - this CRD is gated by cluster-etcd-operator start-up. It will only be created once a TNF cluster has transitioned to external etcd.


All APIs must start behind a feature gate, even in v1alpha1

I can throw it behind the DualReplica feature gate because it's already blocked by that gate. I don't think it needs its own gate since TNF and this monitor are tightly coupled. This makes more sense now that it'll be TP in 4.21.

JoelSpeed · 2025-11-05T12:50:35Z

etcd/v1alpha1/README.md

+
+The API follows a "Design Principle: Act on Deterministic Information" approach:
+- Almost all fields are optional except `lastUpdated`
+- Missing data means "unknown" not "error"


We generally prefer to populate all data with explicit unknown rather than have it omitted

Does this mean we should default to required for fields wherever possible?
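
One way to read "explicit unknown" is an extra enum value that the writer sets when parsing fails, instead of leaving the field out. The sketch below assumes a `NodeModeUnknown` value and a `ParseNodeMode` helper, neither of which is in the PR, and it assumes the raw input strings are `active` and `standby`; the real Pacemaker output may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeModeType mirrors the enum name from the PR.
type NodeModeType string

const (
	NodeModeActive  NodeModeType = "Active"
	NodeModeStandby NodeModeType = "Standby"
	// NodeModeUnknown is an assumed extra value, not part of the PR.
	NodeModeUnknown NodeModeType = "Unknown"
)

// ParseNodeMode is a hypothetical helper. It maps a raw string to a
// NodeModeType and falls back to NodeModeUnknown instead of an empty value.
func ParseNodeMode(raw string) NodeModeType {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "active":
		return NodeModeActive
	case "standby":
		return NodeModeStandby
	default:
		return NodeModeUnknown
	}
}

func main() {
	for _, raw := range []string{"active", " Standby ", "garbled"} {
		fmt.Printf("%q -> %s\n", raw, ParseNodeMode(raw))
	}
}
```

Typed string constants like these are also what a consumer such as CEO would import, which is one possible answer to the earlier question about exported static string definitions.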

JoelSpeed · 2025-11-06T14:11:19Z

etcd/v1alpha1/types_pacemakercluster.go

+	// mode indicates if the node is in active or standby mode
+	// NodeModeType can be one of the following values:
+	// - Active - the node is in active mode
+	// - Standby - the node is in standby mode
+	// When present, it must be a valid NodeModeType.
+	// When not present, the node mode is unknown. This likely indicates that there is an error parsing the raw XML output.
+	// +optional
+	Mode NodeModeType `json:"mode,omitempty"`


Also better as a condition
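
A rough sketch of what "as a condition" could look like follows. `Condition` and `ConditionStatus` here are simplified local stand-ins for `metav1.Condition` so the example has no external dependencies, and the `Standby` condition type and the reason strings are invented for illustration; the review does not specify them.

```go
package main

import "fmt"

// NodeModeType mirrors the enum name from the PR.
type NodeModeType string

const (
	NodeModeActive  NodeModeType = "Active"
	NodeModeStandby NodeModeType = "Standby"
)

// ConditionStatus is a local stand-in for metav1.ConditionStatus.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

// Condition is a simplified local stand-in for metav1.Condition.
type Condition struct {
	Type    string
	Status  ConditionStatus
	Reason  string
	Message string
}

// StandbyCondition is a hypothetical helper that expresses the node mode
// as a condition. An unrecognized mode becomes an explicit Unknown status.
func StandbyCondition(mode NodeModeType) Condition {
	c := Condition{Type: "Standby"}
	switch mode {
	case NodeModeStandby:
		c.Status, c.Reason, c.Message = ConditionTrue, "NodeInStandby", "the node is in standby mode"
	case NodeModeActive:
		c.Status, c.Reason, c.Message = ConditionFalse, "NodeActive", "the node is in active mode"
	default:
		c.Status, c.Reason, c.Message = ConditionUnknown, "ModeNotReported", "the node mode could not be determined"
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", StandbyCondition(NodeModeStandby))
}
```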

JoelSpeed · 2025-11-06T14:11:55Z

etcd/v1alpha1/types_pacemakercluster.go

+	// - Started - the resource is started
+	// - Stopped - the resource is stopped
+	// We don't use promoted and unpromoted, so resources in those roles would omit the role field.
+	// When present, it must be a valid ResourceRoleType.


What is a valid ResourceRoleType?

JoelSpeed · 2025-11-06T14:12:23Z

etcd/v1alpha1/types_pacemakercluster.go

+	// node is the node where the resource is running
+	// When present, it must be a valid string between 1 and 256 characters long.
+	// When not present, the resource is not assigned to a node. This typically indicates a stopped or unscheduled resource. It could also imply an error parsing the raw XML output.
+	// +kubebuilder:validation:MinLength=1
+	// +kubebuilder:validation:MaxLength=256
+	// +optional
+	Node string `json:"node,omitempty"`


Why not put the resources under the node status so it's clear which node they are running on?
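
To make the suggestion concrete, here is a small sketch that regroups a flat resource list under per-node entries. All type and function names are hypothetical, and resources with no node are simply skipped here; a real API would need to decide where unassigned resources are reported.

```go
package main

import "fmt"

// FlatResource is a hypothetical flat entry carrying its own node name,
// similar in spirit to the shape under review.
type FlatResource struct {
	Name string
	Node string
}

// ResourceStatus is a hypothetical per-resource entry nested under a node.
type ResourceStatus struct {
	Name string `json:"name"`
}

// NodeStatus is a hypothetical per-node entry that owns its resources.
type NodeStatus struct {
	Name      string           `json:"name"`
	Resources []ResourceStatus `json:"resources,omitempty"`
}

// GroupByNode nests resources under the node they run on, keeping nodes
// in first-seen order. Resources without a node are skipped in this sketch.
func GroupByNode(flat []FlatResource) []NodeStatus {
	index := map[string]int{}
	var nodes []NodeStatus
	for _, r := range flat {
		if r.Node == "" {
			continue
		}
		i, ok := index[r.Node]
		if !ok {
			i = len(nodes)
			index[r.Node] = i
			nodes = append(nodes, NodeStatus{Name: r.Node})
		}
		nodes[i].Resources = append(nodes[i].Resources, ResourceStatus{Name: r.Name})
	}
	return nodes
}

func main() {
	nodes := GroupByNode([]FlatResource{
		{Name: "etcd", Node: "master-0"},
		{Name: "kubelet", Node: "master-0"},
		{Name: "etcd", Node: "master-1"},
	})
	fmt.Printf("%+v\n", nodes)
}
```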

JoelSpeed · 2025-11-06T14:13:23Z

etcd/v1alpha1/types_pacemakercluster.go

+// PacemakerNodeHistoryEntry represents a single operation history entry from node_history
+type PacemakerNodeHistoryEntry struct {
+	// node is the node where the operation occurred
+	// It must be a valid string between 1 and 256 characters long and cannot be empty.
+	// +kubebuilder:validation:MinLength=1
+	// +kubebuilder:validation:MaxLength=256
+	// +required
+	Node string `json:"node,omitempty"`
+
+	// resource is the resource that was operated on
+	// It must be a valid string between 1 and 256 characters long and cannot be empty.
+	// +kubebuilder:validation:MinLength=1
+	// +kubebuilder:validation:MaxLength=256
+	// +required
+	Resource string `json:"resource,omitempty"`
+
+	// operation is the operation that was performed (e.g., "monitor", "start", "stop")
+	// Unlike other fields, this is not an enum because while "monitor", "start" and "stop"
+	// are the most common, resource agents can define their own operations.
+	// It must be a valid string between 1 and 32 characters long and cannot be empty.
+	// +kubebuilder:validation:MinLength=1
+	// +kubebuilder:validation:MaxLength=32
+	// +required
+	Operation string `json:"operation,omitempty"`
+
+	// rc is the return code from the operation
+	// When present, it must be a valid integer between 0 and 2147483647 (max 32-bit int) inclusive.
+	// When not present, the return code is unknown. This likely indicates that there is an error parsing the raw XML output.
+	// +kubebuilder:validation:Minimum=0
+	// +kubebuilder:validation:Maximum=2147483647
+	// +optional
+	RC *int32 `json:"rc,omitempty"`
+
+	// rcText is the human-readable return code text (e.g., "ok", "error", "not running")
+	// When present, it must be a valid string between 1 and 32 characters long. This is a human-readable string and is not validated against any specific format.
+	// When not present, the return code text is unknown. This likely indicates that there is an error parsing the raw XML output.
+	// +kubebuilder:validation:MinLength=1
+	// +kubebuilder:validation:MaxLength=32
+	// +optional
+	RCText string `json:"rcText,omitempty"`
+
+	// lastRCChange is the timestamp when the RC last changed
+	// It must be a valid timestamp in RFC3339 format and cannot be empty.
+	// +kubebuilder:validation:Format=RFC3339
+	// +required
+	LastRCChange metav1.Time `json:"lastRCChange,omitempty"`
+}


This feels like it would be better represented as an emitted corev1.Event

That's what we're using this for :)
One thing that might not be clear - why do we even need this? Can't CEO collect the status updates, produce the events, update its conditions, etc. without introducing a CRD for this?

It could for sure. But there are 2 reasons for this:

We need some of this status to persist (e.g. node IPs for node replacement events)

The source of pacemakercluster status updates could be external to the cluster entirely. (In a future update, we'd like to do pacemaker alert-agent based reporting.) We could set up some kind of service account on the nodes to create multiple internal types - events, pacemakercluster, etc. - but it felt cleaner to have pacemaker just give CEO its relevant updates as a single status and have the operator decide if any of them were noteworthy enough to become events.
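
The "operator decides what is noteworthy" step described above might look roughly like the sketch below. `HistoryEntry` is a trimmed, hypothetical version of the history entry in the diff, `Event` is a local stand-in for the few `corev1.Event` fields that matter here, and treating every non-zero return code as noteworthy is an assumption rather than the operator's actual rule.

```go
package main

import "fmt"

// HistoryEntry is a trimmed, hypothetical version of the history entry type.
type HistoryEntry struct {
	Node      string
	Resource  string
	Operation string
	RC        int32
	RCText    string
}

// Event is a local stand-in for the Type, Reason, and Message fields of corev1.Event.
type Event struct {
	Type    string
	Reason  string
	Message string
}

// EventFor is a hypothetical filter: it returns an event only for entries
// with a non-zero return code and reports whether one was produced.
func EventFor(e HistoryEntry) (Event, bool) {
	if e.RC == 0 {
		return Event{}, false
	}
	return Event{
		Type:   "Warning",
		Reason: "PacemakerOperationFailed", // invented reason string
		Message: fmt.Sprintf("%s of %s on %s returned %d (%s)",
			e.Operation, e.Resource, e.Node, e.RC, e.RCText),
	}, true
}

func main() {
	entry := HistoryEntry{Node: "master-0", Resource: "etcd", Operation: "start", RC: 1, RCText: "error"}
	if ev, ok := EventFor(entry); ok {
		fmt.Println(ev.Type, ev.Reason, ev.Message)
	}
}
```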

JoelSpeed · 2025-11-06T14:14:30Z

etcd/v1alpha1/types_pacemakercluster.go

+// PacemakerFencingEvent represents a single fencing event from fence history
+type PacemakerFencingEvent struct {
+	// target is the node that was fenced
+	// It must be a valid string between 1 and 256 characters long and cannot be empty.
+	// +kubebuilder:validation:MinLength=1
+	// +kubebuilder:validation:MaxLength=256
+	// +required
+	Target string `json:"target,omitempty"`
+
+	// action is the fencing action performed
+	// FencingActionType can be one of the following values:
+	// - reboot - the node was rebooted
+	// - off - the node was turned off
+	// - on - the node was turned on
+	// When present, it must be a valid FencingActionType.
+	// When not present, the fencing action is unknown. This likely indicates that there is an error parsing the raw XML output.
+	// +optional
+	Action FencingActionType `json:"action,omitempty"`
+
+	// status is the status of the fencing operation
+	// FencingStatusType can be one of the following values:
+	// - success - the fencing event was successful
+	// - failed - the fencing event failed
+	// - pending - the fencing event is pending
+	// When present, it must be a valid FencingStatusType.
+	// When not present, the fencing status is unknown. This likely indicates that there is an error parsing the raw XML output.
+	// +optional
+	Status FencingStatusType `json:"status,omitempty"`
+
+	// completed is the timestamp when the fencing event was completed
+	// It must be a valid timestamp in RFC3339 format and cannot be empty.
+	// +kubebuilder:validation:Format=RFC3339
+	// +required
+	Completed metav1.Time `json:"completed,omitempty"`
+}


Again, why not use corev1.Event to represent these events in time?

#2544 (comment)

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 21, 2025

openshift-ci bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Oct 21, 2025

openshift-ci bot requested review from JoelSpeed and everettraven October 21, 2025 18:19

openshift-ci bot added the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Oct 21, 2025

jaypoulz force-pushed the OCPEDGE-2084 branch from 58218ce to 96e327f Compare October 21, 2025 18:23

openshift-ci bot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Oct 21, 2025

jaypoulz force-pushed the OCPEDGE-2084 branch 4 times, most recently from 2ba442d to 29b9fec Compare October 21, 2025 23:56

jaypoulz force-pushed the OCPEDGE-2084 branch from 29b9fec to 26f7821 Compare October 22, 2025 14:29

jaypoulz force-pushed the OCPEDGE-2084 branch 2 times, most recently from b0ff230 to 1b57b09 Compare October 22, 2025 16:59

openshift-ci bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 22, 2025

jaypoulz force-pushed the OCPEDGE-2084 branch 4 times, most recently from b9b727f to fdd53e9 Compare October 22, 2025 20:37

saschagrunert reviewed Oct 23, 2025

etcd/install.go (outdated, resolved)

saschagrunert reviewed Oct 23, 2025

etcd/tnf/v1alpha1/tests/pacemakerstatuses.tnf.etcd.openshift.io/DualReplica.yaml (outdated, resolved)

saschagrunert reviewed Oct 23, 2025

etcd/tnf/v1alpha1/types_pacemakerstatus.go (outdated, resolved)

saschagrunert reviewed Oct 28, 2025

etcd/v1alpha1/types_pacemakercluster.go (outdated, resolved)

etcd/v1alpha1/types_pacemakercluster.go (resolved)

saschagrunert reviewed Oct 28, 2025

etcd/v1alpha1/types_pacemakercluster.go (outdated, resolved)

jaypoulz force-pushed the OCPEDGE-2084 branch 3 times, most recently from 3e02535 to e6b5c99 Compare October 28, 2025 17:20

clumens reviewed Oct 28, 2025

jaypoulz force-pushed the OCPEDGE-2084 branch 3 times, most recently from d29f516 to cf53006 Compare October 28, 2025 23:11

saschagrunert approved these changes Oct 29, 2025

saschagrunert reviewed Oct 29, 2025

etcd/v1alpha1/types_pacemakercluster.go (resolved)

saschagrunert mentioned this pull request Oct 29, 2025

claude: take latest OpenShift and Kubernetes API conventions into account #2548

Open

jaypoulz force-pushed the OCPEDGE-2084 branch 2 times, most recently from 8513003 to 8c8680a Compare October 29, 2025 18:33

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 3, 2025

jaypoulz force-pushed the OCPEDGE-2084 branch from c3cea74 to 5b40d00 Compare November 4, 2025 17:13

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 4, 2025

jaypoulz force-pushed the OCPEDGE-2084 branch from b7b94f8 to 5b40d00 Compare November 4, 2025 18:30

jaypoulz force-pushed the OCPEDGE-2084 branch from 5b40d00 to 1d41200 Compare November 4, 2025 22:57

JoelSpeed reviewed Nov 6, 2025
