[DPE-3684] Implement DA139 #663
base: dpe-3684-reinitialise-raft
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@                      Coverage Diff                       @@
##           dpe-3684-reinitialise-raft    #663      +/-   ##
==============================================================
- Coverage                       72.15%   71.78%   -0.38%
==============================================================
  Files                              15       15
  Lines                            3426     3466      +40
  Branches                          528      536       +8
==============================================================
+ Hits                             2472     2488      +16
- Misses                            827      846      +19
- Partials                          127      132       +5
self.framework.observe(
    self.charm.on.promote_to_primary_action, self._on_promote_to_primary
)
Moved to the main charm code, since it's no longer used only for async promotion.
try:
    health_status = self.get_patroni_health()
except Exception:
    logger.warning("Remove raft member: Unable to get health status")
    health_status = {}
if health_status.get("role") in ("leader", "master") or health_status.get(
    "sync_standby"
):
    logger.info(f"{self.charm.unit.name} is raft candidate")
    data_flags["raft_candidate"] = "True"
Wait for the action to start reinit
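For illustration, the candidate check in the snippet above can be pulled out into a pure function. This is a minimal sketch, assuming `health_status` is the dict returned by `get_patroni_health()` (or `{}` when the call fails, as in the diff). The name `is_raft_candidate` is hypothetical and does not appear in the charm.

```python
def is_raft_candidate(health_status: dict) -> bool:
    """Return True if the unit should be flagged as a raft candidate.

    Mirrors the condition in the diff: the reported role is "leader" or
    "master", or the unit is marked as a synchronous standby.
    """
    role_is_primary = health_status.get("role") in ("leader", "master")
    return bool(role_is_primary or health_status.get("sync_standby"))


# An empty dict (health check failed) never yields a candidate.
print(is_raft_candidate({}))
print(is_raft_candidate({"role": "replica", "sync_standby": True}))
```

Keeping the check free of side effects would make it easy to unit-test separately from the flag update on `data_flags`.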
@@ -746,15 +747,18 @@ def stop_patroni(self) -> bool:
             logger.exception(error_message, exc_info=e)
             return False

-    def switchover(self) -> None:
+    def switchover(self, candidate: str | None = None) -> None:
Pass a candidate when promoting a specific unit.
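As a rough sketch of how an optional candidate might be forwarded: Patroni's REST API accepts a `POST /switchover` request whose JSON body names the current `leader` and, optionally, a `candidate`. The body of `switchover` is not shown in this diff, so the helper below (`build_switchover_payload`) and the member names are hypothetical and only illustrate the optional-argument pattern.

```python
from __future__ import annotations


def build_switchover_payload(leader: str, candidate: str | None = None) -> dict[str, str]:
    """Build the JSON body for a Patroni switchover request.

    When no candidate is given, the key is omitted and Patroni is left to
    pick the new leader itself.
    """
    payload = {"leader": leader}
    if candidate is not None:
        payload["candidate"] = candidate
    return payload


# Member names here are made up for the example.
print(build_switchover_payload("postgresql-0"))
print(build_switchover_payload("postgresql-0", candidate="postgresql-1"))
```

Defaulting `candidate` to `None` keeps existing callers of `switchover()` working unchanged, while the promotion action can name a specific unit.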
for unit in units:
    logger.info(f"Stopping unit {unit}")
    await stop_machine(ops_test, await get_machine_from_unit(ops_test, unit))
await sleep(15)
Sleep for the Juju leadership to drift.
# Check if Patroni self healed
assert (
    left_unit.workload_status == "active"
    and left_unit.workload_status_message == "Primary"
)
logger.warning(f"Patroni self-healed without raft reinitialisation for roles {roles}")
Sometimes, when the primary and the async replica are removed, Patroni manages to survive, so I'm adding an exception for this case. Should I nail it down further?
Implement DA139: promote-to-primary to promote units and reinitialise RAFT