Create a design-document for the controller #181

Open: wants to merge 13 commits into base: main

lllamnyp (Collaborator) commented Apr 18, 2024

Motivation

I started some "R'n'D" (scare quotes intended) on implementing scale-up, scale-down, self-healing, and so on, and quickly realized that coding the member add, member remove, and similar steps is the more trivial part of the undertaking. The difficult part is coming up with a working algorithm that correctly deduces the cluster's state and executes the necessary actions at the right time.

To better reason about the controller's algorithm now, and to better develop it going forward, I feel it is important to have good documentation of the current design and the intended next steps, so I started by documenting the current state of the code.

Results

This document contains a Mermaid flowchart that outlines the reconciliation loop. It is best viewed in rendered form.

Going forward, I envision this document serving at least three purposes:

  • Let the developers spot flaws and prompt them to open issues.
  • Act as a more detailed form of documentation for advanced users.
  • Be a blueprint for implementing anything non-trivial.

@lllamnyp lllamnyp marked this pull request as draft April 18, 2024 21:53
@github-actions github-actions bot added the documentation Improvements or additions to documentation label May 3, 2024
@lllamnyp lllamnyp marked this pull request as ready for review May 9, 2024 07:24
@lllamnyp lllamnyp changed the title from "Draft: Create a design-document for the controller" to "Create a design-document for the controller" May 9, 2024
kvaps (Member) commented May 14, 2024

Could you please move this into the design subdirectory of the website?

hiddenmarten (Member) commented May 19, 2024

I am a bit confused by this section of the flow:
[image]

I would suggest a slightly updated way to control resources in this case:
[image: sts-flow]

hiddenmarten (Member) commented May 19, 2024

Also, I didn't get the purpose of the steps:

  • "Promote any learners."
  • "Ensure StatefulSet with replicas = max member ordinal + 1"

I mean, could you please list the cases that these checks help us avoid?
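
To make the second quoted step concrete, here is a minimal sketch of the "replicas = max member ordinal + 1" rule. It is only an illustration, not the controller's code: the function name `desired_replicas` and the assumption that member names end in `-<ordinal>` (the usual StatefulSet pod naming) are hypothetical and not taken from the design document.

```python
import re


def desired_replicas(member_names):
    """Return max member ordinal + 1, or 0 if no name carries an ordinal.

    Assumption (hypothetical): member names follow the StatefulSet pod
    naming pattern "<name>-<ordinal>", e.g. "etcd-0", "etcd-2".
    """
    ordinals = []
    for name in member_names:
        match = re.search(r"-(\d+)$", name)
        if match:
            ordinals.append(int(match.group(1)))
    # A StatefulSet creates pods with ordinals 0..replicas-1, so covering
    # the highest known ordinal requires that ordinal + 1 replicas.
    return max(ordinals) + 1 if ordinals else 0
```

Under this reading, a member list of `etcd-0` and `etcd-2` yields three replicas, because a StatefulSet with fewer replicas would not run a pod with ordinal 2 at all.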

kvaps (Member) commented Jul 2, 2024

Let's move it into architecture in the docs, and we can merge it after v0.3.

lllamnyp added a commit that referenced this pull request Aug 12, 2024
This commit includes all changes for feature #181 that have not been split up into small stacked PRs. It should not be merged and will later be undone and split into smaller logical chunks of work.
kvaps pushed a commit that referenced this pull request Aug 15, 2024
…th check procedure (#252)

This is the first PR in a series of [stacked PRs](https://www.stacking.dev/), aimed ultimately at implementing the features described in #181 and #207. The next PR in the stack can be found at #259.
lllamnyp added a commit that referenced this pull request Sep 10, 2024
This commit includes all changes for feature #181 that have not been split up into small stacked PRs. It should not be merged and will later be undone and split into smaller logical chunks of work.
lllamnyp added a commit that referenced this pull request Sep 17, 2024
This commit includes all changes for feature #181 that have not been split up into small stacked PRs. It should not be merged and will later be undone and split into smaller logical chunks of work.