Implement external sharder controller #31

Merged: 11 commits into master from external-sharder-controller, Nov 16, 2023

@timebertt (Owner) commented Nov 16, 2023

This PR is part 3 of implementing the new external sharding approach after #30.

This PR tackles:

  • adding the external sharder controller: moves/drains objects whenever an existing shard becomes unavailable or a new shard becomes available, also performs periodic resyncs in case we missed some webhook requests
  • adding a namespaceSelector to ClusterRings: this is used in the webhook config but also for filtering objects in the sharder controller
  • fixing the sharder webhook to construct individual patch operations instead of calculating the entire patch: the previous patch removed all data/spec, and the new approach also makes the webhook more performant
  • fixing the hash ring cache not to rely on the state label (which is added for observability purposes only)

It also includes a few smaller improvements, like handling dry-run requests in the sharder webhook and minimizing reconciliations of the dummy shard for easier debugging.

webhosting-operator will be adapted to the external sharder implementation in the next PR.

This helps in debugging when observing individual watch events.
The previous approach was buggy and removed all data/spec via its patches.
Constructing individual patch operations is more precise and way more performant than marshalling twice for calculating the patch.
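To illustrate what constructing individual patch operations can look like, here is a minimal Go sketch that builds RFC 6902 JSON Patch operations for setting one label. It is not the code from this PR: the type `jsonPatchOp`, the helpers `escapeJSONPointer` and `shardLabelPatch`, and the label key `example.com/shard` are all hypothetical names chosen for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// jsonPatchOp is a single RFC 6902 JSON Patch operation (hypothetical type).
type jsonPatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// escapeJSONPointer escapes a key for use in a JSON Pointer (RFC 6901):
// "~" becomes "~0" and "/" becomes "~1".
func escapeJSONPointer(key string) string {
	return strings.NewReplacer("~", "~0", "/", "~1").Replace(key)
}

// shardLabelPatch returns only the operations needed to set one label.
// If the object has no labels yet, the labels map is created first,
// because "add" fails when the parent of the target path does not exist.
func shardLabelPatch(existingLabels map[string]string, key, value string) []jsonPatchOp {
	var ops []jsonPatchOp
	if existingLabels == nil {
		ops = append(ops, jsonPatchOp{
			Op:    "add",
			Path:  "/metadata/labels",
			Value: map[string]string{},
		})
	}
	ops = append(ops, jsonPatchOp{
		Op:    "add",
		Path:  "/metadata/labels/" + escapeJSONPointer(key),
		Value: value,
	})
	return ops
}

func main() {
	// The label key and shard name are assumptions made for this example.
	ops := shardLabelPatch(map[string]string{"app": "demo"}, "example.com/shard", "shard-a")
	raw, err := json.Marshal(ops)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```

Because the patch only names the paths it touches, it cannot accidentally drop unrelated fields such as data or spec, and the object does not need to be marshalled twice to compute a diff.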
We don't want to rely on the state label being maintained correctly in the sharder webhook or controller.
Determining the state directly based on the local clock is favorable as it makes it easier to trigger in predicates when the controller will actually act on changed lease data.
This increases the rate of ring calculations again, but is more consistent and correct.
This was a premature optimization. We can still optimize ring calculations again when it becomes a real problem.
@timebertt force-pushed the external-sharder-controller branch from f1f12c8 to b4a8fb9 on November 16, 2023 11:22
@timebertt merged commit 11343cb into master on Nov 16, 2023
@timebertt deleted the external-sharder-controller branch on November 16, 2023 11:26