Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT + WIP [DPE-5661] patch stateful set to ensure pod isn't removed prematurely #348

Open
wants to merge 9 commits into
base: 6/edge
Choose a base branch
from

Conversation

MiaAltieri
Copy link
Contributor

@MiaAltieri MiaAltieri commented Oct 14, 2024

Issue

  1. juju only gives us 30 seconds to clean up a unit being removed

Solution

  1. patch stateful set to have a very long termination period

@MiaAltieri MiaAltieri force-pushed the DPE-5661-safe-drainage branch from 2a9b2bf to 2a9415f Compare October 14, 2024 15:00
@MiaAltieri MiaAltieri force-pushed the DPE-5661-safe-drainage branch from 2a9415f to 9c88b59 Compare October 15, 2024 06:55
@MiaAltieri MiaAltieri force-pushed the DPE-5661-safe-drainage branch from 0af8c24 to fc992f0 Compare October 15, 2024 14:16
@MiaAltieri MiaAltieri changed the title DRAFT + WIP use mongodb-k8s removal strategy + move to storage detached DRAFT + WIP [DPE-5665][DPE-5661] patch stateful set to ensure pod isn't removed prematurely + handle scale down before storage is removed Oct 15, 2024
Comment on lines +793 to +796
# updating the termination grace period is only useful for shards, whose sudden removal
# can result in data-loss
if not self.is_role(Config.Role.SHARD):
return
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im unsure about this bit

Gu1nness
Gu1nness previously approved these changes Oct 15, 2024
Copy link
Contributor

@Gu1nness Gu1nness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it should be working.
I would try to leave it to run and play with it to see how stable it stays in the time

Comment on lines +45 to +47
from lightkube import Client
from lightkube.resources.apps_v1 import StatefulSet
from lightkube.types import PatchType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to import lightkube in pyproject.toml

src/charm.py Outdated Show resolved Hide resolved
src/charm.py Show resolved Hide resolved
"spec": {
"template": {
"spec": {"terminationGracePeriodSeconds": ONE_YEAR},
"metadata": {"annotations": {"force-update": str(int(time.time()))}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What it the force-update annotation useful for here?

patch_data = {
"spec": {
"template": {
"spec": {"terminationGracePeriodSeconds": ONE_YEAR},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't one year a bit too much?
It means that your pod can take up to 1 year to be removed.
I would keep it lower, up to a day?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose a ridiculously long amount on purpose to ensure that shard drainage would have enough time in worst case scenarios.

i.e. if there are jumbo-chunks that we cannot automatically drain (because the user should decide the new shard key) - then that will require user intervention and I would like to give them more than enough time to do so.

src/charm.py Show resolved Hide resolved
name=self.app.name,
namespace=self.model.name,
obj=patch_data,
patch_type=PatchType.MERGE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think MERGE or STRATEGIC don't change anything here so I would not specify it and keep the default value of STRATEGIC, but it's a nitpick.

Copy link
Contributor

@Gu1nness Gu1nness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it should be working.
I would try to leave it to run and play with it to see how stable it stays in the time

@MiaAltieri MiaAltieri changed the title DRAFT + WIP [DPE-5665][DPE-5661] patch stateful set to ensure pod isn't removed prematurely + handle scale down before storage is removed DRAFT + WIP [DPE-5661] patch stateful set to ensure pod isn't removed prematurely Oct 17, 2024
@MiaAltieri MiaAltieri force-pushed the DPE-5661-safe-drainage branch 2 times, most recently from 81d2589 to d51f6a2 Compare October 17, 2024 07:18
@MiaAltieri MiaAltieri force-pushed the DPE-5661-safe-drainage branch from d51f6a2 to f726e56 Compare October 17, 2024 07:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants