Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: restart minio services must happened at config changed when decommission completed #2140

Closed

Conversation

jiuker
Copy link
Contributor

@jiuker jiuker commented Jun 2, 2024

restart minio services must happened at config changed when decommisson competed
When decommisson completed, we are facing this for pools pods:

1. stat the config.env changed at 03:19:50
2. minio service restart at 03:19:47
3. others pools pods still log pool-1 that should been removed

test steps:

  1. make a dev sidecar image
cd sidecar
make docker
  1. make a dev operator image
make operator
  1. helm install operator with env OPERATOR_SIDECAR_IMAGE="your sidecar dev image"
  2. Deploy a multi pools tenant.`. And wait the tenant get be ready
  3. do a decommisson , you can ignore this step if you have a fresh install without data.
  4. Remove pool from tenant. You will see sidecar log patched pod annotations and delete the statefulset successful.

@jiuker jiuker marked this pull request as draft June 2, 2024 12:32
@jiuker jiuker changed the title restart pod must happened at config changed fix: restart pod must happened at config changed when decommisson Jun 3, 2024
@jiuker jiuker changed the title fix: restart pod must happened at config changed when decommisson fix: restart pod must happened at config changed when decommisson compeleted Jun 3, 2024
@jiuker jiuker changed the title fix: restart pod must happened at config changed when decommisson compeleted fix: restart minio services must happened at config changed when decommisson compeleted Jun 3, 2024
@jiuker jiuker marked this pull request as ready for review June 3, 2024 03:35
pkg/controller/pools.go Outdated Show resolved Hide resolved
@jiuker jiuker requested a review from shtripat June 3, 2024 07:29
@ramondeklein ramondeklein changed the title fix: restart minio services must happened at config changed when decommisson compeleted fix: restart minio services must happened at config changed when decommission completed Jun 3, 2024
Copy link
Contributor

@ramondeklein ramondeklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this PR should ensure that the pool is restarted after decomissioning, but it now waits until all pods have the same generation as the Tenant resource?

pkg/controller/pools.go Outdated Show resolved Hide resolved
sidecar/go.mod Outdated Show resolved Hide resolved
pkg/common/const.go Outdated Show resolved Hide resolved
@jiuker
Copy link
Contributor Author

jiuker commented Jun 3, 2024

If I understand correctly, this PR should ensure that the pool is restarted after decomissioning, but it now waits until all pods have the same generation as the Tenant resource?

Tenant resource will generate env config, sidecar will watch the Tenant and renew.

Yes. As you can see

1. stat the config.env changed at 03:19:50
2. minio service restart at 03:19:47
3. others pools pods still log pool-1 that should been removed

minio server restart before the env changed. So used a outdate env to restart.
Before decomission:
Volumes: pool1 pool2 pool3
After decomission:
Without the pr, Volumes could be pool1 pool2 pool3
With this pr, Volumes must be pool1 pool3 (if pool2 need been removed) @ramondeklein

@jiuker jiuker requested a review from ramondeklein June 3, 2024 14:20
@ramondeklein

This comment was marked as off-topic.

ramondeklein
ramondeklein previously approved these changes Jun 3, 2024
Copy link
Contributor

@ramondeklein ramondeklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion, but not really important.

pkg/controller/pools.go Outdated Show resolved Hide resolved
@jiuker jiuker requested a review from cniackz June 4, 2024 05:51
@pjuarezd
Copy link
Member

pjuarezd commented Jun 4, 2024

please rebase this PR @jiuker would like to see how tests do

Copy link
Contributor

@ramondeklein ramondeklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's required to build with Go v1.21.11 and not an earlier version, then I guess we should also ensure that we use at least this version to build the code? If any v1.21 version is fine, then I suggest not to fix the version.

@jiuker
Copy link
Contributor Author

jiuker commented Jun 5, 2024

If it's required to build with Go v1.21.11 and not an earlier version, then I guess we should also ensure that we use at least this version to build the code? If any v1.21 version is fine, then I suggest not to fix the version.

https://github.com/minio/operator/actions/runs/9376441443/job/25816378764 But have an govulncheck failed here @ramondeklein

@jiuker jiuker requested a review from ramondeklein June 5, 2024 10:23
@cniackz
Copy link
Contributor

cniackz commented Jul 10, 2024

Jiuker, can we split this PR in multiple atomic ones, I think having this complex task all at once is maybe not the best idea... what do you think?

@jiuker
Copy link
Contributor Author

jiuker commented Jul 11, 2024

This is a complete fix, the problem now is that pjuarezd test does not pass. But I think that should be his develop step error @cniackz . Could you test this?

@jiuker jiuker force-pushed the decomm_delete_sts_must_annotationed branch 2 times, most recently from a4eecd9 to ab06c1e Compare July 11, 2024 00:25
@cniackz cniackz marked this pull request as ready for review July 14, 2024 08:02
@cniackz cniackz requested a review from ramondeklein July 14, 2024 08:03
cniackz
cniackz previously approved these changes Jul 14, 2024
@cniackz
Copy link
Contributor

cniackz commented Jul 14, 2024

@jiuker, please read Harsha's comment on Slack and decide whether to continue this way or simplify things by restarting the service instead.

sidecar/pkg/sidecar/sidecar_utils.go Outdated Show resolved Hide resolved
sidecar/pkg/sidecar/sidecar_utils.go Outdated Show resolved Hide resolved
sidecar/pkg/sidecar/sidecar_utils.go Outdated Show resolved Hide resolved
sidecar/pkg/sidecar/sidecar_utils.go Outdated Show resolved Hide resolved
pkg/controller/pools.go Outdated Show resolved Hide resolved
@harshavardhana
Copy link
Member

@jiuker, please read Harsha's comment on Slack and decide whether to continue this way or simplify things by restarting the service instead.

Adding this flow is important because we shouldn't be restarting services if not all pods have the same tenant configuration.

sidecar/pkg/sidecar/sidecar_utils.go Outdated Show resolved Hide resolved
sidecar/pkg/sidecar/sidecar_utils.go Outdated Show resolved Hide resolved
@jiuker jiuker requested a review from harshavardhana July 16, 2024 02:17
harshavardhana
harshavardhana previously approved these changes Jul 16, 2024
@harshavardhana
Copy link
Member

PTAL @pjuarezd @cniackz

@harshavardhana
Copy link
Member

PTAL @pjuarezd @cniackz

ping again on this.

jiuker added 4 commits July 16, 2024 19:03
restart pod must happened at config changed
lint
apply suggestion
use klog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants