Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP 5067: Pod Generation KEP creation #5068

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

natasha41575
Copy link
Contributor

  • One-line PR description: Create new KEP 5067: Pod Generation
  • Other comments:

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. labels Jan 21, 2025
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jan 21, 2025
@natasha41575 natasha41575 mentioned this pull request Jan 21, 2025
4 tasks
@natasha41575
Copy link
Contributor Author

/cc @tallclair

@natasha41575 natasha41575 changed the title KEP 5067: Initial KEP creation KEP 5067: Pod Generation KEP creation Jan 21, 2025
@tallclair tallclair self-assigned this Jan 22, 2025
Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some early feedback; hope it helps

keps/sig-node/5067-pod-generation/kep.yaml Show resolved Hide resolved
keps/sig-node/5067-pod-generation/kep.yaml Outdated Show resolved Hide resolved
keps/sig-node/5067-pod-generation/kep.yaml Outdated Show resolved Hide resolved
keps/sig-node/5067-pod-generation/README.md Outdated Show resolved Hide resolved
@natasha41575 natasha41575 marked this pull request as ready for review January 22, 2025 16:27
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 22, 2025
@natasha41575 natasha41575 force-pushed the pod-generation-KEP branch 6 times, most recently from a0da338 to 2539f5b Compare January 22, 2025 22:01
@tallclair
Copy link
Member

/assign @liggitt

Mind being the API approver for this?

@natasha41575 natasha41575 force-pushed the pod-generation-KEP branch 6 times, most recently from 2baae8f to e929629 Compare February 7, 2025 21:00
keps/sig-node/5067-pod-generation/README.md Outdated Show resolved Hide resolved
The `status.observedGeneration` feature can be disabled by setting the flag to 'false'
and restarting the Kubelet.

The `metadata.generation` functionality will intentionally not be behind a feature gate so cannot be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so when it is disabled on kubelet, it will stop setting it. Will it result in infinite loop of reconciliation as it sees that the observedGeneration is set on API server, but it is not setting it. So it wants to update it to empty, but API server doesn't allow it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right... let me make sure I have this right

  1. the kubelet has the feature enabled, and observedGeneration is set on a pod
  2. the feature is disabled on the kubelet
  3. the kubelet stops setting observedGeneration, so tries to "clear" it
  4. when the API server sees the observedGeneration unset in the incoming, it preserves the existing value, not letting it be cleared
  5. Repeat 3-4 forever

I'm not sure what is the right solution is. Is the best thing for the API server to allow observedGeneration to be cleared? I can see this being a problem with older clients that don't know about the observedGeneration field (similar to the risk you raised about older webhooks). But seems like making sure the feature can be disabled on kubelet is more important?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kubelet can keep setting the observedGeneration if feature gate is disabled (it will either generate it from the pod spec if FG is on or just take whatever observedGeneration status already have and preserve it if FG is off)

@liggitt do we have a precedence here?

Copy link
Member

@liggitt liggitt Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've certainly used presence of API fields to control kubelet behavior instead of or in combination with feature gates.

I think a reasonable path in the kubelet here is to use feature enabled || pod status has non-zero observedGeneration to decide whether to propagate metadata.generation to status.observedGeneration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a reasonable path in the kubelet here is to use feature enabled || pod status has non-zero observedGeneration to decide whether to propagate metadata.generation to status.observedGeneration.

SGTM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the complexity of the logic in kubelet to calculate the "new" value of observedGeneration, we may be better off by preserving the existing value if FG is disabled instead of calculating a new one. Thank you for confirming 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the right way to set the new value of observedGeneration is to have the kubelet pass down the value of whatever metadata.generation is when it starts the sync loop all the way down to wherever it updates the pod status. From perusing through the source code, I don't anticipate this logic to be very complicated. So if the main concern is the complexity of the logic, I think we should be fine continuing to update observedGeneration after the FG is disabled rather than preserving the existing value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I folded this into the body of the KEP under "How can this feature be enabled / disabled in a live cluster?"

I'll wait for @SergeyKanzhelev's confirmation before closing this thread.

@natasha41575
Copy link
Contributor Author

/milestone v1.33

@k8s-ci-robot
Copy link
Contributor

@natasha41575: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your Milestone Maintainers Team and have them propose you as an additional delegate for this responsibility.

In response to this:

/milestone v1.33

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@natasha41575 natasha41575 force-pushed the pod-generation-KEP branch 2 times, most recently from 495d36e to 16073b2 Compare February 10, 2025 22:03
Copy link
Member

@johnbelamaric johnbelamaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with this for PRR. I think there is one update needed capturing the results of the kubelet behavior discussion.

keps/sig-node/5067-pod-generation/kep.yaml Show resolved Hide resolved
keps/sig-node/5067-pod-generation/README.md Outdated Show resolved Hide resolved
The `status.observedGeneration` feature can be disabled by setting the flag to 'false'
and restarting the Kubelet.

The `metadata.generation` functionality will intentionally not be behind a feature gate so cannot be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

@natasha41575 natasha41575 force-pushed the pod-generation-KEP branch 6 times, most recently from 190cbaf to 81ec184 Compare February 11, 2025 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Status: API review completed, 1.33
Development

Successfully merging this pull request may close these issues.

9 participants