Feature request: ability to disable pod-security labels added by tigera/operator #2872
Comments
The operator is behaving as expected: it makes sure the configuration stays the way it expects. Is that webhook running as a pod? If so, you need at least one pod running for the webhook to be functional. Without at least one node, your cluster is effectively broken, since it is configured with a webhook that is served by a pod, and that pod can't exist without a node. Does this issue happen only when a cluster that had nodes is scaled to zero nodes and then cannot scale back up? |
Sure, my cluster was broken. It happened during an update. I can scale down the webhook deployment and see the same behavior. When the webhook is up and running, new nodes can join the cluster. |
It seems to me like the correct fix would be to either:
With zero nodes and the webhook configured, you effectively have a broken cluster, because a service that you have configured, and that the cluster needs in order to function, is not available. |
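The circular dependency described in this exchange can be sketched as a small graph walk. This is only an illustration: the component names and edges below are assumptions distilled from the comments here (the webhook is served by a pod, that pod needs a node, and pods in the labelled namespace are checked by the webhook), not anything read from a real cluster.

```python
# Hypothetical dependency map distilled from this thread; names are illustrative.
DEPENDS_ON = {
    # Pod creation in the labelled namespace is checked by the webhook.
    "calico-pods": ["pod-security-webhook"],
    # The webhook is itself served by a pod, which needs a usable node.
    "pod-security-webhook": ["ready-node"],
    # Assumption: a new node is only usable once the calico pods run on it.
    "ready-node": ["calico-pods"],
}


def find_cycle(graph, start):
    """Return the dependency path that loops back to `start`, or None."""
    path = []
    seen = set()

    def visit(node):
        path.append(node)
        for dep in graph.get(node, []):
            if dep == start:
                return True
            if dep not in seen:
                seen.add(dep)
                if visit(dep):
                    return True
        path.pop()
        return False

    return path + [start] if visit(start) else None


cycle = find_cycle(DEPENDS_ON, "calico-pods")
print(" -> ".join(cycle))
```

Dropping any single edge from the map (for example, the webhook check on the calico pods) makes `find_cycle` return `None`, which is what each of the workarounds discussed here amounts to.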
Thanks. This is a workaround I am already aware of. I want to remove these labels. |
Does nobody have an idea how to permanently remove custom labels on the calico* namespaces? |
I do not think we would change the operator to support what you are asking. The only reason this is a problem is that a webhook you have configured is not functional, which effectively means your cluster is in a broken state. I would equate this to missing a necessary Kubernetes component such as kube-proxy: the operator has dependencies it expects, and if a cluster has pod-security functionality configured, the operator expects it to be functional. A few possible options:
|
I am a bit confused here; I don't think I have a clear understanding of the problem statement. But here are the relevant cases, I think:
I don't really understand where the webhook comes into play here based on the discussion above. Could you share:
|
Hi, thanks for your response. I added my own labels to the calico-system namespace, for example My intention was to "show" that the namespace is running with full privileges in terms of pod security admission. However, it turned out that my custom labels "tell" Kubernetes to check pods that are about to be created against pod security admission. During a difficult upgrade all nodes were removed and we had to create a new nodegroup. So I want to remove my custom labels e.g. Hope this answers your questions. Best... |
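As a rough illustration of the point about labels: Pod Security Admission is driven by namespace labels under the `pod-security.kubernetes.io/` prefix, with the modes `enforce`, `audit` and `warn`. The sketch below lists which modes a set of labels requests. The example label values are assumptions for illustration only; the labels actually added in this case are not shown in the thread.

```python
PSA_PREFIX = "pod-security.kubernetes.io/"
PSA_MODES = ("enforce", "audit", "warn")


def psa_modes(labels):
    """Map each Pod Security mode present in `labels` to its level."""
    return {
        mode: labels[PSA_PREFIX + mode]
        for mode in PSA_MODES
        if PSA_PREFIX + mode in labels
    }


# Hypothetical namespace labels, not the reporter's actual ones.
example = {
    "team": "platform",
    "pod-security.kubernetes.io/enforce": "privileged",
    "pod-security.kubernetes.io/warn": "privileged",
}
print(psa_modes(example))
```

Even with the `privileged` level, the labels above are still Pod Security labels, which matches the observation that adding them opts the namespace into the check rather than merely documenting it.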
Excellent questions Casey. But just to make sure I understand, I'll put what I understand in my own words. One thing to clarify, from your original steps
The controller that this message comes from does nothing with the calico-system namespace. This does not suggest that the tigera/operator put your label back on the namespace. I would not expect the tigera/operator to re-add a label that you previously added and then removed. I've quickly tested on a cluster I have and I was able to add a label to the calico-system namespace and then was able to remove it. (This was not using the exact same |
One thing to add - the tigera/operator does set some pod-security labels here: operator/pkg/render/namespaces.go Lines 99 to 100 in e48f3d3
Namely, it always sets the The interesting part that I hadn't realized is that these pod-security annotations aren't implemented natively within the apiserver for EKS, rather they use a
So if I understand correctly, it sounds like there is a deadlock situation here. The possible resolutions would be:
Removing the |
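To illustrate why deleting an operator-set pod-security label does not stick while a custom label can be removed, here is a minimal sketch of one reconcile pass. It is not the operator's code: `reconcile_labels`, the `MANAGED` values and the `manage_pod_security` switch are all hypothetical, with the switch modelling the opt-out requested in the issue title.

```python
# Assumption: placeholder for the pod-security labels the operator owns.
MANAGED = {"pod-security.kubernetes.io/enforce": "privileged"}


def reconcile_labels(current, managed, manage_pod_security=True):
    """Return the namespace labels after one reconcile pass.

    Keys in `managed` are owned by the controller and are written back on
    every pass; all other keys in `current` are user-owned and untouched.
    `manage_pod_security` is a hypothetical opt-out, enabled by default.
    """
    result = dict(current)
    if manage_pod_security:
        result.update(managed)
    return result


# The user deleted the managed label and kept a custom one.
after_delete = reconcile_labels({"team": "net"}, MANAGED)
print(after_delete)

# With the hypothetical opt-out, the deletion would persist.
print(reconcile_labels({"team": "net"}, MANAGED, manage_pod_security=False))
```

In this model a removed custom label stays removed because the controller never writes it, while a removed managed label reappears on the next pass unless the opt-out is set.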
Oh yes, we are getting closer! And yes, when I use AWS EKS 1.24, for example, pod security admission is not enabled in the cluster. AWS support mentioned that they will do this once it is out of beta, so I had to implement the pod security webhook. And yes, now you see the deadlock situation. But for good reasons I am not happy with workarounds: you have to document them, and in an emergency like the one we had, nobody will find that piece of documentation, of course. So for me a possible solution would be to make these pod-security labels optional. They can be enabled by default, of course. |
tigera is reverting custom labels on namespace calico-system
Expected Behavior
Custom labels on namespaces should be deletable and should not be reverted to their former state.
Current Behavior
Currently I need to delete a custom label on the namespace calico-system. Technically I can do this, but tigera-operator reverts my change within a few seconds.
Possible Solution
Steps to Reproduce (for bugs)
Context
Labels for the pod-security-webhook are set. They prevent the creation of new nodes or, more precisely, the setup of a new node joining the cluster. Especially when there is no node at all, I cannot start the calico pods because the webhook is unreachable.
Your Environment