Community Note
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate data points for identical requirements in a single place. Thank you!
Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Overview of the Issue
I ran into several issues when installing a new Consul cluster on Kubernetes using the Helm chart. When I configured the chart to manage ACLs for the cluster, I hit a circular dependency that prevented me from installing the cluster without modifying the manifests with kustomize.
Reproduction Steps
In order to effectively and quickly resolve the issue, please provide exact steps that allow us to reproduce the problem. If no steps are provided, it will likely take longer to get the issue resolved. An example that you can follow is provided below.
Steps to reproduce:
Install the Helm chart using the following values:

Actual behavior

This will lead to two things:

If the secret does not exist (which it doesn't on a clean install), the Consul servers will not start.
The consul-k8s-controlplane server-acl-init job will not complete, since it cannot resolve the DNS name of the headless service while no server pods are running.
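For context, the values that trigger this look roughly like the following (a sketch; the secret name here is hypothetical, the option names are the chart's global ACL settings):

```yaml
global:
  acls:
    manageSystemACLs: true
    bootstrapToken:
      # This secret does not exist yet on a clean install,
      # which is what triggers the circular dependency.
      secretName: consul-bootstrap-token
      secretKey: token
```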
I realized that the .global.acls.bootstrapToken config option is probably meant to be set only if you already have a bootstrap token. If that is the case, the docs should make it clearer.
One possible workaround I tried was creating an empty secret with the same name. This allows the Consul servers to start and the server-acl-init job to initialize the ACL system, but the job ultimately fails, since it always tries to create the secret rather than update it, contradicting the documentation.
The other possibility is to remove .global.acls.bootstrapToken, which removes the env var from the StatefulSet. This works: the job initializes the ACL system and creates the secret with the token. However, I then ran into another issue with the server-acl-init job:
The job resolves the IPs of the Consul servers from the DNS name of the headless service, which introduces a race condition, because DNS only returns the IP addresses of pods that have already started. In my testing, the job almost always received only one or two of the three IP addresses, so only those servers received a server token and the remaining servers were unable to communicate using ACLs. A fix would be for the job to know how many servers to expect (similar to -bootstrap-expect).
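The proposed fix amounts to "poll DNS until the expected number of servers is visible". A minimal Python sketch of that retry logic, with a stubbed resolver standing in for the headless-service DNS query (all names here are hypothetical, not from the consul-k8s codebase):

```python
import time

def wait_for_servers(resolve, expected, timeout=60.0, interval=0.1):
    """Poll `resolve` until it returns at least `expected` addresses.

    `resolve` is any callable returning the list of server IPs currently
    visible in DNS (the real job would query the headless service).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        addrs = resolve()
        if len(addrs) >= expected:
            return addrs
        time.sleep(interval)
    raise TimeoutError(f"only {len(addrs)} of {expected} servers resolved")

# Stub: DNS "sees" one more pod each time it is queried,
# simulating servers starting up one by one.
seen = []
def fake_resolve():
    if len(seen) < 3:
        seen.append(f"10.0.0.{len(seen) + 1}")
    return list(seen)

print(wait_for_servers(fake_resolve, expected=3))
# prints ['10.0.0.1', '10.0.0.2', '10.0.0.3']
```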
Workaround solution
I solved these issues with a kustomization patch to the acl init job:
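A rough sketch of such a patch (the job name, service DNS name, and expected count of 3 are illustrative, not the exact patch used):

```yaml
# kustomize patch sketch: add an init container to the server-acl-init job
# that blocks until DNS returns all expected server addresses.
apiVersion: batch/v1
kind: Job
metadata:
  name: consul-server-acl-init
spec:
  template:
    spec:
      initContainers:
        - name: wait-for-server-dns
          image: busybox
          command:
            - /bin/sh
            - -c
            - |
              # Wait until the headless service resolves to all 3 servers.
              until [ "$(nslookup consul-server.consul.svc.cluster.local 2>/dev/null | grep -c 'Address')" -ge 3 ]; do
                echo "waiting for all consul server addresses in DNS..."
                sleep 2
              done
```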
and the nslookup.sh script:

Environment details

If not already included, please provide the following:

consul-k8s version: 1.4.0

Additionally, please provide details regarding the Kubernetes infrastructure, as shown below:
Philipp Schöppner <[email protected]>, Mercedes-Benz Tech Innovation GmbH (Provider Information)