feat: support marking cluster domains as FQDNs, and change the default to FQDN #939
Tested and works! Will wait with approval for now until:
LGTM overall!
Changelog entry is missing.
Do we want to mention anything about DNS/ndots etc. here https://docs.stackable.tech/home/stable/kubernetes/#_configuring_the_cluster_domain (or the sublinks)?
Thanks for working on this! Actually, I've now had enough of DomainName using the RFC_1123_SUBDOMAIN_REGEX :( I would suggest adding a new FQDN_REGEX that allows a trailing dot (as foo. is a valid domain name!) and using that, WDYT? Regarding the user input, I would prefer not to fiddle with the input the user gave us, but to document and warn accordingly. Maybe our dotting strategy isn't perfect and the dot actually hurts something? Currently we would have no way of disabling it. But no strong opinion. I spiked both things in main...feat/dns-performance-fqdns-review, what are your opinions on this?
In case we go with my suggestion, we should definitely update the docs to recommend a trailing dot.
@sbernauer I like the idea of removing the logic from the Regarding the FQDN regex: We also thought about this. I don't think it's a bad idea, but we decided to do it differently to be in line with the way Kubernetes does it: https://github.com/kubernetes/kubernetes/blob/30de989fb57fb5921a7ae3e3203cf7ecac9cf3f0/staging/src/k8s.io/apimachinery/pkg/util/validation/validation.go#L100
Works for me, we can use your existing logic.
I don't understand how they can opt out, to be honest. I think currently they are unable to set it to
To summarize, we can:
Is that correct? I honestly think option 1 is my favorite, but this comes at the cost of not having the performance increase by default. WDYT? Edit: We decided on option 2, using another FQDN regex for DomainName and defaulting to "cluster.local.". Users can opt out by e.g. setting "cluster.local" explicitly.
Co-authored-by: Malte Sander <[email protected]>
Thanks! Code looks good, and we have an opt-out as well :)
Please add a changelog entry and make some changes to the https://docs.stackable.tech/home/stable/guides/kubernetes-cluster-domain/ guide.
Updates:
Nice thanks! Checked docs as well.
LGTM. Let's wait for @sbernauer though.
LGTM, thanks!
Co-authored-by: Natalie Klestrup Röijezon <[email protected]>
Code still looks good, thanks for the review @nightkr
However, I think the title should also mention that we are changing the default cluster domain, because this is probably the biggest user-facing change.
Description
Fixes stackabletech/issues#656
The current regex for domain names does not permit a trailing dot. Analogous to the logic in Kubernetes, I decided to remove the trailing dot before validation (if the dot is present) instead of adjusting the regex.
After successful validation, a trailing dot is now always appended to make the domain name an FQDN. This improves DNS performance, since for FQDNs the "search" domains in resolv.conf are not considered.
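The strip, validate, append sequence described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `to_fqdn` and `is_valid_label` are hypothetical, and the hand-written label check only approximates the RFC 1123 subdomain regex (lowercase alphanumerics and `-`, at most 253 characters in total).

```rust
/// Hypothetical helper: true for lowercase ASCII letters and digits.
fn is_alnum(b: u8) -> bool {
    b.is_ascii_lowercase() || b.is_ascii_digit()
}

/// Hypothetical helper: a label must be non-empty, consist of lowercase
/// alphanumerics or '-', and start and end with an alphanumeric.
fn is_valid_label(label: &str) -> bool {
    let bytes = label.as_bytes();
    if bytes.is_empty() {
        return false;
    }
    is_alnum(bytes[0])
        && is_alnum(bytes[bytes.len() - 1])
        && bytes.iter().all(|&b| is_alnum(b) || b == b'-')
}

/// Hypothetical sketch: strip one optional trailing dot, validate the
/// remainder, then always append a trailing dot to produce an FQDN.
fn to_fqdn(input: &str) -> Result<String, String> {
    // Remove at most one trailing dot before validation.
    let name = input.strip_suffix('.').unwrap_or(input);
    if name.is_empty() || name.len() > 253 {
        return Err(format!("invalid domain name length: {input:?}"));
    }
    if !name.split('.').all(is_valid_label) {
        return Err(format!("invalid domain name: {input:?}"));
    }
    // Validation passed: mark the name as fully qualified.
    Ok(format!("{name}."))
}
```

Under these assumptions, `to_fqdn("cluster.local")` and `to_fqdn("cluster.local.")` both yield `cluster.local.`, while inputs such as `cluster..local` are rejected.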
How to test this
General test setup
Local Kind cluster: add the line `log` (after `errors`) to the CoreDNS ConfigMap to make CoreDNS log all DNS queries, scale the CoreDNS Deployment to 1 so we can get all the logs from one CoreDNS Pod, and restart the CoreDNS Deployment to reload the config.
Install some operators:
Set up a Zookeeper cluster and a ZNode, wait for all pods to start, check CoreDNS logs
I grepped through the CoreDNS logs for `svc.cluster.local.cluster.local`, which indicates a "search" domain was appended to the DNS query (which degrades performance).
Using the current dev release, ndots=5 (default value)
Create a ZK cluster with 3 replicas:
Create a dummy ZNode:
Lines like this should appear in the CoreDNS log:
The lines around these matches show how all search domains are tried and only the final query (for `simple-zk.default.svc.cluster.local.`) succeeds.
Using the current dev release, PodOverrides with ndots=3
Patch the Zookeeper Operator deployment:
Clean up old ZK resources:
Restart CoreDNS deployment to clear the logs:
Recreate ZK cluster with ndots=3:
Wait for Pods to start, check CoreDNS log. `svc.cluster.local.cluster.local` should not be present in the log.
Using this fix, no PodOverrides
Clean up old ZK resources:
Uninstall Zookeeper operator:
Install the Zookeeper operator with the fix from this PR (`make run-dev`).
Restart CoreDNS deployment to clear the logs:
Create a normal ZK cluster without the ndots override:
Wait for Pods to start, check CoreDNS log. `svc.cluster.local.cluster.local` should not be present in the log.
I additionally tested the fix with ndots set to 7; it still works even though the FQDN contains only 5 dots. I think that's because the search domains are never used when you make a DNS query for an FQDN (with a trailing dot).
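To make the three test scenarios easier to reason about, here is a small model of how a glibc-style stub resolver orders its queries according to the `ndots` and `search` options documented in resolv.conf(5). It is only a sketch: `candidate_queries` is a hypothetical name, the search list in the usage note is the usual one for a Pod in the `default` namespace (an assumption, not taken from the logs), and other resolver implementations may behave differently. A real resolver also stops at the first successful answer, whereas this function lists the full candidate order.

```rust
/// Hypothetical model: returns the names a stub resolver would try, in order.
fn candidate_queries(name: &str, ndots: usize, search: &[&str]) -> Vec<String> {
    // A trailing dot marks the name as absolute: the search list is skipped
    // entirely, regardless of the ndots value.
    if name.ends_with('.') {
        return vec![name.to_string()];
    }
    let absolute = format!("{name}.");
    let searched = search.iter().map(|domain| format!("{name}.{domain}."));
    let dots = name.matches('.').count();
    if dots >= ndots {
        // Enough dots: try the name as-is first, then the search domains.
        std::iter::once(absolute).chain(searched).collect()
    } else {
        // Too few dots: walk the search domains first, the name itself last.
        searched.chain(std::iter::once(absolute)).collect()
    }
}
```

With the assumed search list `default.svc.cluster.local`, `svc.cluster.local`, `cluster.local`, the name `simple-zk.default.svc.cluster.local` has 4 dots. For ndots=5 the model puts all three search-domain variants first (including `simple-zk.default.svc.cluster.local.cluster.local.`) and the plain name last; for ndots=3 the plain name comes first; and for `simple-zk.default.svc.cluster.local.` with ndots=7 there is exactly one candidate, which is consistent with the observation above.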
Definition of Done Checklist
Author
Reviewer
Acceptance