Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for disabling use of TLS per domain suffix #162

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

cha7ri
Copy link

@cha7ri cha7ri commented Nov 15, 2021

Context:

We have clusters with two ingress controllers:

  • Public ingress controller using the domain example.com
  • Private ingress controller using the doamin private.example.com

What we want to achieve?

  • Use letsencrypt to issue certificates for example.com
  • Disable tls for private.example.com

Solution:

  • Set use-ingress-tls to default_on to enable tls by default for all ingresses
  • Add a new config option to FIAAS to disable tls for one or many domains. example tls-certificate-issuer-disable-for-domain-suffixes=private.example.com

What will change:

  • Group hosts that we want to disable tls for together in one ingress resource (if the user doesn't set annotations).
  • Do not add kubernetes.io/tls-acme: true or cert-manager.io/cluster-issuer: nameOfClusterIssuer annotation to the ingress resource.
  • This will tell cert-manager to not create a certificate for those hosts

@cha7ri cha7ri requested a review from a team as a code owner November 15, 2021 20:08
docs/operator_guide.md Outdated Show resolved Hide resolved
Copy link
Member

@oyvindio oyvindio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current implementation has behaviour that could be unexpected for some application configurations, see comment in-line.

tests/fiaas_deploy_daemon/test_config.py Outdated Show resolved Hide resolved
docs/operator_guide.md Outdated Show resolved Hide resolved
fiaas_deploy_daemon/deployer/kubernetes/ingress.py Outdated Show resolved Hide resolved
@birgirst birgirst changed the title disable TLS per domain suffix Add support for disabling use of TLS per domain suffix Dec 13, 2021
…is is not true because they can be part of diffirent ingress groups
Copy link
Member

@oyvindio oyvindio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for improving the tests and even adding e2e tests here. I've taken another look at this and left a few comments and questions in line

Comment on lines +179 to +183
# This change to fix the issue: when we dont have any default ingress item the ingress is added to the ingresses list anyway. Now it will be added only if we have atleast one default ingress item (host)
default_ingress = default_ingresses.setdefault("default",
AnnotatedIngress(name="", ingress_items=[], annotations={},
explicit_host=explicit_host, issuer_type=self._tls_issuer_type_default,
default=True))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the background for the change in the grouping logic+how the default ingress is set up. From my understanding of the existing flow it seems like it might not be necessary, and I think it would be good to avoid the additional complexity if possible.

As I understand it, there are already configurations where the default ingress (or any other ingress) could have no ingres items, for example if no ingress suffixes are configured and an application only has a ingress with annotations, like the following:

ingress:
  - host: example.com
    annotations:
      foo: bar 

If this happens, the default ingress should be skipped here (and deleted if it already exists):

if len(annotated_ingress.ingress_items) == 0:
LOG.info("No items, skipping: %s", annotated_ingress)
continue

Comment on lines +331 to +334
elif app_spec.ingress_tls.enabled is True:
return True
elif self._should_disable_ingress_tls(hosts) is True:
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could lead to unexpected behavior: If extensions.tls.enabled is set to true in the application's configuration, the "no_tls" ingress would still get a .spec.tls section, because the check on line 333 is never run as the function returned on 332.

As a sidenote, it might be a bit clearer and more efficient to add a boolean field to AnnotatedIngress which is set on the "no_tls" ingress when grouping ingress items, indicating that tls should not be enabled for this ingress.

def wait_until_fdd_ready(self, k8s_version, kubernetes, ready):
wait_until(ready, "web-interface healthy", RuntimeError, patience=PATIENCE)
if crd_supported(k8s_version):
wait_until(
crd_available(kubernetes, timeout=TIMEOUT),
"CRD available", RuntimeError, patience=PATIENCE
)

def prepare_fdd(self, request, kubernetes, k8s_version, use_docker_for_e2e, service_type, service_account=False):
def prepare_fdd(self, request, kubernetes, k8s_version, use_docker_for_e2e, service_type, service_account=False, disable_tls_for_doamin_suffixes=False, extra_args=[]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is a typo in the new parameter here; doamin should probably be domain?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name: v3-data-examples-tls-disable-tls-for-one-host
finalizers: []
spec:
tls: []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that to fully test the new behavior in this PR, the test should verify that this ingress has a .spec.tls section. Since tls is disabled (default_off) at the fiaas-deploy-daemon level and not enabled at the application config level, only the hostname grouping is tested here. (similarly in tls_disable_tls_for_one_default_host1.yml).

Comment on lines 120 to 124
try:
daemon = subprocess.Popen(args, stdout=sys.stderr, env=merge_dicts(os.environ, {"NAMESPACE": "default"}))
time.sleep(1)
if daemon.poll() is not None:
pytest.fail("fiaas-deploy-daemon has crashed after startup, inspect logs")
self.wait_until_fdd_ready(k8s_version, kubernetes_service_account, ready)
daemon=self.start_fdd(args,port,ready,k8s_version,kubernetes_per_app_service_account)
yield "http://localhost:{}/fiaas".format(port)
finally:
self._end_popen(daemon)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collecting the common logic into start_fdd here is a nice improvement. However there is a small change in behavior here which makes tests more difficult to troubleshoot: If start_fdd raises a exception (typically because wait_until_fdd_ready fails because of some issue), the exception is masked by a UnboundLocalError raised in the finally block when trying to reference daemon, since it does not have a value. I think the process may leak if this happens too, as it is not passed to _end_popen.

I only noticed this because tests using this fixture was failing and the UnboundLocalError masked a NameError because kubernetes_per_app_service_account is not defined

def ingress_tls_disable_tls_for_domain_suffixes(self, config,request):
config.tls_certificate_issuer_disable_for_domain_suffixes = request.param["tls_certificate_issuer_disable_for_domain_suffixes"]
config.tls_certificate_issuer_type_overrides = {
"other.cloud.com": "certmanager.k8s.io/issuer"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid referring to unknown third party domains (several places in this file), I think it is better to use one of example.{com,org,net} in test data such as here

)

@pytest.mark.usefixtures("delete")
def test_applies_tls_certificate_issuer_disable_for_domain_suffixes(self, deployer_disable_tls_for_domain_suffixes, ingress_tls_disable_tls_for_domain_suffixes, app_spec,ingresses_spec,expected_host_groups,hosts,use_suffixes):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test would better verify the changes in behavior in this PR if it verified the complete flow from app_spec/ingress items to the resulting ingress resources, similar to test_ingress_deploy. It seems to me that one of the essential things that should be tested is not only how ingress items are grouped into ingresses, but also which of the ingresses are set up with TLS. As I understand the test as it is now, it doesn't verify this behavior since IngressTls.apply is mocked

@oyvindio
Copy link
Member

oyvindio commented Jan 6, 2022

/sem-approve

@arealmaas
Copy link

Any expected timeline for when this is to be merged? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants