Traefik forwards traffic to backends which are down #262

Open
gnuoy opened this issue Oct 4, 2023 · 3 comments

gnuoy commented Oct 4, 2023

Bug Description

Traefik does not seem to perform a liveness check on the backends it forwards traffic to, so client requests fail whenever a backend is down.

Perhaps charms.traefik_k8s.v2.ingress should let the requirer pass a health check URL?

To Reproduce

  1. Deploy this bundle: https://opendev.org/openstack/charm-keystone-k8s/src/branch/main/tests/bundles/smoke.yaml
  2. Add a keystone unit: juju add-unit keystone
  3. Wait for the new unit to be ready
  4. Get the auth URL: URL=$(juju run keystone/leader get-admin-account | awk 'BEGIN {FS="="} /OS_AUTH_URL/ {print $NF}')
  5. curl $URL (repeat several times to confirm both backends are alive, since Traefik round-robins between them)
  6. Stop keystone on one unit: juju ssh --container keystone keystone/1 "pebble stop wsgi-keystone"
  7. Repeat step 5: every other request now returns 502 Bad Gateway
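
Steps 5 and 7 can also be scripted so the alternating failures show up as counts. This is a rough sketch only: `fetch_status` and `tally_statuses` are hypothetical helper names, and the URL is assumed to be the OS_AUTH_URL value from step 4.

```python
import collections
import sys
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a single GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses (for example 502 Bad Gateway) are raised as HTTPError.
        return exc.code


def tally_statuses(url, attempts=10, fetch=fetch_status):
    """Issue `attempts` requests and count how often each status code occurs."""
    return collections.Counter(fetch(url) for _ in range(attempts))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (script name is arbitrary): python3 tally.py "$URL"
    for status, count in sorted(tally_statuses(sys.argv[1]).items()):
        print(status, count)
```

With both backends up, all requests should land on the same successful status; after step 6, roughly half of them would be expected to report 502.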

Example output: https://paste.ubuntu.com/p/zHrmjFrQ7g/

Environment

juju 3.2.3-genericlinux-amd64
Controller in microk8s
Traefik charm: 1.0/candidate r148

Relevant log output

2023-10-04T09:04:43.426Z [traefik] time="2023-10-04T09:04:43Z" level=debug msg="'502 Bad Gateway' caused by: dial tcp 10.1.188.231:5000: connect: connection refused"
2023-10-04T09:04:44.446Z [traefik] time="2023-10-04T09:04:44Z" level=debug msg="'502 Bad Gateway' caused by: dial tcp 10.1.188.231:5000: connect: connection refused"

Additional context

No response

@PietroPasotti
Collaborator

We could definitely wrap Traefik's health check support: https://doc.traefik.io/traefik/routing/services/#health-check
We'll discuss prioritization in the next backlog refinement.
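
For reference, a minimal sketch of what wrapping that feature might look like on the config-generation side. It builds a Traefik dynamic-configuration dict with the documented `loadBalancer.healthCheck` keys (`path`, `interval`, `timeout`). The function name `build_service_config`, the `/healthcheck` path, and the service name are assumptions for illustration; this is not the charm's actual code or schema.

```python
def build_service_config(name, server_urls, health_path=None,
                         interval="10s", timeout="3s"):
    """Build a Traefik dynamic-config dict for one load-balanced HTTP service.

    health_path is assumed to come from the requirer over relation data;
    when it is None, no health check is configured (the current behavior).
    """
    load_balancer = {"servers": [{"url": url} for url in server_urls]}
    if health_path:
        # Traefik polls this path on every server and takes servers that
        # fail the check out of the load-balancing rotation.
        load_balancer["healthCheck"] = {
            "path": health_path,
            "interval": interval,
            "timeout": timeout,
        }
    return {"http": {"services": {name: {"loadBalancer": load_balancer}}}}


# Example with the backend address from the log output above;
# the health path is hypothetical.
config = build_service_config(
    "keystone",
    ["http://10.1.188.231:5000"],
    health_path="/healthcheck",
)
```

Keeping `health_path` optional would let requirers that do not pass one keep working unchanged.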

@sed-i
Contributor

sed-i commented Oct 11, 2023

Path to a health check seems like a reasonable addition to the reldata schema.

@gboutry

gboutry commented Aug 12, 2024

@PietroPasotti We are still hitting this issue, is there any news?
