Traefik is stuck executing after the setup of some ingress relation failed #366

Open
motjuste opened this issue May 29, 2024 · 1 comment

motjuste commented May 29, 2024

Bug Description

We at SolutionsQA have noticed a failing deployment of COS in which the traefik unit stays stuck in the executing state for 4 hours. The debug logs in the relevant crashdump show only the following errors for the unit, with no apparent exception details (the full file is attached).

8138:unit-traefik-0: 2024-05-28 20:07:14 ERROR unit.traefik/0.juju-log ingress:23: failed processing the ingress relation <ops.model.Relation ingress-per-unit:7>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:7> wiped.'
8158:unit-traefik-0: 2024-05-28 20:07:14 ERROR unit.traefik/0.juju-log ingress:23: failed processing the ingress relation <ops.model.Relation ingress-per-unit:8>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:8> wiped.'
8187:unit-traefik-0: 2024-05-28 20:07:14 ERROR unit.traefik/0.juju-log ingress:23: failed processing the ingress relation <ops.model.Relation traefik-route:9>: 'provider is not ready: ingress for <ops.model.Relation traefik-route:9> wiped.'
8200:unit-traefik-0: 2024-05-28 20:07:14 ERROR unit.traefik/0.juju-log ingress:23: The setup of some ingress relation failed, see previous logs
9473:unit-traefik-0: 2024-05-28 20:07:20 ERROR unit.traefik/0.juju-log ingress:10: failed processing the ingress relation <ops.model.Relation ingress-per-unit:7>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:7> wiped.'
9497:unit-traefik-0: 2024-05-28 20:07:20 ERROR unit.traefik/0.juju-log ingress:10: failed processing the ingress relation <ops.model.Relation ingress-per-unit:8>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:8> wiped.'
9577:unit-traefik-0: 2024-05-28 20:07:20 ERROR unit.traefik/0.juju-log ingress:10: The setup of some ingress relation failed, see previous logs
12471:unit-traefik-0: 2024-05-28 20:07:35 ERROR unit.traefik/0.juju-log failed processing the ingress relation <ops.model.Relation ingress-per-unit:7>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:7> wiped.'
12500:unit-traefik-0: 2024-05-28 20:07:35 ERROR unit.traefik/0.juju-log failed processing the ingress relation <ops.model.Relation ingress-per-unit:8>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:8> wiped.'
12518:unit-traefik-0: 2024-05-28 20:07:36 ERROR unit.traefik/0.juju-log The setup of some ingress relation failed, see previous logs
13465:unit-traefik-0: 2024-05-28 20:07:43 ERROR unit.traefik/0.juju-log failed processing the ingress relation <ops.model.Relation ingress-per-unit:7>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:7> wiped.'
13477:unit-traefik-0: 2024-05-28 20:07:43 ERROR unit.traefik/0.juju-log failed processing the ingress relation <ops.model.Relation ingress-per-unit:8>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:8> wiped.'
13515:unit-traefik-0: 2024-05-28 20:07:44 ERROR unit.traefik/0.juju-log The setup of some ingress relation failed, see previous logs
14041:unit-traefik-0: 2024-05-28 20:07:47 ERROR unit.traefik/0.juju-log failed processing the ingress relation <ops.model.Relation ingress-per-unit:7>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:7> wiped.'
14053:unit-traefik-0: 2024-05-28 20:07:47 ERROR unit.traefik/0.juju-log failed processing the ingress relation <ops.model.Relation ingress-per-unit:8>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:8> wiped.'
14083:unit-traefik-0: 2024-05-28 20:07:48 ERROR unit.traefik/0.juju-log The setup of some ingress relation failed, see previous logs
14557:unit-traefik-0: 2024-05-28 20:07:51 ERROR unit.traefik/0.juju-log ingress-per-unit:7: failed processing the ingress relation <ops.model.Relation ingress-per-unit:8>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:8> wiped.'
14650:unit-traefik-0: 2024-05-28 20:07:51 ERROR unit.traefik/0.juju-log ingress-per-unit:7: The setup of some ingress relation failed, see previous logs
15449:unit-traefik-0: 2024-05-28 20:07:57 ERROR unit.traefik/0.juju-log certificates:27: failed processing the ingress relation <ops.model.Relation ingress-per-unit:8>: 'provider is not ready: ingress for <ops.model.Relation ingress-per-unit:8> wiped.'
15519:unit-traefik-0: 2024-05-28 20:07:58 ERROR unit.traefik/0.juju-log certificates:27: The setup of some ingress relation failed, see previous logs
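
To see which relations fail most often in the full debug log, the excerpt above can be summarised with a short script. This is only a triage sketch: the function name `count_failures` and the regular expression are assumptions based on the line format quoted in this issue, not part of Juju or the traefik charm.

```python
import re
from collections import Counter

# Matches the "failed processing" lines quoted above and captures the
# relation name and id from the first <ops.model.Relation name:id> on the line.
FAILED = re.compile(
    r"failed processing the ingress relation "
    r"<ops\.model\.Relation (?P<name>[\w-]+):(?P<id>\d+)>"
)


def count_failures(lines):
    """Count 'failed processing' errors per (relation name, relation id)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[(match["name"], int(match["id"]))] += 1
    return counts
```

Applied to the 20 lines quoted above, this gives 5 failures for ingress-per-unit:7, 7 for ingress-per-unit:8, and 1 for traefik-route:9; the summary lines ("The setup of some ingress relation failed") are not counted.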

To Reproduce

We use FCE to deploy microk8s on top of MAAS with Juju, and then deploy COS (the relevant SKU is fkb-master-kubernetes-focal-baremetal-kubeflow). See below for identified recurrences of this issue in other scenarios.

Environment

  • Juju 3.5/candidate (3.5.1 rev27225)
  • microk8s charm 1.28/stable (rev213)

More details for an example run can be found here. See below for other identified recurrences of this issue at SolutionsQA.

Relevant log output

See above. The full debug logs can be found in this [crashdump](https://oil-jenkins.canonical.com/artifacts/f2fe5cfc-ae75-4e82-84ee-0ce88dc484d4/generated/generated/cos/juju-crashdump-cos-2024-05-29-00.08.32.tar.gz).

Additional context

More identified occurrences of this issue can be found here.

Collaborator

PietroPasotti commented May 30, 2024

Possibly related to traefik deleting the local app databag when validation of the remote data fails.
Traefik currently listens to the relation created, joined, and changed events (legacy behaviour). It should probably listen only to changed instead.
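
The hypothesis above can be illustrated with a toy model that is independent of the real charm code. Everything in it is hypothetical: `ToyProvider`, `validate`, `REQUIRED_KEYS`, the event names as plain strings, and the assumed event order are stand-ins for illustration, not the traefik charm's or the `ops` library's API. It does not reproduce the hang reported here; it only shows how wiping the local databag on every failed validation produces "wiped" errors for events that arrive before the remote side has published its data.

```python
from dataclasses import dataclass, field

# Hypothetical schema, for illustration only.
REQUIRED_KEYS = {"name", "host", "port"}


def validate(remote_data):
    """Return True if the remote side has published every required key."""
    return REQUIRED_KEYS.issubset(remote_data)


@dataclass
class ToyProvider:
    """Toy stand-in for an ingress provider; not the real charm or ops API."""

    listen_to: frozenset
    local_databag: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)

    def handle(self, event, remote_data):
        if event not in self.listen_to:
            return  # event ignored: local databag is left untouched
        if validate(remote_data):
            host, port = remote_data["host"], remote_data["port"]
            self.local_databag["url"] = f"http://{host}:{port}"
        else:
            # Suspected behaviour: a failed validation wipes the local databag.
            self.local_databag.clear()
            self.errors.append(f"provider is not ready (on {event}): ingress wiped")


# Assumed event order: the remote data is only complete on the last event.
EVENTS = [
    ("created", {}),
    ("joined", {}),
    ("changed", {"name": "app", "host": "app.example", "port": "80"}),
]


def replay(listen_to):
    """Feed the assumed event sequence to a provider observing `listen_to`."""
    provider = ToyProvider(listen_to=frozenset(listen_to))
    for event, remote_data in EVENTS:
        provider.handle(event, remote_data)
    return provider
```

Under these assumptions, `replay({"created", "joined", "changed"})` records two "wiped" errors before the data arrives, while `replay({"changed"})` records none and ends with the same local databag.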

Details and crashdump artifacts: see https://oil-jenkins.canonical.com/artifacts/f2fe5cfc-ae75-4e82-84ee-0ce88dc484d4/index.html
