Setup logging on CAPA installations #2935

Closed
5 tasks done
Tracked by #2951 ...
TheoBrigitte opened this issue Oct 31, 2023 · 9 comments
Assignees
Labels
epic/logging provider/cluster-api-aws Cluster API based running on AWS team/atlas Team Atlas

Comments

@TheoBrigitte
Member

TheoBrigitte commented Oct 31, 2023

Towards: #311

To expand our logging infrastructure coverage, we should extend support to CAPA installations.

Tasks

  1. component/crossplane epic/logging provider/cluster-api-aws topic/capa-ga topic/observability
    QuentinBisson
  2. epic/logging provider/cluster-api-aws team/atlas
    QuentinBisson
  3. epic/logging provider/cluster-api-aws team/atlas
    QuentinBisson
  4. epic/logging provider/cluster-api-aws team/atlas
    QuentinBisson
  5. component/promtail epic/logging team/atlas
    QuentinBisson

How to validate

  • I can read logs from Loki using Grafana on each AWS installation
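
The check above goes through Grafana, but it could also be scripted against Loki directly. The following is a minimal sketch, assuming Loki's standard `query_range` HTTP endpoint is reachable; the base URL, the catch-all selector `{job=~".+"}`, and the helper names `build_query_url` and `has_logs` are illustrative assumptions, and authentication and tenant headers are left out.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, logql, limit=1):
    """Build a Loki query_range URL for a LogQL selector (hypothetical helper)."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


def has_logs(response_body):
    """Return True if a query_range JSON response contains at least one log line."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False
    result = payload.get("data", {}).get("result", [])
    # Each stream entry carries a "values" list of [timestamp, line] pairs.
    return any(stream.get("values") for stream in result)


if __name__ == "__main__":
    # Assumed placeholder host; replace with the Loki URL of an installation.
    print(build_query_url("https://loki.example.test", '{job=~".+"}'))
```

Fetching the URL (with whatever credentials the installation requires) and passing the response body to `has_logs` gives a per-installation yes/no answer.
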
@QuentinBisson

All items are deployed. Let's check whether we have logs on the WCs.

@TheoBrigitte
Member Author

Works on 2 out of 8 installations.

Most of the failing ones return a NetworkError in Grafana when querying for logs. goten has a different issue: I can't reach Grafana at all.

  • anzu 🔴
  • gazelle 🟢
  • goat 🔴
  • golem 🟢
  • goten 🔴
  • grizzly 🔴
  • velvet 🔴
  • snail 🔴

@QuentinBisson

I fixed the root cause (an incorrect Cilium network policy for object-storage-operator). I'll release it later. Goten is ephemeral (mc-bootstrap is running; it's not a real installation).

@QuentinBisson

Currently it is only failing on Goat and Anzu. I'll investigate later.

@QuentinBisson

Anzu is fixed now.

Goat is working, but promtail is failing because the certificate served for Loki is not valid for its hostname:
level=warn ts=2023-11-13T16:50:41.225750028Z caller=client.go:419 component=client host=loki.goat.gaws.gigantic.io msg="error sending batch, will retry" status=-1 tenant=goat error="Post \"https://loki.goat.gaws.gigantic.io/loki/api/v1/push\": tls: failed to verify certificate: x509: certificate is valid for ingress.local, not loki.goat.gaws.gigantic.io"
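
To illustrate why the client rejects this certificate, here is a minimal sketch of the hostname check, assuming simplified matching rules (exact match, or a single left-most wildcard label). The helper names `san_matches`, `cert_covers`, and `check_tls` are hypothetical and are not part of promtail or Loki; `check_tls` simply attempts a verified handshake with Python's standard `ssl` module.

```python
import socket
import ssl


def san_matches(hostname, san):
    """Simplified match of a hostname against one certificate DNS name."""
    hostname = hostname.lower().rstrip(".")
    san = san.lower().rstrip(".")
    if san.startswith("*."):
        # A wildcard is assumed to cover exactly one left-most label.
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == san[2:]
    return hostname == san


def cert_covers(hostname, sans):
    """Return True if any DNS name in the certificate covers the hostname."""
    return any(san_matches(hostname, san) for san in sans)


def check_tls(host, port=443, timeout=5.0):
    """Try a verified TLS handshake; return None on success, else the verify message."""
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return None
    except ssl.SSLCertVerificationError as err:
        return err.verify_message


if __name__ == "__main__":
    # The certificate in the log above only names ingress.local.
    print(cert_covers("loki.goat.gaws.gigantic.io", ["ingress.local"]))
```

Running `check_tls("loki.goat.gaws.gigantic.io")` from a machine that can reach the endpoint would show whether the served certificate now verifies.
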

@QuentinBisson

Issue for Goat:

#2956

We had this for prometheus agent as well.

@TheoBrigitte should we close this one (I'll let you validate, of course) and keep only the Goat issue?

@QuentinBisson

Goat is fixed when using giantswarm/logging-operator#87

@TheoBrigitte TheoBrigitte removed the needs/refinement Needs refinement in order to be actionable label Nov 14, 2023
@QuentinBisson

Loki is now running on CAPA. Let's announce that.
