Dashboard forwarding from cos-configuration-k8s can be unreliable #312

Open
Batalex opened this issue Mar 22, 2024 · 1 comment

Batalex commented Mar 22, 2024

Bug Description

Custom dashboards forwarded by cos-configuration-k8s do not appear in the Grafana interface.

I traced the issue to this charm: the custom dashboards are present both in the relation databag and in the grafana container.

juju ssh --container grafana grafana/0 ls -1 /etc/grafana/provisioning/dashboards

default.yaml
juju_alertmanager-k8s_e9224b0.json
juju_cos-configuration-k8s_043a2b3.json
juju_cos-configuration-k8s_af3132d.json
juju_grafana-agent_0def0c2.json
juju_grafana-agent_6545430.json
juju_grafana-agent_ab32508.json
juju_grafana-agent_feefa09.json
juju_loki-k8s_0804127.json
juju_prometheus-k8s_35dd368.json
self_dashboard.json

Note that two cos-configuration-k8s files are present in the output above, yet the corresponding dashboards do not appear in Grafana.
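
For anyone who wants to check the same thing on their own deployment, here is a minimal sketch that groups the provisioned files by the charm that supplied them. It assumes the file names follow the pattern juju_<charm>_<hash>.json, which is only inferred from the listing above; the function name group_dashboards_by_charm is made up for illustration and is not part of any charm library.

```python
import re
from collections import defaultdict

# Assumed naming scheme, inferred from the listing above:
#   juju_<charm>_<short hex hash>.json
DASHBOARD_FILE = re.compile(r"^juju_(?P<charm>.+)_(?P<digest>[0-9a-f]+)\.json$")


def group_dashboards_by_charm(listing: str) -> dict[str, list[str]]:
    """Group provisioned dashboard file names by the charm in their name."""
    grouped = defaultdict(list)
    for line in listing.splitlines():
        name = line.strip()
        match = DASHBOARD_FILE.match(name)
        if match:  # default.yaml, self_dashboard.json and blank lines are skipped
            grouped[match.group("charm")].append(name)
    return dict(grouped)


# Output of the `ls -1` command above, pasted verbatim.
LISTING = """\
default.yaml
juju_alertmanager-k8s_e9224b0.json
juju_cos-configuration-k8s_043a2b3.json
juju_cos-configuration-k8s_af3132d.json
juju_grafana-agent_0def0c2.json
juju_grafana-agent_6545430.json
juju_grafana-agent_ab32508.json
juju_grafana-agent_feefa09.json
juju_loki-k8s_0804127.json
juju_prometheus-k8s_35dd368.json
self_dashboard.json
"""

if __name__ == "__main__":
    for charm, files in sorted(group_dashboards_by_charm(LISTING).items()):
        print(f"{charm}: {len(files)} file(s)")
```

The per-charm counts can then be compared by hand with the dashboards listed in the Grafana UI to see which ones were provisioned on disk but never loaded.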

I can sometimes work around the issue by scaling grafana up and down, but this is not a reliable fix.

To Reproduce

I have not found a way to reproduce the issue consistently. However, in every case I had multiple grafana agents related to the monitoring stack.

COS - juju export bundle
bundle: kubernetes
saas:
  remote-8ae57c5a420b4e8c889fd8eba6c28be9: {}
  remote-57789c2419f64cb8874a0822ebaa787b: {}
applications:
  alertmanager:
    charm: alertmanager-k8s
    channel: stable
    revision: 101
    resources:
      alertmanager-image: 87
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,2048M
    trust: true
  catalogue:
    charm: catalogue-k8s
    channel: stable
    revision: 33
    resources:
      catalogue-image: 32
    scale: 1
    options:
      description: "Canonical Observability Stack Lite, or COS Lite, is a light-weight,
        highly-integrated, \nJuju-based observability suite running on Kubernetes.\n"
      tagline: Model-driven Observability Stack deployed with a single command.
      title: Canonical Observability Stack
    constraints: arch=amd64
    trust: true
  cos-configuration-k8s:
    charm: cos-configuration-k8s
    channel: stable
    revision: 45
    resources:
      git-sync-image: 32
    scale: 1
    options:
      git_branch: main
      git_repo: https://github.com/batalex/cos-rules
      grafana_dashboards_path: grafana/dashboards/
      prometheus_alert_rules_path: rules/
    constraints: arch=amd64
    storage:
      content-from-git: kubernetes,1,1024M
    trust: true
  grafana:
    charm: grafana-k8s
    channel: stable
    revision: 105
    resources:
      grafana-image: 68
      litestream-image: 43
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,2048M
    trust: true
  loki:
    charm: loki-k8s
    channel: stable
    revision: 118
    resources:
      loki-image: 91
    scale: 1
    constraints: arch=amd64
    storage:
      active-index-directory: kubernetes,1,2048M
      loki-chunks: kubernetes,1,10240M
    trust: true
  prometheus:
    charm: prometheus-k8s
    channel: stable
    revision: 170
    resources:
      prometheus-image: 139
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,10240M
    trust: true
  traefik:
    charm: traefik-k8s
    channel: stable
    revision: 169
    resources:
      traefik-image: 158
    scale: 1
    constraints: arch=amd64
    storage:
      configurations: kubernetes,1,1024M
    trust: true
relations:
- - traefik:ingress-per-unit
  - prometheus:ingress
- - traefik:ingress-per-unit
  - loki:ingress
- - traefik:traefik-route
  - grafana:ingress
- - traefik:ingress
  - alertmanager:ingress
- - prometheus:alertmanager
  - alertmanager:alerting
- - grafana:grafana-source
  - prometheus:grafana-source
- - grafana:grafana-source
  - loki:grafana-source
- - grafana:grafana-source
  - alertmanager:grafana-source
- - loki:alertmanager
  - alertmanager:alerting
- - prometheus:metrics-endpoint
  - traefik:metrics-endpoint
- - prometheus:metrics-endpoint
  - alertmanager:self-metrics-endpoint
- - prometheus:metrics-endpoint
  - loki:metrics-endpoint
- - prometheus:metrics-endpoint
  - grafana:metrics-endpoint
- - grafana:grafana-dashboard
  - loki:grafana-dashboard
- - grafana:grafana-dashboard
  - prometheus:grafana-dashboard
- - grafana:grafana-dashboard
  - alertmanager:grafana-dashboard
- - catalogue:ingress
  - traefik:ingress
- - catalogue:catalogue
  - grafana:catalogue
- - catalogue:catalogue
  - prometheus:catalogue
- - catalogue:catalogue
  - alertmanager:catalogue
- - grafana:grafana-dashboard
  - remote-57789c2419f64cb8874a0822ebaa787b:grafana-dashboards-provider
- - loki:logging
  - remote-57789c2419f64cb8874a0822ebaa787b:logging-consumer
- - prometheus:receive-remote-write
  - remote-57789c2419f64cb8874a0822ebaa787b:send-remote-write
- - grafana:grafana-dashboard
  - remote-8ae57c5a420b4e8c889fd8eba6c28be9:grafana-dashboards-provider
- - loki:logging
  - remote-8ae57c5a420b4e8c889fd8eba6c28be9:logging-consumer
- - prometheus:receive-remote-write
  - remote-8ae57c5a420b4e8c889fd8eba6c28be9:send-remote-write
- - cos-configuration-k8s:grafana-dashboards
  - grafana:grafana-dashboard
- - cos-configuration-k8s:prometheus-config
  - prometheus:metrics-endpoint
--- # overlay.yaml
applications:
  alertmanager:
    offers:
      alertmanager-karma-dashboard:
        endpoints:
        - karma-dashboard
        acl:
          admin: admin
  grafana:
    offers:
      grafana-dashboards:
        endpoints:
        - grafana-dashboard
        acl:
          admin: admin
  loki:
    offers:
      loki-logging:
        endpoints:
        - logging
        acl:
          admin: admin
  prometheus:
    offers:
      prometheus-receive-remote-write:
        endpoints:
        - receive-remote-write
        acl:
          admin: admin

Environment

  • multipass using charm-dev blueprint
  • juju version 3.1.7
  • COS stack deployed using cos-lite bundle

Relevant log output

unit-grafana-0: 14:44:51 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 14:49:04 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7f8f52db4090> returned None; continuing with tracing DISABLED.
unit-grafana-0: 14:49:05 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 14:53:29 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7fc7421142c0> returned None; continuing with tracing DISABLED.
unit-grafana-0: 14:53:29 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 14:57:51 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7fa9c039e220> returned None; continuing with tracing DISABLED.
unit-grafana-0: 14:57:52 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-grafana-0: 15:02:21 WARNING unit.grafana/0.juju-log <class '__main__.GrafanaCharm'>.<property object at 0x7fd1153ac2c0> returned None; continuing with

Additional context

No response

@lucabello (Contributor)

We are probably missing a restart on that hook!
