
Fresh install of tobs fails with promscale db password #562

Closed
@lenaxia

Description


What did you do?
This is a fresh install of tobs into a namespace using helm and fluxcd.

https://github.com/lenaxia/k3s-ops-dev/blob/main/components/apps/base/monitoring/tobs/helm-release.yaml
https://github.com/lenaxia/k3s-ops-dev/blob/main/components/apps/dev/tobs-values.yaml

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: tobs
  namespace: monitoring
spec:
  chart:
    spec:
      version: "12.0.1"
  install:
    createNamespace: true
    remediation:
      retries: 10
  upgrade:
    remediation:
      retries: 10
  values:
    opentelemetry-operator:
      enabled: false

    promscale:
      enabled: true
      #image: timescale/promscale:0.8.0
      service:
        type: LoadBalancer

    timescaledb-single:
      enabled: true
      replicaCount: 1
      loadBalancer:
        enabled: true
      persistentVolumes:
        data:
          size: 11Gi
        wal:
          size: 5Gi
      backup:
        enabled: false
      #env:
      #  PGBACKREST_REPO1_S3_BUCKET
      #  PGBACKREST_REPO1_S3_ENDPOINT
      #  PGBACKREST_REPO1_S3_REGION
      #  PGBACKREST_REPO1_S3_KEY
      #  PGBACKREST_REPO1_S3_KEY_SECRET

    kube-prometheus-stack:
      enabled: true

      alertManager:
        enabled: true
        alertmanagerSpec:
          replicas: 1

      grafana:
        enabled: true

        prometheus:
          datasource:
            enabled: true
        timescale:
          datasource:
            enabled: true

        adminPassword: SOME_PASSWORD_HERE

        ingress:
          enabled: true
          ingressClassName: "traefik"
          annotations:
            hajimari.io/enable: "true"
            hajimari.io/icon: "mdiPlayNetwork"
            #cert-manager.io/cluster-issuer: "letsencrypt-staging"
            cert-manager.io/cluster-issuer: "ca-issuer"
            traefik.ingress.kubernetes.io/router.entrypoints: "websecure"
          hosts:
            - &hostGrafana "grafana.${SECRET_DEV_DOMAIN}"
          tls:
            - hosts:
                - *hostGrafana
              secretName: *hostGrafana

      prometheus:
        prometheusSpec:
          replicas: 1
          scrapeInterval: 1m
          scrapeTimeout: 10s
          evaluationInterval: 1m
          retention: 1d
          storageSpec:
            volumeClaimTemplate:
              spec:
                resources:
                  requests:
                    storage: 3Gi

        ingress:
          enabled: true
          ingressClassName: "traefik"
          annotations:
            #cert-manager.io/cluster-issuer: "letsencrypt-staging"
            cert-manager.io/cluster-issuer: "ca-issuer"
            traefik.ingress.kubernetes.io/router.entrypoints: "websecure"
          hosts:
            - &hostProm "prometheus.${SECRET_DEV_DOMAIN}"
          tls:
            - hosts:
                - *hostProm
              secretName: *hostProm

pod/tobs-promscale ends up in a crash loop, unable to connect to the TimescaleDB instance:

level=error ts=2022-08-24T01:36:19.275Z caller=runner.go:116 msg="aborting startup due to error" err="failed to connect to `host=tobs.monitoring.svc user=postgres database=postgres`: server error (FATAL: password authentication failed for user \"postgres\" (SQLSTATE 28P01))"
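
For reference, the password mismatch the error points at can be checked without dumping the full secrets (a sketch, assuming the chart's default secret names tobs-promscale and tobs-credentials in the monitoring namespace; the full secret dumps are in the logs section below):

# decode the password promscale is configured with
kubectl get secret tobs-promscale -n monitoring \
  -o jsonpath='{.data.PROMSCALE_DB_PASSWORD}' | base64 -d; echo

# decode the superuser password generated for TimescaleDB
kubectl get secret tobs-credentials -n monitoring \
  -o jsonpath='{.data.PATRONI_SUPERUSER_PASSWORD}' | base64 -d; echo

As the secret dumps below show, both values decode to the same password, so the two secrets agree with each other.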

Did you expect to see something different?

tobs should have installed without issue.

Environment

  • tobs version:
spec:
  chart:
    spec:
      version: "12.0.1"
  • Kubernetes version information:

    kubectl version

Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:38:05Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7+k3s1", GitCommit:"ac70570999c566ac3507d2cc17369bb0629c1cc0", GitTreeState:"clean", BuildDate:"2021-11-29T16:40:13Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.21) exceeds the supported minor version skew of +/-1
  • Kubernetes cluster kind:

K3s installed via Flux:

flux version

helm-controller: v0.18.1
kustomize-controller: v0.22.1
notification-controller: v0.23.1
source-controller: v0.22.2

flux check

► checking prerequisites
✗ flux 0.28.2 <0.32.0 (new version is available, please upgrade)
✔ Kubernetes 1.21.7+k3s1 >=1.20.6-0
► checking controllers
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.22.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.23.1
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.18.1
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.22.2
✔ all checks passed
  • tobs Logs:

kcl tobs-promscale-788c855fc5-59rqv -n monitoring

level=error ts=2022-08-24T01:36:19.275Z caller=runner.go:116 msg="aborting startup due to error" err="failed to connect to `host=tobs.monitoring.svc user=postgres database=postgres`: server error (FATAL: password authentication failed for user \"postgres\" (SQLSTATE 28P01))"

kc get secret tobs-credentials -n monitoring -o yaml

apiVersion: v1
data:
  PATRONI_REPLICATION_PASSWORD: bFZVRDJkRE9uY05UZm4wVA==
  PATRONI_SUPERUSER_PASSWORD: RXB5VlByYzc2NE15MVQyRg==
  PATRONI_admin_PASSWORD: TGVBSDlKS2lyOUhOVTFqNA==
kind: Secret
metadata:
  annotations:
    helm.sh/hook: pre-install,post-delete
    helm.sh/hook-weight: "0"
    helm.sh/resource-policy: keep
  creationTimestamp: "2022-08-24T01:32:09Z"
  labels:
    app: tobs-timescaledb
    cluster-name: tobs
  name: tobs-credentials
  namespace: monitoring
  resourceVersion: "22929956"
  uid: 2cbf2b64-c369-405f-8472-f85ac5ef289d
type: Opaque

echo RXB5VlByYzc2NE15MVQyRg== | base64 -d

EpyVPrc764My1T2F

kubectl describe deploy tobs-promscale -n monitoring

Name:               tobs-promscale
Namespace:          monitoring
CreationTimestamp:  Wed, 24 Aug 2022 01:32:22 +0000
Labels:             app=tobs-promscale
                    app.kubernetes.io/component=connector
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=tobs-promscale
                    app.kubernetes.io/version=0.13.0
                    chart=promscale-0.13.0
                    helm.toolkit.fluxcd.io/name=tobs
                    helm.toolkit.fluxcd.io/namespace=monitoring
                    heritage=Helm
                    release=tobs
Annotations:        deployment.kubernetes.io/revision: 1
                    meta.helm.sh/release-name: tobs
                    meta.helm.sh/release-namespace: monitoring
Selector:           app=tobs-promscale,release=tobs
Replicas:           1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:       Recreate
MinReadySeconds:    0
Pod Template:
  Labels:           app=tobs-promscale
                    app.kubernetes.io/component=connector
                    app.kubernetes.io/name=tobs-promscale
                    app.kubernetes.io/version=0.13.0
                    chart=promscale-0.13.0
                    heritage=Helm
                    release=tobs
  Annotations:      checksum/config: a1171a41877cc559fe699480d7c9bc731055fde6ccbe0b47e5c9a279cfe38962
                    checksum/connection: d610b61926215912316a5f9c07435dd69b06894ed8e640bbd7c2bc21c51a16fa
                    prometheus.io/path: /metrics
                    prometheus.io/port: 9201
                    prometheus.io/scrape: false
  Service Account:  tobs-promscale
  Containers:
   promscale:
    Image:       timescale/promscale:0.13.0
    Ports:       9201/TCP, 9202/TCP
    Host Ports:  0/TCP, 0/TCP
    Args:
      -config=/etc/promscale/config.yaml
      --metrics.high-availability=true
    Requests:
      cpu:      30m
      memory:   500Mi
    Readiness:  http-get http://:metrics-port/healthz delay=0s timeout=15s period=15s #success=1 #failure=3
    Environment Variables from:
      tobs-promscale  Secret  Optional: false
    Environment:
      TOBS_TELEMETRY_INSTALLED_BY:         promscale
      TOBS_TELEMETRY_VERSION:              0.13.0
      TOBS_TELEMETRY_INSTALLED_BY:         helm
      TOBS_TELEMETRY_VERSION:              0.13.0
      TOBS_TELEMETRY_TRACING_ENABLED:      true
      TOBS_TELEMETRY_TIMESCALEDB_ENABLED:  true
    Mounts:
      /etc/promscale/ from configs (rw)
  Volumes:
   configs:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tobs-promscale
    Optional:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      False   MinimumReplicasUnavailable
  Progressing    True    ReplicaSetUpdated
OldReplicaSets:  <none>
NewReplicaSet:   tobs-promscale-788c855fc5 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  7m    deployment-controller  Scaled up replica set tobs-promscale-788c855fc5 to 1

kc get secret tobs-promscale -n monitoring -o yaml

apiVersion: v1
data:
  PROMSCALE_DB_HOST: dG9icy5tb25pdG9yaW5nLnN2Yw==
  PROMSCALE_DB_NAME: cG9zdGdyZXM=
  PROMSCALE_DB_PASSWORD: RXB5VlByYzc2NE15MVQyRg==
  PROMSCALE_DB_PORT: NTQzMg==
  PROMSCALE_DB_SSL_MODE: cmVxdWlyZQ==
  PROMSCALE_DB_USER: cG9zdGdyZXM=
kind: Secret
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"PROMSCALE_DB_HOST":"dG9icy5tb25pdG9yaW5nLnN2Yw==","PROMSCALE_DB_NAME":"cG9zdGdyZXM=","PROMSCALE_DB_PASSWORD":"RXB5VlByYzc2NE15MVQyRg==","PROMSCALE_DB_PORT":"NTQzMg==","PROMSCALE_DB_SSL_MODE":"cmVxdWlyZQ==","PROMSCALE_DB_USER":"cG9zdGdyZXM="},"kind":"Secret","metadata":{"annotations":{"meta.helm.sh/release-name":"tobs","meta.helm.sh/release-namespace":"monitoring"},"creationTimestamp":"2022-08-24T01:32:18Z","labels":{"app":"tobs-promscale","app.kubernetes.io/component":"connector","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"tobs-promscale","app.kubernetes.io/version":"0.13.0","chart":"promscale-0.13.0","helm.toolkit.fluxcd.io/name":"tobs","helm.toolkit.fluxcd.io/namespace":"monitoring","heritage":"Helm","release":"tobs"},"name":"tobs-promscale","namespace":"monitoring","resourceVersion":"22930073","uid":"eb651b96-5c5e-4c79-bfc0-64462bbd0b72"},"type":"Opaque"}
    meta.helm.sh/release-name: tobs
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2022-08-24T01:32:18Z"
  labels:
    app: tobs-promscale
    app.kubernetes.io/component: connector
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: tobs-promscale
    app.kubernetes.io/version: 0.13.0
    chart: promscale-0.13.0
    helm.toolkit.fluxcd.io/name: tobs
    helm.toolkit.fluxcd.io/namespace: monitoring
    heritage: Helm
    release: tobs
  name: tobs-promscale
  namespace: monitoring
  resourceVersion: "22930681"
  uid: eb651b96-5c5e-4c79-bfc0-64462bbd0b72
type: Opaque

echo RXB5VlByYzc2NE15MVQyRg== | base64 -d

EpyVPrc764My1T2F
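
Since both secrets decode to the same value, the remaining question is whether the running database actually accepts that password. A minimal way to test this directly (a sketch; the pod name tobs-timescaledb-0 is an assumption based on the chart's usual naming for a cluster called tobs, so adjust to the actual pod name):

# exec into the TimescaleDB pod and force a TCP connection so password auth is exercised
# (local socket connections may be trusted and would mask the problem)
kubectl exec -it tobs-timescaledb-0 -n monitoring -- \
  bash -c 'PGPASSWORD=EpyVPrc764My1T2F psql -h localhost -U postgres -d postgres -c "SELECT 1;"'

If this also fails with SQLSTATE 28P01, the database was initialised with a different superuser password than the one currently stored in tobs-credentials.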

Anything else we need to know?:

Installing tobs seems to be really unstable, especially with opentelemetry enabled. I've gotten it to install cleanly once or twice, but shortly afterwards it becomes unhealthy, and now it won't install at all.
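
One thing that may be worth checking (an assumption on my part, not confirmed by the logs above): the tobs-credentials secret carries helm.sh/resource-policy: keep, and the TimescaleDB data PVCs created by the StatefulSet survive a helm uninstall, so a reinstall can end up with a data volume that was initialised under an older superuser password while the chart generates fresh secrets. Listing what is left over from a previous release would confirm or rule that out:

# look for volumes and credentials left over from a previous release
# (assuming the PVCs carry the same cluster-name=tobs label as the credentials secret above)
kubectl get pvc -n monitoring -l cluster-name=tobs
kubectl get secret -n monitoring | grep tobs

# only if losing the stored data is acceptable, clearing these before reinstalling
# forces the database to be re-initialised with the newly generated password:
# kubectl delete pvc -n monitoring -l cluster-name=tobs
# kubectl delete secret tobs-credentials -n monitoring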

Labels: question (Further information is requested)