Description
What did you do?
This is a fresh install of tobs into its own namespace, managed with Helm via FluxCD:
https://github.com/lenaxia/k3s-ops-dev/blob/main/components/apps/base/monitoring/tobs/helm-release.yaml
https://github.com/lenaxia/k3s-ops-dev/blob/main/components/apps/dev/tobs-values.yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: tobs
  namespace: monitoring
spec:
  chart:
    spec:
      version: "12.0.1"
  install:
    createNamespace: true
    remediation:
      retries: 10
  upgrade:
    remediation:
      retries: 10
  values:
    opentelemetry-operator:
      enabled: false
    promscale:
      enabled: true
      #image: timescale/promscale:0.8.0
      service:
        type: LoadBalancer
    timescaledb-single:
      enabled: true
      replicaCount: 1
      loadBalancer:
        enabled: true
      persistentVolumes:
        data:
          size: 11Gi
        wal:
          size: 5Gi
      backup:
        enabled: false
      #env:
      #  PGBACKREST_REPO1_S3_BUCKET
      #  PGBACKREST_REPO1_S3_ENDPOINT
      #  PGBACKREST_REPO1_S3_REGION
      #  PGBACKREST_REPO1_S3_KEY
      #  PGBACKREST_REPO1_S3_KEY_SECRET
    kube-prometheus-stack:
      enabled: true
      alertManager:
        enabled: true
        alertmanagerSpec:
          replicas: 1
      grafana:
        enabled: true
        prometheus:
          datasource:
            enabled: true
        timescale:
          datasource:
            enabled: true
        adminPassword: SOME_PASSWORD_HERE
        ingress:
          enabled: true
          ingressClassName: "traefik"
          annotations:
            hajimari.io/enable: "true"
            hajimari.io/icon: "mdiPlayNetwork"
            #cert-manager.io/cluster-issuer: "letsencrypt-staging"
            cert-manager.io/cluster-issuer: "ca-issuer"
            traefik.ingress.kubernetes.io/router.entrypoints: "websecure"
          hosts:
            - &hostGrafana "grafana.${SECRET_DEV_DOMAIN}"
          tls:
            - hosts:
                - *hostGrafana
              secretName: *hostGrafana
      prometheus:
        prometheusSpec:
          replicas: 1
          scrapeInterval: 1m
          scrapeTimeout: 10s
          evaluationInterval: 1m
          retention: 1d
          storageSpec:
            volumeClaimTemplate:
              spec:
                resources:
                  requests:
                    storage: 3Gi
        ingress:
          enabled: true
          ingressClassName: "traefik"
          annotations:
            #cert-manager.io/cluster-issuer: "letsencrypt-staging"
            cert-manager.io/cluster-issuer: "ca-issuer"
            traefik.ingress.kubernetes.io/router.entrypoints: "websecure"
          hosts:
            - &hostProm "prometheus.${SECRET_DEV_DOMAIN}"
          tls:
            - hosts:
                - *hostProm
              secretName: *hostProm
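For anyone reproducing this, the Flux side of the release can be inspected with the usual commands (nothing tobs-specific here):

# Reconciliation status of the HelmRelease as seen by helm-controller
flux get helmreleases -n monitoring
kubectl describe helmrelease tobs -n monitoring

# Values that helm-controller actually rendered into the Helm release
helm get values tobs -n monitoring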
The tobs-promscale pod ends up in a crash loop, unable to connect to TimescaleDB:
level=error ts=2022-08-24T01:36:19.275Z caller=runner.go:116 msg="aborting startup due to error" err="failed to connect to `host=tobs.monitoring.svc user=postgres database=postgres`: server error (FATAL: password authentication failed for user \"postgres\" (SQLSTATE 28P01))"
Did you expect to see something different?
tobs should have installed without issue.
Environment
- tobs version: 12.0.1 (the chart version set in the HelmRelease spec above)
- Kubernetes version information:
kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:38:05Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7+k3s1", GitCommit:"ac70570999c566ac3507d2cc17369bb0629c1cc0", GitTreeState:"clean", BuildDate:"2021-11-29T16:40:13Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.21) exceeds the supported minor version skew of +/-1
- Kubernetes cluster kind: K3s, with workloads deployed via Flux:
flux version
helm-controller: v0.18.1
kustomize-controller: v0.22.1
notification-controller: v0.23.1
source-controller: v0.22.2
flux check
► checking prerequisites
✗ flux 0.28.2 <0.32.0 (new version is available, please upgrade)
✔ Kubernetes 1.21.7+k3s1 >=1.20.6-0
► checking controllers
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.22.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.23.1
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.18.1
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.22.2
✔ all checks passed
- tobs Logs:
kubectl logs tobs-promscale-788c855fc5-59rqv -n monitoring
level=error ts=2022-08-24T01:36:19.275Z caller=runner.go:116 msg="aborting startup due to error" err="failed to connect to `host=tobs.monitoring.svc user=postgres database=postgres`: server error (FATAL: password authentication failed for user \"postgres\" (SQLSTATE 28P01))"
kubectl get secret tobs-credentials -n monitoring -o yaml
apiVersion: v1
data:
  PATRONI_REPLICATION_PASSWORD: bFZVRDJkRE9uY05UZm4wVA==
  PATRONI_SUPERUSER_PASSWORD: RXB5VlByYzc2NE15MVQyRg==
  PATRONI_admin_PASSWORD: TGVBSDlKS2lyOUhOVTFqNA==
kind: Secret
metadata:
  annotations:
    helm.sh/hook: pre-install,post-delete
    helm.sh/hook-weight: "0"
    helm.sh/resource-policy: keep
  creationTimestamp: "2022-08-24T01:32:09Z"
  labels:
    app: tobs-timescaledb
    cluster-name: tobs
  name: tobs-credentials
  namespace: monitoring
  resourceVersion: "22929956"
  uid: 2cbf2b64-c369-405f-8472-f85ac5ef289d
type: Opaque
echo RXB5VlByYzc2NE15MVQyRg== | base64 -d
EpyVPrc764My1T2F
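To rule out the secret contents themselves, the decoded superuser password can be tested directly against the database from a throwaway psql pod. This is only a sketch: the postgres image tag is arbitrary, and the host/user/sslmode values are taken from the promscale error and secrets above.

# Pull the superuser password straight out of the secret
PGPASS=$(kubectl get secret tobs-credentials -n monitoring \
  -o jsonpath='{.data.PATRONI_SUPERUSER_PASSWORD}' | base64 -d)

# One-off client pod; removed again on exit (--rm)
kubectl run psql-test --rm -it --restart=Never --image=postgres:14 \
  --env="PGPASSWORD=${PGPASS}" -n monitoring -- \
  psql "host=tobs.monitoring.svc user=postgres dbname=postgres sslmode=require" -c 'SELECT 1;'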
kubectl describe deploy tobs-promscale -n monitoring
Name:                   tobs-promscale
Namespace:              monitoring
CreationTimestamp:      Wed, 24 Aug 2022 01:32:22 +0000
Labels:                 app=tobs-promscale
                        app.kubernetes.io/component=connector
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=tobs-promscale
                        app.kubernetes.io/version=0.13.0
                        chart=promscale-0.13.0
                        helm.toolkit.fluxcd.io/name=tobs
                        helm.toolkit.fluxcd.io/namespace=monitoring
                        heritage=Helm
                        release=tobs
Annotations:            deployment.kubernetes.io/revision: 1
                        meta.helm.sh/release-name: tobs
                        meta.helm.sh/release-namespace: monitoring
Selector:               app=tobs-promscale,release=tobs
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           Recreate
MinReadySeconds:        0
Pod Template:
  Labels:           app=tobs-promscale
                    app.kubernetes.io/component=connector
                    app.kubernetes.io/name=tobs-promscale
                    app.kubernetes.io/version=0.13.0
                    chart=promscale-0.13.0
                    heritage=Helm
                    release=tobs
  Annotations:      checksum/config: a1171a41877cc559fe699480d7c9bc731055fde6ccbe0b47e5c9a279cfe38962
                    checksum/connection: d610b61926215912316a5f9c07435dd69b06894ed8e640bbd7c2bc21c51a16fa
                    prometheus.io/path: /metrics
                    prometheus.io/port: 9201
                    prometheus.io/scrape: false
  Service Account:  tobs-promscale
  Containers:
   promscale:
    Image:       timescale/promscale:0.13.0
    Ports:       9201/TCP, 9202/TCP
    Host Ports:  0/TCP, 0/TCP
    Args:
      -config=/etc/promscale/config.yaml
      --metrics.high-availability=true
    Requests:
      cpu:        30m
      memory:     500Mi
    Readiness:    http-get http://:metrics-port/healthz delay=0s timeout=15s period=15s #success=1 #failure=3
    Environment Variables from:
      tobs-promscale  Secret  Optional: false
    Environment:
      TOBS_TELEMETRY_INSTALLED_BY:         promscale
      TOBS_TELEMETRY_VERSION:              0.13.0
      TOBS_TELEMETRY_INSTALLED_BY:         helm
      TOBS_TELEMETRY_VERSION:              0.13.0
      TOBS_TELEMETRY_TRACING_ENABLED:      true
      TOBS_TELEMETRY_TIMESCALEDB_ENABLED:  true
    Mounts:
      /etc/promscale/ from configs (rw)
  Volumes:
   configs:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tobs-promscale
    Optional:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      False   MinimumReplicasUnavailable
  Progressing    True    ReplicaSetUpdated
OldReplicaSets:  <none>
NewReplicaSet:   tobs-promscale-788c855fc5 (1/1 replicas created)
Events:
  Type    Reason             Age  From                   Message
  ----    ------             ---- ----                   -------
  Normal  ScalingReplicaSet  7m   deployment-controller  Scaled up replica set tobs-promscale-788c855fc5 to 1
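The container also reads -config=/etc/promscale/config.yaml from the tobs-promscale ConfigMap (see the Mounts above), so that file is worth dumping too, to check that nothing in it contradicts the connection secret. Assuming the data key is config.yaml, as the -config flag suggests:

# Show the rendered promscale config file
kubectl get configmap tobs-promscale -n monitoring -o jsonpath='{.data.config\.yaml}'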
kubectl get secret tobs-promscale -n monitoring -o yaml
apiVersion: v1
data:
  PROMSCALE_DB_HOST: dG9icy5tb25pdG9yaW5nLnN2Yw==
  PROMSCALE_DB_NAME: cG9zdGdyZXM=
  PROMSCALE_DB_PASSWORD: RXB5VlByYzc2NE15MVQyRg==
  PROMSCALE_DB_PORT: NTQzMg==
  PROMSCALE_DB_SSL_MODE: cmVxdWlyZQ==
  PROMSCALE_DB_USER: cG9zdGdyZXM=
kind: Secret
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"PROMSCALE_DB_HOST":"dG9icy5tb25pdG9yaW5nLnN2Yw==","PROMSCALE_DB_NAME":"cG9zdGdyZXM=","PROMSCALE_DB_PASSWORD":"RXB5VlByYzc2NE15MVQyRg==","PROMSCALE_DB_PORT":"NTQzMg==","PROMSCALE_DB_SSL_MODE":"cmVxdWlyZQ==","PROMSCALE_DB_USER":"cG9zdGdyZXM="},"kind":"Secret","metadata":{"annotations":{"meta.helm.sh/release-name":"tobs","meta.helm.sh/release-namespace":"monitoring"},"creationTimestamp":"2022-08-24T01:32:18Z","labels":{"app":"tobs-promscale","app.kubernetes.io/component":"connector","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"tobs-promscale","app.kubernetes.io/version":"0.13.0","chart":"promscale-0.13.0","helm.toolkit.fluxcd.io/name":"tobs","helm.toolkit.fluxcd.io/namespace":"monitoring","heritage":"Helm","release":"tobs"},"name":"tobs-promscale","namespace":"monitoring","resourceVersion":"22930073","uid":"eb651b96-5c5e-4c79-bfc0-64462bbd0b72"},"type":"Opaque"}
    meta.helm.sh/release-name: tobs
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2022-08-24T01:32:18Z"
  labels:
    app: tobs-promscale
    app.kubernetes.io/component: connector
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: tobs-promscale
    app.kubernetes.io/version: 0.13.0
    chart: promscale-0.13.0
    helm.toolkit.fluxcd.io/name: tobs
    helm.toolkit.fluxcd.io/namespace: monitoring
    heritage: Helm
    release: tobs
  name: tobs-promscale
  namespace: monitoring
  resourceVersion: "22930681"
  uid: eb651b96-5c5e-4c79-bfc0-64462bbd0b72
type: Opaque
echo RXB5VlByYzc2NE15MVQyRg== | base64 -d
EpyVPrc764My1T2F
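So the password promscale is given (PROMSCALE_DB_PASSWORD) decodes to the same value as the Patroni superuser password, i.e. the two secrets agree with each other even though the database rejects the login. Side by side:

# Both of these print EpyVPrc764My1T2F
kubectl get secret tobs-credentials -n monitoring \
  -o jsonpath='{.data.PATRONI_SUPERUSER_PASSWORD}' | base64 -d; echo
kubectl get secret tobs-promscale -n monitoring \
  -o jsonpath='{.data.PROMSCALE_DB_PASSWORD}' | base64 -d; echo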
Anything else we need to know?:
Installing tobs seems to be really unstable, especially with opentelemetry enabled. I've gotten it to install cleanly once or twice, but shortly thereafter it becomes unhealthy, and now it won't install at all.
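My unverified suspicion is leftover state from earlier attempts: the tobs-credentials secret carries helm.sh/resource-policy: keep, and the TimescaleDB PVCs survive an uninstall, so a reinstall could end up generating fresh secrets while the database on disk still holds an older password. Before the next attempt I plan to check for leftovers roughly like this (label selectors assumed from the labels visible on the secrets above; the delete commands are destructive and only make sense on a throwaway cluster):

# Anything left behind by a previous tobs release?
kubectl get pvc,secret -n monitoring -l release=tobs
kubectl get pvc -n monitoring -l cluster-name=tobs

# Destructive cleanup for a throwaway environment only:
# kubectl delete pvc -n monitoring -l cluster-name=tobs
# kubectl delete secret tobs-credentials -n monitoring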