Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Promethues + Thanos setup on K8s - metric retention issue #327

Open
danielstankw opened this issue Dec 30, 2024 · 0 comments
Open

Promethues + Thanos setup on K8s - metric retention issue #327

danielstankw opened this issue Dec 30, 2024 · 0 comments

Comments

@danielstankw
Copy link

danielstankw commented Dec 30, 2024

Hi,
I am running kube-prometheus-stack + kube-thanos.
The system works well, except of the following issue

If I query metric such as kube_pod_info{} I can see the values for past week or so, if I run same query but with label ex. kube_pod_info{cluster='test'} i can only see the data from the past 2 days.

If I check thanos store logs I can see that blocks are being dropped after 2 days

ts=2024-12-28T07:12:57.810458264Z caller=bucket.go:781 level=info msg="loaded new block" elapsed=160.38405ms id=01JG6114MBPFY6PJ4A5R1YTN1J
ts=2024-12-30T07:42:57.742226355Z caller=bucket.go:703 level=info msg="dropped outdated block" block=01JG6114MBPFY6PJ4A5R1YTN1J

QUESTION: How can I modify the setup to allow me to query by labels for longer than 2 days?

thanos-compactor

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/component: database-compactor
    app.kubernetes.io/instance: thanos-compact
    app.kubernetes.io/name: thanos-compact
    app.kubernetes.io/version: v0.36.1
  name: thanos-compact
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: database-compactor
      app.kubernetes.io/instance: thanos-compact
      app.kubernetes.io/name: thanos-compact
  serviceName: thanos-compact
  template:
    metadata:
      labels:
        app.kubernetes.io/component: database-compactor
        app.kubernetes.io/instance: thanos-compact
        app.kubernetes.io/name: thanos-compact
        app.kubernetes.io/version: v0.36.1
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                  - thanos-compact
                - key: app.kubernetes.io/instance
                  operator: In
                  values:
                  - thanos-compact
              namespaces:
              - monitoring
              topologyKey: kubernetes.io/hostname
            weight: 100
      containers:
      - args:
        - compact
        - --wait
        - --log.level=info
        - --log.format=logfmt
        - --objstore.config=$(OBJSTORE_CONFIG)
        - --data-dir=/var/thanos/compact
        - --debug.accept-malformed-index
        - --retention.resolution-raw=0d
        - --retention.resolution-5m=0d
        - --retention.resolution-1h=0d
        - --delete-delay=48h
        - --compact.concurrency=1
        - --downsample.concurrency=1
        - --downsampling.disable
        - --deduplication.replica-label=cluster
        - |-
          --tracing.config="config":
            "sampler_param": 2
            "sampler_type": "ratelimiting"
            "service_name": "thanos-compact"
          "type": "JAEGER"
        env:
        - name: OBJSTORE_CONFIG
          valueFrom:
            secretKeyRef:
              key: thanos-storage-config.yaml
              name: thanos-storage-config
        - name: HOST_IP_ADDRESS
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        image: quay.io/thanos/thanos:v0.36.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 4
          httpGet:
            path: /-/healthy
            port: 10902
            scheme: HTTP
          periodSeconds: 30
        name: thanos-compact
        ports:
        - containerPort: 10902
          name: http
        readinessProbe:
          failureThreshold: 20
          httpGet:
            path: /-/ready
            port: 10902
            scheme: HTTP
          periodSeconds: 5
        resources:
          limits:
            cpu: 0.42
            memory: 420Mi
          requests:
            cpu: 0.123
            memory: 123Mi
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /var/thanos/compact
          name: data
          readOnly: false
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        fsGroup: 65534
        runAsGroup: 65532
        runAsNonRoot: true
        runAsUser: 65534
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: thanos-compact
      terminationGracePeriodSeconds: 120
      volumes:
        - emptyDir: {}
          name: data

thankos-store

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/component: object-store-gateway
    app.kubernetes.io/instance: thanos-store
    app.kubernetes.io/name: thanos-store
    app.kubernetes.io/version: v0.36.1
  name: thanos-store
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: object-store-gateway
      app.kubernetes.io/instance: thanos-store
      app.kubernetes.io/name: thanos-store
  serviceName: thanos-store
  template:
    metadata:
      labels:
        app.kubernetes.io/component: object-store-gateway
        app.kubernetes.io/instance: thanos-store
        app.kubernetes.io/name: thanos-store
        app.kubernetes.io/version: v0.36.1
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                  - thanos-store
                - key: app.kubernetes.io/instance
                  operator: In
                  values:
                  - thanos-store
              namespaces:
              - monitoring
              topologyKey: kubernetes.io/hostname
            weight: 100
      containers:
      - args:
        - store
        - --log.level=info
        - --log.format=logfmt
        - --data-dir=/var/thanos/store
        - --grpc-address=0.0.0.0:10901
        - --http-address=0.0.0.0:10902
        - --objstore.config=$(OBJSTORE_CONFIG)
        - --ignore-deletion-marks-delay=168h
        env:
        - name: OBJSTORE_CONFIG
          valueFrom:
            secretKeyRef:
              key: thanos-storage-config.yaml
              name: thanos-storage-config
        - name: HOST_IP_ADDRESS
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        image: quay.io/thanos/thanos:v0.36.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            path: /-/healthy
            port: 10902
            scheme: HTTP
          periodSeconds: 30
          timeoutSeconds: 1
        name: thanos-store
        ports:
        - containerPort: 10901
          name: grpc
        - containerPort: 10902
          name: http
        readinessProbe:
          failureThreshold: 20
          httpGet:
            path: /-/ready
            port: 10902
            scheme: HTTP
          periodSeconds: 5
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsGroup: 65532
          runAsNonRoot: true
          runAsUser: 65534
          seccompProfile:
            type: RuntimeDefault
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /var/thanos/store
          name: data
          readOnly: false
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        fsGroup: 65534
        runAsGroup: 65532
        runAsNonRoot: true
        runAsUser: 65534
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: thanos-store
      terminationGracePeriodSeconds: 120
      volumes:
        - emptyDir: {}
          name: data
  # volumeClaimTemplates:
  # - metadata:
  #     labels:
  #       app.kubernetes.io/component: object-store-gateway
  #       app.kubernetes.io/instance: thanos-store
  #       app.kubernetes.io/name: thanos-store
  #     name: data
  #   spec:
  #     accessModes:
  #     - ReadWriteOnce
  #     resources:
  #       requests:
  #         storage: 10Gi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant