

Deploy Alloy with the pyroscope.ebpf component without privilege #1273

Closed
luweglarz opened this issue Jul 12, 2024 · 1 comment
Labels
bug (Something isn't working), frozen-due-to-age

Comments

@luweglarz

What's wrong?

Hello,

I am trying to deploy Alloy with the eBPF component on an OpenShift environment without root access, using only Linux capabilities. I have tried it locally with docker-compose and it works; however, when I try it in my OpenShift environment the component doesn't work properly.

The component is reported as healthy on the Alloy dashboard, and when I use bpftool I can see that the eBPF programs are loaded, as well as the maps; however, the stack_trace map is always empty.
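For anyone reproducing, the loaded programs and maps can be inspected with bpftool from inside the container (or on the node); the map id below is a placeholder you take from the list output:

```shell
# List loaded eBPF programs and maps
bpftool prog list
bpftool map list

# Dump a map by id to check whether it contains any entries
bpftool map dump id <id>
```

These commands require the BPF-related capabilities (or root) to succeed, which is why they work here even though the profiler itself produces no data.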

Note that it works when I set `privileged` to true, so I assume I am missing something in the capabilities, but I don't know what.

Thanks a lot!

Steps to reproduce

Deploy Alloy with the pyroscope.ebpf component without privilege, only using capabilities based on this discussion.

System information

Linux 5.14.0-284.50.1.el9_2.x86_64

Software version

Grafana Alloy v1.2.0

Configuration

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-agent-config-pyroscope
  namespace: pyroscope
  labels:
    helm.sh/chart: pyroscope-1.6.0
    app.kubernetes.io/name: pyroscope
    app.kubernetes.io/instance: pyroscope-dev
    app.kubernetes.io/version: "1.6.0"
    app.kubernetes.io/managed-by: Helm
data:
  config.river: |
    logging {
      level  = "info"
      format = "logfmt"
    }

    discovery.kubernetes "local_pods" {
        selectors {
          field = "spec.nodeName=" + env("HOSTNAME")
          role = "pod"
        }
        role = "pod"
      }

    pyroscope.ebpf "instance" {
      forward_to = [pyroscope.write.pyroscope_write.receiver]
      targets = discovery.kubernetes.local_pods.targets
    }

    pyroscope.write "pyroscope_write" {
      endpoint {
        url = "http://pyroscope-dev.pyroscope.svc.cluster.local:4040"
      }
    }

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/instance: pyroscope-dev
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: agent
    app.kubernetes.io/version: v0.36.2
    helm.sh/chart: agent-0.25.0
  name: alloy-dev-agent
  namespace: pyroscope
spec:
  minReadySeconds: 10
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Retain
  podManagementPolicy: Parallel
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: pyroscope-dev
      app.kubernetes.io/name: agent
  serviceName: ""
  template:
    metadata:
      annotations:
        profiles.grafana.com/cpu.port_name: http-metrics
        profiles.grafana.com/cpu.scrape: "true"
        profiles.grafana.com/goroutine.port_name: http-metrics
        profiles.grafana.com/goroutine.scrape: "true"
        profiles.grafana.com/memory.port_name: http-metrics
        profiles.grafana.com/memory.scrape: "true"
      labels:
        app.kubernetes.io/instance: pyroscope-dev
        app.kubernetes.io/name: agent
    spec:
      containers:
      - command:
          - /bin/sh
          - -c
        args:
          - while true; do
            /bin/alloy
            run
            /etc/agent/config.river
            --storage.path=/tmp/agent
            --server.http.listen-addr=:8080
            --stability.level=public-preview;
            sleep 1;
            done
        env:
            - name: ALLOY_DEPLOY_MODE
              value: "helm"
            - name: HOSTNAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
        image: grafana/alloy:v1.2.0
        imagePullPolicy: IfNotPresent
        name: alloy
        ports:
        - containerPort: 80
          name: http-metrics
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /-/ready
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        securityContext:
          capabilities:
            add:
              - SYS_ADMIN
              - BPF
              - SYS_RESOURCE
          seLinuxOptions:
            type: pyroscope-selinux-profile_pyroscope.process
            level: s0:c33,c22
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/agent
          name: config
      - command:
          - /bin/sh
          - -c
        args:
          - while true; do
            /configmap-reload
            --volume-dir=/etc/agent
            --webhook-url=http://localhost:8080/-/reload;
            sleep 1;
            done
        image: jimmidyson/configmap-reload:v0.8.0
        imagePullPolicy: IfNotPresent
        name: config-reloader
        resources:
          requests:
            cpu: 1m
            memory: 5Mi
        securityContext:
          seLinuxOptions:
            type: container_t
            level: s0:c33,c22
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/agent
          name: config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: pyroscope-dev-agent
      serviceAccountName: pyroscope-dev-agent
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: grafana-agent-config-pyroscope
        name: config
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
---
apiVersion: security-profiles-operator.x-k8s.io/v1alpha2
kind: SelinuxProfile
metadata:
  name: pyroscope-selinux-profile
  namespace: pyroscope
spec:
  allow:
    '@self':
      bpf:
        - map_create
        - map_read
        - map_write
        - prog_load
        - prog_run
      perf_event:
        - open
        - kernel
        - cpu
        - read
        - write
      tcp_socket:
        - listen
    node_t:
      tcp_socket:
      - node_bind
    http_cache_port_t:
      tcp_socket:
        - name_bind
        - name_connect
    http_port_t:
      tcp_socket:
        - name_connect
    unreserved_port_t:
      tcp_socket:
        - name_connect
  inherit:
  - kind: System
    name: container
---
allowHostDirVolumePlugin: false
allowHostIPC: true
allowHostNetwork: false
allowHostPID: true
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: false
allowedCapabilities:
  - '*'
# allowedCapabilities:
#   - BPF
#   - SYS_ADMIN
#   - SYS_RESOURCE
#   - SYS_PTRACE
#   - DAC_READ_SEARCH
#   - NET_ADMIN
#   - NET_RAW
#   - PERFMON
#   - IPC_LOCK
apiVersion: security.openshift.io/v1
defaultAddCapabilities: null
fsGroup:
  type: RunAsAny
groups:
- system:cluster-admins
kind: SecurityContextConstraints
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    kubernetes.io/description: copy of anyuid for the pyroscope test
    release.openshift.io/create-only: "true"
  name: pyroscope
priority: 10
readOnlyRootFilesystem: false
requiredDropCapabilities: []
# - MKNOD
runAsUser:
  type: RunAsAny
seccompProfiles:
  - '*'
seLinuxContext:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
users: []
volumes:
- configMap
- csi
- downwardAPI
- emptyDir
- ephemeral
- persistentVolumeClaim
- projected
- secret
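For completeness, an SCC like the one above only takes effect once it is bound to the workload's service account; on OpenShift that is typically done with `oc adm policy` (the SCC and service-account names below are taken from the manifests in this issue and may differ in your cluster):

```shell
# Grant the custom "pyroscope" SCC to the agent's service account
oc adm policy add-scc-to-user pyroscope \
  -z pyroscope-dev-agent \
  -n pyroscope
```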

Logs

ts=2024-07-12T09:56:15.50111734Z level=info "boringcrypto enabled"=false
ts=2024-07-12T09:56:15.499882118Z level=info source=/go/pkg/mod/github.com/!kim!machine!gun/[email protected]/memlimit/memlimit.go:176 msg="GOMEMLIMIT is updated" package=github.com/KimMachineGun/automemlimit/memlimit GOMEMLIMIT=188743680
ts=2024-07-12T09:56:15.501168441Z level=info msg="running usage stats reporter"
ts=2024-07-12T09:56:15.501172141Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30
ts=2024-07-12T09:56:15.501182241Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=tracing duration=5.101µs
ts=2024-07-12T09:56:15.501190441Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=logging duration=100.701µs
ts=2024-07-12T09:56:15.501209242Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=livedebugging duration=10.3µs
ts=2024-07-12T09:56:15.501220242Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=labelstore duration=2.7µs
ts=2024-07-12T09:56:15.501230442Z level=info msg="applying non-TLS config to HTTP server" service=http
ts=2024-07-12T09:56:15.501235542Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=http duration=8.5µs
ts=2024-07-12T09:56:15.501243242Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=cluster duration=600ns
ts=2024-07-12T09:56:15.501250442Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=ui duration=500ns
ts=2024-07-12T09:56:15.501537548Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=pyroscope.write.pyroscope_write duration=279.705µs
ts=2024-07-12T09:56:15.501969455Z level=info msg="Using pod service account via in-cluster config" component_path=/ component_id=discovery.kubernetes.local_pods
ts=2024-07-12T09:56:15.502356362Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=discovery.kubernetes.local_pods duration=792.914µs
ts=2024-07-12T09:56:15.503107276Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=pyroscope.ebpf.instance duration=727.113µs
ts=2024-07-12T09:56:15.503172377Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=remotecfg duration=41.201µs
ts=2024-07-12T09:56:15.503189578Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 node_id=otel duration=3.701µs
ts=2024-07-12T09:56:15.503197478Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id="" trace_id=7dec913abce7834389a915045582af30 duration=2.249641ms
ts=2024-07-12T09:56:15.50332678Z level=info msg="scheduling loaded components and services"
ts=2024-07-12T09:56:15.503608885Z level=info msg="starting cluster node" peers="" advertise_addr=127.0.0.1:8080
ts=2024-07-12T09:56:15.504737506Z level=info msg="now listening for http traffic" service=http addr=:8080
ts=2024-07-12T09:56:15.504884508Z level=info msg="peers changed" new_peers=alloy-dev-agent-0
@luweglarz luweglarz added the bug Something isn't working label Jul 12, 2024
@luweglarz

The issue came from the configuration of my containers: you need to share the host's PID namespace with the agent's container (set `hostPID: true` on the pod spec) to fix this.
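Against the StatefulSet above, the fix is a one-line addition to the pod template (a minimal sketch; the surrounding fields are unchanged):

```yaml
spec:
  template:
    spec:
      # Share the host's PID namespace so pyroscope.ebpf can see and
      # resolve processes running outside the container.
      hostPID: true
```

Note that the SCC in the configuration already sets `allowHostPID: true`, so no change to the SCC is needed for this.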

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 17, 2024