[1.17] Add support for listener level warnings #10490

Merged 2 commits on Dec 17, 2024

davidjumani

Description

Backport of #10458

Adds support for listener-level warnings. When a listener or one of its plugins returns an error, the error is now checked to see whether it is a configuration error that can be treated as a warning, and it is processed accordingly.

API changes

Added the warnings field to HttpListenerReport and TcpListenerReport.

Context

This is introduced to resolve the issue "Upstream not found when configuring opentelemetry collector should be a warning, not an error".
TL;DR: when the upstream referenced by a tracing collector is not found, an error (Invalid Destination) is reported instead of a warning.

Testing steps

Apply the following resources:

kubectl apply -f- << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-agent-conf
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-agent-conf
data:
  otel-agent-config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    exporters:
      otlp:
        endpoint: "otel-collector.default:4317"
        tls:
          insecure: true
        sending_queue:
          num_consumers: 4
          queue_size: 100
        retry_on_failure:
          enabled: true
    processors:
      batch:
      memory_limiter:
        # 80% of maximum memory up to 2G
        limit_mib: 400
        # 25% of limit up to 2G
        spike_limit_mib: 100
        check_interval: 5s
    extensions:
      zpages: {}
      memory_ballast:
        # Memory Ballast size should be max 1/3 to 1/2 of memory.
        size_mib: 165
    service:
      extensions: [zpages, memory_ballast]
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlp]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-agent
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-agent
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-agent
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-agent
    spec:
      containers:
      - command:
          - "/otelcol"
          - "--config=/conf/otel-agent-config.yaml"
        image: otel/opentelemetry-collector:0.68.0
        name: otel-agent
        resources:
          limits:
            cpu: 500m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 55679 # ZPages endpoint.
        - containerPort: 4317 # Default OpenTelemetry receiver port.
        - containerPort: 8888  # Metrics.
        volumeMounts:
        - name: otel-agent-config-vol
          mountPath: /conf
      volumes:
        - configMap:
            name: otel-agent-conf
            items:
              - key: otel-agent-config
                path: otel-agent-config.yaml
          name: otel-agent-config-vol
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-collector-conf
data:
  otel-collector-config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
      memory_limiter:
        # 80% of maximum memory up to 2G
        limit_mib: 1500
        # 25% of limit up to 2G
        spike_limit_mib: 512
        check_interval: 5s
    extensions:
      zpages: {}
      memory_ballast:
        # Memory Ballast size should be max 1/3 to 1/2 of memory.
        size_mib: 683
    exporters:
      logging:
        loglevel: debug
      zipkin:
        endpoint: "http://zipkin:9411/api/v2/spans"
        tls:
          insecure: true
    service:
      extensions: [zpages, memory_ballast]
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging]
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging, zipkin]
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  ports:
  - name: otlp-grpc # Default endpoint for OpenTelemetry gRPC receiver.
    port: 4317
    protocol: TCP
    targetPort: 4317
  - name: otlp-http # Default endpoint for OpenTelemetry HTTP receiver.
    port: 4318
    protocol: TCP
    targetPort: 4318
  - name: metrics # Default endpoint for querying metrics.
    port: 8888
  selector:
    component: otel-collector
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: gloo-system
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  minReadySeconds: 5
  progressDeadlineSeconds: 120
  replicas: 1 #TODO - adjust this to your own requirements
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      containers:
      - command:
          - "/otelcol"
          - "--config=/conf/otel-collector-config.yaml"
        image: otel/opentelemetry-collector:0.68.0
        name: otel-collector
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 200m
            memory: 400Mi
        ports: # Comment out ports for platforms as needed.
        - containerPort: 55679 # Default endpoint for ZPages.
        - containerPort: 4317 # Default endpoint for OpenTelemetry receiver.
        - containerPort: 14250 # Default endpoint for Jaeger gRPC receiver.
        - containerPort: 14268 # Default endpoint for Jaeger HTTP receiver.
        - containerPort: 9411 # Default endpoint for Zipkin receiver.
        - containerPort: 8888  # Default endpoint for querying metrics.
        volumeMounts:
        - name: otel-collector-config-vol
          mountPath: /conf
      volumes:
        - configMap:
            name: otel-collector-conf
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
          name: otel-collector-config-vol
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: echo-server
  name: echo-server
  namespace: gloo-system
spec:
  selector:
    matchLabels:
      app: echo-server
  replicas: 1
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
      # remember to change port to 80 and disable http2 if using httpbin
      # - image: kennethreitz/httpbin
      - image: jmalloc/echo-server
        name: echo-server
        env:
        - name: LOG_HTTP_HEADERS
          value: "true"
        - name: LOG_HTTP_BODY
          value: "true"
        ports:
        - containerPort: 8080
          name: grpc
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server
  namespace: gloo-system
  labels:
    service: echo-server
spec:
  ports:
  - port: 8080
    # targetPort: 80
    protocol: TCP
  selector:
    app: echo-server
---
# gloo resources
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: echo-server
  namespace: gloo-system
spec:
  useHttp2: true
  static:
    hosts:
    - addr: echo-server
      port: 8080
---
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: "opentelemetry-collector"
  namespace: gloo-system
spec:
  # REQUIRED FOR OPENTELEMETRY COLLECTION
  useHttp2: true
  kube:
    # selector:
    serviceName: otel-collector
    serviceNamespace: gloo-system
    servicePort: 4317
---
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: default
  namespace: gloo-system
spec:
  virtualHost:
    domains:
    - '*'
    options:
      stagedTransformations:
        regular:
          requestTransforms:
            - requestTransformation:
                transformationTemplate:
                  headers:
                    test_header:
                      text: '{{header("ABCD")}}'
                  spanTransformer:
                    name:
                      text: '{{header("Host")}}'
    routes:
    - matchers:
       - prefix: /route1
      routeAction:
        single:
          upstream:
            name: echo-server
            namespace: gloo-system
      options:
        autoHostRewrite: true
    - matchers:
       - prefix: /route2
      routeAction:
        single:
          upstream:
            name: echo-server
            namespace: gloo-system
      options:
        autoHostRewrite: true
    - matchers:
       - prefix: /
      routeAction:
        single:
          upstream:
            name: echo-server
            namespace: gloo-system
      options:
        autoHostRewrite: true
EOF

Now create a gateway that references a nonexistent OpenTelemetry collector upstream:

kubectl apply -f- << EOF
apiVersion: gateway.solo.io/v1
kind: Gateway
metadata:
  labels:
    app: gloo
  name: gateway-proxy
  namespace: gloo-system
spec:
  bindAddress: '::'
  bindPort: 8080
  httpGateway:
    options:
      httpConnectionManagerSettings:
        tracing:
          openTelemetryConfig:
            collectorUpstreamRef:
              namespace: gloo-system
              name: opentelemetry-collectorv2
EOF

After this fix, the gateway should be accepted with a warning:

kubectl -n gloo-system get gateway.gateway.solo.io/gateway-proxy -o yaml
apiVersion: gateway.solo.io/v1
kind: Gateway
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"gateway.solo.io/v1","kind":"Gateway","metadata":{"annotations":{},"labels":{"app":"gloo"},"name":"gateway-proxy","namespace":"gloo-system"},"spec":{"bindAddress":"::","bindPort":8080,"httpGateway":{"options":{"httpConnectionManagerSettings":{"tracing":{"openTelemetryConfig":{"collectorUpstreamRef":{"name":"opentelemetry-collectorv2","namespace":"gloo-system"}}}}}}}}
  creationTimestamp: "2024-12-09T13:40:55Z"
  generation: 7
  labels:
    app: gloo
  name: gateway-proxy
  namespace: gloo-system
  resourceVersion: "53410"
  uid: da348bab-0334-4065-9ec5-4c6f5ac5d7d1
spec:
  bindAddress: '::'
  bindPort: 8080
  httpGateway:
    options:
      httpConnectionManagerSettings:
        tracing:
          openTelemetryConfig:
            collectorUpstreamRef:
              name: opentelemetry-collectorv2
              namespace: gloo-system
status:
  statuses:   # << See the warning here
    gloo-system:
      reason: "warning: \n  HttpListener Warning: InvalidDestinationWarning. Reason:
        *v1.Upstream { gloo-system.opentelemetry-collectorv2 } not found"
      reportedBy: gloo
      state: Warning
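
The status above can be read as the outcome of a simple precedence rule. The sketch below is a minimal illustration of such a rule, assuming that any error rejects the resource and that warnings alone yield the Warning state; the type and function names (gatewayReport, deriveState) are hypothetical and do not come from the Gloo codebase.

```go
package main

import "fmt"

// gatewayReport is a hypothetical summary of what validation found.
type gatewayReport struct {
	Errors   []string
	Warnings []string
}

// deriveState maps a report to a status state.
// Assumption for this sketch: errors take precedence over warnings.
func deriveState(r gatewayReport) string {
	switch {
	case len(r.Errors) > 0:
		return "Rejected"
	case len(r.Warnings) > 0:
		return "Warning"
	default:
		return "Accepted"
	}
}

func main() {
	r := gatewayReport{
		Warnings: []string{"*v1.Upstream { gloo-system.opentelemetry-collectorv2 } not found"},
	}
	fmt.Println(deriveState(r)) // prints "Warning"
}
```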

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

Co-authored-by: changelog-bot <changelog-bot>
Co-authored-by: Sam Heilbron <[email protected]>
@davidjumani davidjumani requested a review from a team as a code owner December 16, 2024 12:53
@solo-changelog-bot

Issues linked to changelog:
kgateway-dev#10293

@davidjumani davidjumani changed the title from "[1.17] Add support for listener level warnings (#10458)" to "[1.17] Add support for listener level warnings" on Dec 16, 2024
@soloio-bulldozer soloio-bulldozer bot merged commit 64950d8 into v1.17.x Dec 17, 2024
17 checks passed
@soloio-bulldozer soloio-bulldozer bot deleted the allow-listener-warnings-117 branch December 17, 2024 16:05