Customize attributes that can be used for OpenTelemetry. #178

Open
fadianzkaba opened this issue Feb 13, 2025 · 2 comments

@fadianzkaba

Hi All

I have been reviewing the telemetry for the Benthos pipeline as described here, but I was unable to find a way to extract custom attributes.

For example, I have a configuration similar to the one shown below. From my understanding, under the hood, payload.go is responsible for extracting the configuration and presenting it in the telemetry data.

Currently, the telemetry output only includes the following attributes:

  • name: output_pipeline_name
  • name: output_cloud_storage

However, I would like to extract additional details from the pipeline_name section, such as dsn and max_in_flight, and organize them under a separate custom attribute.

My goal is to ensure that these attributes are explicitly available in the telemetry data for easier monitoring and debugging. Is there a recommended approach to achieve this? If Redpanda's telemetry pipeline does not support custom attributes by default, are there any workarounds, such as modifying the payload.go implementation or using an alternative configuration method?

Any guidance or best practices on this would be greatly appreciated!

output:
  switch:
    retry_until_success: false
    strict_mode: false
    cases:
      - check: 'metadata("test123") == "abcd"'
        output:
          broker:
            copies: 1
            outputs:
              - pipeline_name:
                  dsn: destination
                  max_in_flight: 50
                  service_account: services_account_id
              - cloud_storage:
                  bucket: bucket_name
                  path: path.json
                  batching:
                    count: 100000
                    period: 5s
                    byte_size: 1000000
                    processors:
                      - bloblang: |
                          root = {}
                          root.message = content().decode("base64")
                          root.AttributeMap = metadata()
                      - archive:
                          format: json_array

And here is the trace configuration:

tracer:
  open_telemetry_collector:
    grpc: [{ address: "endpoint", secure: true }]
    tags:
      service.name: the_name_of_services
      deployment.environment: test
    sampling:
      enabled: false

@mihaitodor
Collaborator

Hey @fadianzkaba 👋

> However, I would like to extract additional details from the pipeline_name section, such as dsn and max_in_flight, and organize them under a separate custom attribute.

Those attributes are part of the config and can't be traversed dynamically via interpolation. However, you can configure any static tags under open_telemetry_collector. If you want to avoid duplication in the config, you can try leveraging YAML anchors (or even use CUE instead of YAML).
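
As a rough, untested sketch of the anchor approach, using a trimmed version of the config from the question: the anchor is defined where dsn is first set and reused as a static tag. The tag names output.pipeline_name.dsn and output.pipeline_name.max_in_flight are made up for illustration, and the sketch assumes tag values need to be strings, which is why max_in_flight is repeated as a quoted literal rather than aliased.

```yaml
output:
  broker:
    outputs:
      - pipeline_name:
          # Define the anchor at the point where the value is first used.
          dsn: &pipeline_dsn destination
          max_in_flight: 50
          service_account: services_account_id

tracer:
  open_telemetry_collector:
    grpc: [{ address: "endpoint", secure: true }]
    tags:
      service.name: the_name_of_services
      deployment.environment: test
      # Hypothetical tag names, prefixed with the output name so that
      # several outputs can each carry their own dsn / max_in_flight.
      output.pipeline_name.dsn: *pipeline_dsn
      # Assumption: tags are string-valued, so the integer is duplicated
      # here as a quoted string instead of aliased.
      output.pipeline_name.max_in_flight: "50"
```

The alias has to come after the anchor in the file, so this only works if the output section is placed above the tracer section.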

> are there any workarounds, such as modifying the payload.go implementation or using an alternative configuration method?

It's not really clear to me what this functionality needs to look like. How would the tracer know which max_in_flight field to emit if you have multiple outputs configured?

Note that the https://github.com/redpanda-data/connect/tree/main/internal/telemetry package is used for sending telemetry information to Redpanda. The open_telemetry_collector is implemented here and calls into the Benthos framework.

@fadianzkaba
Author

Thank you @mihaitodor for your reply.

I've added a couple of fields under open_telemetry_collector, such as deployment.env and service.namespace. However, any field added under tags only appears on the root span, not on the child spans or the attributes emitted during execution.

I've had a good look at the repo https://github.com/redpanda-data/benthos, but I can't see that it uses the telemetry package in Redpanda Connect (https://github.com/redpanda-data/connect/tree/main/internal/telemetry). I think it uses https://github.com/redpanda-data/benthos/tree/4cd5b09eff3eb816003c69323bbc54942883a1a3/internal/tracing instead.

I still need to dig further into how to pass or add a custom field in the trace.
