Add support to set otel fields from the event body based on keys. Sig… #8644

Merged
merged 2 commits into from
Apr 10, 2024

@cb645j
Contributor

cb645j commented Mar 27, 2024

…ned-off-by: Cory Boslet [email protected]

This adds support for setting additional OTLP log fields from properties in the log event's body/message, looked up by configurable keys.
It is related to #8359 and #8552.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • N/A Run local packaging test showing all targets (including any new ones) build.
  • N/A Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • N/A Backport to latest stable release.

Debug log output

stdout:
[0] kubernetes.local: [[1711471800.517000000, {}], {"timestamp"=>"2024-03-26 11:50:00.517", "loglevel"=>"INFO", "trace_id"=>"95e1d11ece6460e7d00c61d45cc195ff", "span_id"=>"11aafe22712ca02c", "message"=>"A log message"}]

Outgoing OpenTelemetry message:
see @sudomateo's comment below

Example Configuration

    [INPUT]
        Name   dummy
        Dummy  {"message": "A log message", "span_id": "11aafe22712ca02c", "trace_id": "95e1d11ece6460e7d00c61d45cc195ff", "loglevel": "INFO"}

    [OUTPUT]
        Name stdout
        Log_Level trace
        Match *

    [OUTPUT]
        Name opentelemetry
        Match *
        Log_Level trace
        Host ingest.frontend.com
        Port 443
        Log_response_payload True
        logs_body_key $message
        logs_span_id_message_key span_id
        logs_trace_id_message_key trace_id
        logs_severity_text_message_key loglevel
        Tls                  On
        Tls.verify           Off

Documentation

| Key | Description | Default |
| --- | --- | --- |
| logs_span_id_message_key | Specify a Span Id key to look up in the log event's body/message | SpanId |
| logs_trace_id_message_key | Specify a Trace Id key to look up in the log event's body/message | TraceId |
| logs_severity_text_message_key | Specify a Severity Text key to look up in the log event's body/message | SeverityText |
| logs_severity_number_message_key | Specify a Severity Number key to look up in the log event's body/message | SeverityNumber |
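
As a rough illustration of what these options do, the sketch below looks up each configured key in an event body and copies the value into the matching log record field. It is not the plugin's implementation (which is written in C); `build_log_record`, `OPTIONS`, and the dict-based record layout are names invented for this example, and the fallback keys are simply taken from the Default column above.

```python
# Hypothetical illustration only; the real logic lives in Fluent Bit's C code.
# Each entry: config option -> (default body key, OTLP/JSON record field).
OPTIONS = {
    "logs_span_id_message_key": ("SpanId", "spanId"),
    "logs_trace_id_message_key": ("TraceId", "traceId"),
    "logs_severity_text_message_key": ("SeverityText", "severityText"),
    "logs_severity_number_message_key": ("SeverityNumber", "severityNumber"),
}


def build_log_record(body, config, body_key="message"):
    """Build a dict shaped like an OTLP/JSON log record from an event body."""
    record = {"body": {"stringValue": str(body.get(body_key, ""))}}
    for option, (default_key, field) in OPTIONS.items():
        # Use the configured key, falling back to the documented default.
        key = config.get(option, default_key)
        # Keys that are absent from the body leave the field unset.
        if key in body:
            record[field] = body[key]
    return record


# Sample event and options taken from the example configuration above.
event = {
    "message": "A log message",
    "span_id": "11aafe22712ca02c",
    "trace_id": "95e1d11ece6460e7d00c61d45cc195ff",
    "loglevel": "INFO",
}
config = {
    "logs_span_id_message_key": "span_id",
    "logs_trace_id_message_key": "trace_id",
    "logs_severity_text_message_key": "loglevel",
}
record = build_log_record(event, config)
print(record)
```

In this sketch the severity number is left unset because the sample event has no `SeverityNumber` key and no alternative key is configured.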

OTel Log Data Model fields now supported:

  • Timestamp (already supported)
  • ObservedTimestamp
  • TraceId
  • SpanId
  • TraceFlags
  • SeverityText
  • SeverityNumber
  • Resource
  • InstrumentationScope
  • Attributes (plan to add on separate pr)

Valgrind

valgrind --leak-check=yes fluent-bit --config fluent-bit-config.yaml
==16653== Memcheck, a memory error detector
==16653== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16653== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==16653== Command: ./projects/scratch/fluent-bit/build/bin/fluent-bit --config fluent-bit-config.yaml
==16653==

[2024/03/28 17:39:51] [ info] [fluent bit] version=3.0.1, commit=02b1a89090, pid=16653
[2024/03/28 17:39:51] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/03/28 17:39:51] [ info] [cmetrics] version=0.7.0
[2024/03/28 17:39:51] [ info] [output:stdout:stdout.0] worker #0 started
[2024/03/28 17:39:51] [ info] [ctraces ] version=0.4.0
[2024/03/28 17:39:51] [ info] [input:dummy:dummy.0] initializing
[2024/03/28 17:39:51] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/03/28 17:39:51] [ info] [sp] stream processor started
[{"date":1711647592.169228,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647593.161132,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647594.156564,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647595.153959,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647596.163722,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647597.154836,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647598.153678,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647599.152517,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
^C[2024/03/28 17:40:01] [engine] caught signal (SIGINT)
[2024/03/28 17:40:01] [ warn] [engine] service will shutdown in max 5 seconds
[{"date":1711647600.155777,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[2024/03/28 17:40:01] [ info] [input] pausing dummy.0
[{"date":1711647601.156709,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[2024/03/28 17:40:02] [ info] [engine] service has stopped (0 pending tasks)
[2024/03/28 17:40:02] [ info] [input] pausing dummy.0
[2024/03/28 17:40:02] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/03/28 17:40:02] [ info] [output:stdout:stdout.0] thread worker #0 stopped
==16653==
==16653== HEAP SUMMARY:
==16653== in use at exit: 0 bytes in 0 blocks
==16653== total heap usage: 2,053 allocs, 2,053 frees, 5,611,479 bytes allocated
==16653==
==16653== All heap blocks were freed -- no leaks are possible
==16653==
==16653== For lists of detected and suppressed errors, rerun with: -s
==16653== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@cb645j
Contributor Author

cb645j commented Mar 27, 2024

@nokute78 @sudomateo @edsiper

@sudomateo
Contributor

Thank you for working on this. I was able to compile this branch locally and test it. Here are my results.

I started OpenTelemetry and Fluent Bit with the following respective configurations.

otel-config.yaml
---
receivers:
  otlp:
    protocols:
      grpc:
      http:
        endpoint: "0.0.0.0:4318"

processors:
  batch:

exporters:
  file:
    path: otel-output.json

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    logs:
      receivers:
        - otlp
      processors:
        - batch
      exporters:
        - file

fluent-bit-config.yaml
---
pipeline:
  inputs:
    - name: dummy
      tag: dummy
      metadata: |
        {
          "resource-attr": "resource-attr-val-1"
        }
      # This log record is taken directly from OpenTelemetry's Examples page.
      # https://opentelemetry.io/docs/specs/otel/protocol/file-exporter/#examples
      dummy: |
        {
          "severityNumber": 9,
          "severityText": "Info",
          "name": "logA",
          "message": "This is a log message",
          "app": "server",
          "instance_num": 1,
          "traceId": "08040201000000000000000000000000",
          "spanId": "0102040800000000"
        }

  outputs:
    - name: opentelemetry
      match: "*"
      host: localhost
      port: 4318
      logs_body_key: $message
      logs_span_id_message_key: spanId
      logs_trace_id_message_key: traceId

    # For live debugging on the Fluent Bit side.
    - name: stdout
      match: "*"
      format: json

Here's what I received in OpenTelemetry. The span ID and trace ID are successfully populated.

actual-logs.json
{
  "resourceLogs": [
    {
      "resource": {},
      "scopeLogs": [
        {
          "scope": {},
          "logRecords": [
            {
              "timeUnixNano": "1711562977179253000",
              "body": {
                "stringValue": "This is a log message"
              },
              "attributes": [
                {
                  "key": "resource-attr",
                  "value": {
                    "stringValue": "resource-attr-val-1"
                  }
                }
              ],
              "traceId": "08040201000000000000000000000000",
              "spanId": "0102040800000000"
            }
          ]
        }
      ]
    }
  ]
}
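
For readers checking output like this by hand: in OTLP/JSON the trace ID is a hex string for 16 bytes and the span ID a hex string for 8 bytes, and the OpenTelemetry specification treats an all-zero ID as invalid. The sketch below validates the two values shown above under those rules. `parse_hex_id` is a hypothetical helper written for this example; the thread does not say whether the plugin performs exactly these checks.

```python
def parse_hex_id(value, expected_bytes):
    """Decode a hex ID string; return None when it is not a valid ID."""
    # Two hex characters per byte; this also rejects embedded whitespace,
    # which bytes.fromhex would otherwise skip silently.
    if not isinstance(value, str) or len(value) != expected_bytes * 2:
        return None
    try:
        raw = bytes.fromhex(value)
    except ValueError:
        return None
    if len(raw) != expected_bytes:
        return None
    # An ID must contain at least one non-zero byte.
    if raw == bytes(expected_bytes):
        return None
    return raw


trace_id = parse_hex_id("08040201000000000000000000000000", 16)
span_id = parse_hex_id("0102040800000000", 8)
print(trace_id is not None, span_id is not None)
```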

@cb645j
Contributor Author

cb645j commented Mar 27, 2024

Thank you @sudomateo for testing. I'm glad it's working for you as well. I also added support for extracting and setting severity_text and severity_number, if you would like to send those too. I would also like to be able to set attributes based on a list of keys the user configures; however, that work will be done in a separate PR. For now, if you would like to set attributes, you can use the "logs_body_key_attributes true" config that @edsiper added last month. The only downside of the existing logs_body_key_attributes is that it copies all keys into the attributes field, so sending only selected fields/keys as attributes is currently not possible. That's why I think it would be good to have a logs_severity_text_attributes_keys configuration at some point, but like I said, that will be done in a separate PR.

If you have time, can you please run Valgrind and attach output showing no memory leaks? I'm not able to do this on Windows.

@sudomateo
Contributor

I tested logs_severity_text_message_key with the same input and received the following warning.

[2024/03/27 20:35:16] [ warn] [output:opentelemetry:opentelemetry.0] Unable to process severityText. Invalid Severity Text.

It appears the is_valid_severity_text function needs to be updated to be case-insensitive.
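
A minimal sketch of the case-insensitive check being suggested, written in Python for brevity rather than the plugin's C. The accepted set is an assumption: it uses the six short names from the OpenTelemetry log data model (TRACE, DEBUG, INFO, WARN, ERROR, FATAL), and the list the plugin actually accepts may differ.

```python
# Assumed set of accepted names; the plugin's real list is not shown in this thread.
SEVERITY_TEXTS = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")


def is_valid_severity_text(text):
    """Return True when text names a known severity, ignoring letter case."""
    # Normalising to upper case lets "Info", "info" and "INFO" all match.
    return isinstance(text, str) and text.upper() in SEVERITY_TEXTS


for candidate in ("Info", "INFO", "info", "verbose"):
    print(candidate, is_valid_severity_text(candidate))
```

With a check like this, the "Info" value from the test input above would be accepted instead of triggering the warning.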

I didn't have time to use valgrind, sorry!

@cb645j
Contributor Author

cb645j commented Mar 28, 2024

Thanks @sudomateo. I agree that it should be case-insensitive. I will update it.

@robododge

Hi,

Thanks for putting in this OpenTelemetry change. I really need this as well. It looks like the Valgrind output is still missing. I don't know how to set up and use Valgrind for this project, but I hope someone else does.

@sudomateo
Contributor

Here's a quick valgrind report.

valgrind --leak-check=yes fluent-bit --config fluent-bit-config.yaml
==16653== Memcheck, a memory error detector
==16653== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16653== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==16653== Command: ./projects/scratch/fluent-bit/build/bin/fluent-bit --config fluent-bit-config.yaml
==16653==
Fluent Bit v3.0.1
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

___________.__                        __    __________.__  __          ________
\_   _____/|  |  __ __   ____   _____/  |_  \______   \__|/  |_  ___  _\_____  \
 |    __)  |  | |  |  \_/ __ \ /    \   __\  |    |  _/  \   __\ \  \/ / _(__  <
 |     \   |  |_|  |  /\  ___/|   |  \  |    |    |   \  ||  |    \   / /       \
 \___  /   |____/____/  \___  >___|  /__|    |______  /__||__|     \_/ /______  /
     \/                     \/     \/               \/                        \/

[2024/03/28 17:39:51] [ info] [fluent bit] version=3.0.1, commit=02b1a89090, pid=16653
[2024/03/28 17:39:51] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/03/28 17:39:51] [ info] [cmetrics] version=0.7.0
[2024/03/28 17:39:51] [ info] [output:stdout:stdout.0] worker #0 started
[2024/03/28 17:39:51] [ info] [ctraces ] version=0.4.0
[2024/03/28 17:39:51] [ info] [input:dummy:dummy.0] initializing
[2024/03/28 17:39:51] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/03/28 17:39:51] [ info] [sp] stream processor started
[{"date":1711647592.169228,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647593.161132,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647594.156564,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647595.153959,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647596.163722,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647597.154836,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647598.153678,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[{"date":1711647599.152517,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
^C[2024/03/28 17:40:01] [engine] caught signal (SIGINT)
[2024/03/28 17:40:01] [ warn] [engine] service will shutdown in max 5 seconds
[{"date":1711647600.155777,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[2024/03/28 17:40:01] [ info] [input] pausing dummy.0
[{"date":1711647601.156709,"severityNumber":9,"severityText":"Info","name":"logA","message":"This is a log message","app":"server","instance_num":1,"traceId":"08040201000000000000000000000000","spanId":"0102040800000000"}]
[2024/03/28 17:40:02] [ info] [engine] service has stopped (0 pending tasks)
[2024/03/28 17:40:02] [ info] [input] pausing dummy.0
[2024/03/28 17:40:02] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/03/28 17:40:02] [ info] [output:stdout:stdout.0] thread worker #0 stopped
==16653==
==16653== HEAP SUMMARY:
==16653==     in use at exit: 0 bytes in 0 blocks
==16653==   total heap usage: 2,053 allocs, 2,053 frees, 5,611,479 bytes allocated
==16653==
==16653== All heap blocks were freed -- no leaks are possible
==16653==
==16653== For lists of detected and suppressed errors, rerun with: -s
==16653== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@cb645j
Contributor Author

cb645j commented Mar 28, 2024

Thank you @sudomateo

@cb645j
Contributor Author

cb645j commented Mar 28, 2024

@edsiper @leonardo-albertovich @fujimotos @koleini

Can someone please review this? Let me know if there is anything else I need to do as far as checks go.

@patrick-stephens
Contributor

Can you link a docs PR for this?

@cb645j
Contributor Author

cb645j commented Mar 29, 2024

@patrick-stephens

Here is the documentation PR. I also added docs for @edsiper's change, as it looks like no doc update was done for his new field.

fluent/fluent-bit-docs#1348

@patrick-stephens
Contributor

Thanks, it looks like you have a few issues compiling for older targets as well. Loop definitions are the usual culprit but have a look.

You can build locally using ./packaging/build.sh -d centos/7.

@cb645j cb645j force-pushed the out_otel_suppoprt_body_keys branch from 02b1a89 to 240f096 Compare March 29, 2024 17:36
Signed-off-by: Boslet, Cory (cb645j) <[email protected]>
@cb645j
Contributor Author

cb645j commented Mar 29, 2024

@patrick-stephens Thanks. I just moved it outside the loop. That should fix it.

@cb645j
Contributor Author

cb645j commented Apr 9, 2024

@patrick-stephens can you please rerun the checks?

@cb645j
Contributor Author

cb645j commented Apr 9, 2024

Done. Can you please rerun the checks?

@edsiper
Member

edsiper commented Apr 9, 2024

Thanks for this contribution, and thanks to all for the general testing. Running CI, let's try to get this in for v3.0.2.

@edsiper edsiper merged commit a39caf4 into fluent:master Apr 10, 2024
74 of 76 checks passed
@edsiper
Member

edsiper commented Apr 10, 2024

thanks

@cb645j
Contributor Author

cb645j commented Apr 10, 2024

no problem, thanks

@AzureMarker

Attributes (plan to add on separate pr)

@cb645j Do you still plan on implementing this for attributes?
I'm trying to set an attribute value on logs from a tail input, but the only way I see possible is migrating to YAML config files and using the content modifier processor:
https://docs.fluentbit.io/manual/pipeline/processors/content-modifier

@cb645j
Contributor Author

cb645j commented Apr 26, 2024

Yes, I do. It's a little tricky, but I will work on it. You can copy all fields to attributes by using @edsiper's logs_body_key_attributes. The downside is that it's all or nothing.
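
To make the "all or nothing" point concrete, here is a small sketch contrasting the two behaviours. Both functions are hypothetical and written only for this example: `attributes_all` approximates what the thread describes for logs_body_key_attributes (excluding the body key itself is an assumption), and `attributes_selected` imagines the allow-list variant that has not been implemented.

```python
def attributes_all(body, body_key="message"):
    """Copy every remaining key into attributes (the all-or-nothing behaviour)."""
    return {key: value for key, value in body.items() if key != body_key}


def attributes_selected(body, wanted_keys):
    """Hypothetical allow-list variant: only the listed keys become attributes."""
    return {key: body[key] for key in wanted_keys if key in body}


event = {"message": "This is a log message", "app": "server", "instance_num": 1}
print(attributes_all(event))
print(attributes_selected(event, ["app"]))
```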

Labels
docs-required ok-package-test Run PR packaging tests
6 participants