
stage.json and stage.logfmt behave differently when parsing fails #1268

Open
sgielen opened this issue Jul 11, 2024 · 2 comments
Labels: bug, needs-attention


sgielen commented Jul 11, 2024

What's wrong?

I am trying to write a pipeline where messages with a log level of info, debug, or trace are dropped, to reduce log volume. Some of these messages are logfmt and others are JSON. I started out with JSON:

        stage.json {
          expressions = {level = "level"}
        }

        stage.drop {
          source = "level"
          expression = ".*(info|INFO|debug|DEBUG|trace|TRACE).*"
          drop_counter_reason = "low_log_level"
        }

This is working as expected, dropping all JSON messages with the given level. Since I am not passing the drop_malformed parameter, any messages that are not JSON are passed through as-is, as expected. Next, I tried to add logfmt:

        stage.logfmt {
          mapping = {level = "level"}
        }

This now triggers the following error in my Grafana Alloy logs:

ts=2024-07-11T17:30:08.945659061Z level=error msg="failed to decode logfmt" component_path=/ component_id=loki.process.pod_logs component=stage type=logfmt err="logfmt syntax error at pos 52 on line 1: unexpected '\"'"

I believe that stage.logfmt should behave the same as stage.json: simply pass through lines that are not in the expected format, unless drop_malformed is set.
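
In the meantime, a possible workaround (just a sketch on my side, untested) would be to wrap stage.logfmt in a stage.match whose selector adds a line filter, so the logfmt parser only sees lines that look like logfmt. The {job=~".+"} matcher is an assumption; use any label that is always present on your streams:

        stage.match {
          // Only lines starting with a key=value pair reach the logfmt parser;
          // JSON and free-form lines skip this block entirely.
          selector = "{job=~\".+\"} |~ \"^[a-zA-Z_]+=\""

          stage.logfmt {
            mapping = {level = "level"}
          }
        }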

Steps to reproduce

When forwarding logs that are logfmt, JSON, or no particular format, add these extra stage blocks:

        stage.logfmt {
          mapping = {level = "level"}
        }

        stage.drop {
          source = "level"
          expression = ".*(info|INFO|debug|DEBUG|trace|TRACE).*"
          drop_counter_reason = "low_log_level"
        }

At this point, all level=info messages should be dropped, while e.g. level=error (as well as lines without a level) should be forwarded.
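
For illustration (these are made-up example lines, not taken from my actual logs), the intended behavior is:

        level=info msg="routine heartbeat"      -> dropped by stage.drop
        level=error msg="connection refused"    -> forwarded
        a plain line with no level at all       -> forwarded, ideally untouched by the logfmt stage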

Instead, errors appear in the Alloy logs.

System information

Linux 5.4.0-148-generic x86_64 / Ubuntu 20.04.4 LTS

Software version

Grafana Alloy v1.2.0 as deployed by Helm chart k8s-monitoring v1.3.0

Configuration

// The Helm chart, including my additional stages, generated the following block:
loki.process "pod_logs" {
  stage.match {
    selector = "{tmp_container_runtime=\"containerd\"}"
    // the cri processing stage extracts the following k/v pairs: log, stream, time, flags
    stage.cri {}

    // Set the extracted flags and stream values as labels
    stage.labels {
      values = {
        flags  = "",
        stream  = "",
      }
    }
  }

  stage.match {
    selector = "{tmp_container_runtime=\"cri-o\"}"
    // the cri processing stage extracts the following k/v pairs: log, stream, time, flags
    stage.cri {}

    // Set the extracted flags and stream values as labels
    stage.labels {
      values = {
        flags  = "",
        stream  = "",
      }
    }
  }

  // if the label tmp_container_runtime from above is docker parse using docker
  stage.match {
    selector = "{tmp_container_runtime=\"docker\"}"
    // the docker processing stage extracts the following k/v pairs: log, stream, time
    stage.docker {}

    // Set the extracted stream value as a label
    stage.labels {
      values = {
        stream  = "",
      }
    }
  }

  // Drop the filename label, since it's not really useful in the context of Kubernetes, where we already have
  // cluster, namespace, pod, and container labels.
  // Also drop the temporary container runtime label as it is no longer needed.
  stage.label_drop {
    values = ["filename", "tmp_container_runtime"]
  }
  stage.json {
    expressions = {level = "level"}
  }
  
  stage.drop {
    source = "level"
    expression = ".*(info|INFO|debug|DEBUG|trace|TRACE).*"
    drop_counter_reason = "low_log_level"
  }
  

  forward_to = [loki.process.logs_service.receiver]
}

Logs

ts=2024-07-11T17:30:08.945659061Z level=error msg="failed to decode logfmt" component_path=/ component_id=loki.process.pod_logs component=stage type=logfmt err="logfmt syntax error at pos 52 on line 1: unexpected '\"'"

// I unfortunately don't have the original log line that caused this, but I think any log line that isn't logfmt would cause the above error, for example:

{"time":"2024-07-11T17:40:57.638554268Z","level":"INFO","msg":"Exec","process":"auth","version":"r223","args":[],"commandTag":"QkVHSU4=","pid":315168,"sql":"begin isolation level repeatable read read only","time":444780,"module":"healthcheck"}
sgielen added the bug label on Jul 11, 2024

sgielen commented Jul 11, 2024

Probably related to this issue: grafana/loki#6066

A contributor commented:

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!
