Add support finding JSON after the line start for otlpjsonfilereceiver #33846
updated the pr so that it doesn't require any config |
component: otlpjsonfilereceiver | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: Add support finding JSON after the line start for otlpjsonfilereceiver |
That doesn’t seem to be what this change is about
``` |
Please revert this change
@@ -35,60 +36,6 @@ func TestDefaultConfig(t *testing.T) { | |||
require.NoError(t, componenttest.CheckConfigStruct(cfg)) | |||
} | |||
|
|||
func TestFileTracesReceiver(t *testing.T) { |
Is there a reason you are deleting this test?
this test is incorporated into TestFileReceiver as a parameterized test
I don’t understand the rationale for this change. I am also not sure why the java format is not compliant with the spec. |
In k8s, the log payload is packaged - see https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/container.md The docker json encoding is not supported yet - I'll add that.
it could be made compliant I guess, but a more lenient approach can support more existing implementations without any drawback - that's at least the idea. |
Since #33912 is affected by this one I would like to ensure that I understand the use-case. From what I can see in the unit-test the target "logs" would be read from the disk and are wrapped in a container format. So it would be something like
If that's correct, then the container format should be handled first using the
In that case we would have the following:

receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    operators:
      - id: container-parser
        type: container
      - type: json_parser
        if: 'body matches "^{.*}$"'
      .....# add any additional settings

Alternatively, the additional json parsing could be performed in a This looks to be the "natural" approach for unwrapping this type of content, so for this use-case I think the container format handling should happen first, with the body/content then forwarded to any kind of further processing. Let me know if I missed anything here :) |
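To make the ordering above concrete, here is a minimal Go sketch of the two steps: strip the container wrapper first, then check whether the remaining body looks like a JSON object. It assumes the CRI-style layout seen in the sample lines of this thread (`<timestamp> <stream> <tag> <message>`); `parseCRI` and `looksLikeJSONObject` are hypothetical helpers for illustration, not the stanza `container` operator's actual code, which also handles other container formats.

```go
package main

import (
	"fmt"
	"strings"
)

// criEntry holds the fields of one CRI-style container log line:
// "<timestamp> <stream> <tag> <message>".
type criEntry struct {
	Timestamp string
	Stream    string
	Tag       string
	Message   string
}

// parseCRI splits a line on the first three spaces; everything after the
// tag is kept verbatim as the message. Hypothetical helper.
func parseCRI(line string) (criEntry, error) {
	parts := strings.SplitN(line, " ", 4)
	if len(parts) != 4 {
		return criEntry{}, fmt.Errorf("not a CRI log line: %q", line)
	}
	return criEntry{
		Timestamp: parts[0],
		Stream:    parts[1],
		Tag:       parts[2],
		Message:   parts[3],
	}, nil
}

// looksLikeJSONObject approximates the intent of the
// `body matches "^{.*}$"` guard in the config above.
func looksLikeJSONObject(body string) bool {
	return strings.HasPrefix(body, "{") && strings.HasSuffix(body, "}")
}

func main() {
	line := `2024-07-19T10:19:10.666154668Z stderr F [otel.javaagent] INFO Exporter - {"resource":{}}`
	entry, err := parseCRI(line)
	if err != nil {
		panic(err)
	}
	fmt.Println("stream:", entry.Stream)
	// The unwrapped message still carries the logger prefix, so the guard
	// does not match until that prefix is removed as well.
	fmt.Println("json object:", looksLikeJSONObject(entry.Message))
}
```

Note that unwrapping the container format alone is not enough for the Java agent output: the message still starts with the logger prefix, which is the gap discussed further down in the thread.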
Thanks @jpkrohling for helping me understand how this would work. Here's a sample config using a connector:

connectors:
  otlpjson:
processors:
receivers:
  filelog:
exporters:
  otlp:
service:
  pipelines:
    logs/raw:
      receivers:
        - filelog # raw log entry
      exporters:
        - otlpjson # converts raw into otlp json
    traces/otlp:
      receivers:
        - otlpjson # otlp json is visible -> converts to pdata
      exporters:
        - otlp
    metrics/otlp:
      receivers:
        - otlpjson
      exporters:
        - otlp
    logs/otlp:
      receivers:
        - otlpjson
      exporters:
        - otlp

wdyt? |
The problem is that the otlpjson represents an entire This is why in the discussion yesterday it was suggested that a new processor (i.e. So, using the same example:
The following could be used to read and "unwrap" the otlpjson:

receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    operators:
      - id: container-parser
        type: container

This gives you a Notably, all of the information in the container wrapper would be discarded. The container parser populates some of these fields when unwrapping the otlpjson, but it is not meaningful when unmarshaling the otlpjson into a Also, the current Edit: I'm describing the functionality with some assumptions, but additional configuration in the connector could be useful. For example, maybe we want to offer config options to keep attributes from the parent |
@djaglowski sounds like you describe the same as I did in solution outline #33846 (comment) Would this work? |
Yes, I think we're aligned. I'm willing to sponsor the new connector if you'll write up a new issue for it. |
great - I just need to find someone to implement it 😄 |
Verifying some details for my own benefit. Apologies if that's trivial. Is this correct that the potential From the provided description's example
the first part should be skipped (by the filelog receiver most probably) and only consider as otlp json the part after the
{"resource":{"attributes":[{"key":"container.id","value":{"stringValue":"075da3b7317ceccd6df58562684a8092040aacca8b5b0c49eacb33f1d2fe15b9"}},{"key":"deployment.environment","value":{"stringValue":"staging"}},{"key":"host.arch","value":{"stringValue":"amd64"}},{"key":"host.name","value":{"stringValue":"anti-fraud-7b498c4dcb-f5wqj"}},{"key":"os.description","value":{"stringValue":"Linux 6.5.0-41-generic"}},{"key":"os.type","value":{"stringValue":"linux"}},{"key":"process.command_args","value":{"arrayValue":{"values":[{"stringValue":"/opt/java/openjdk/bin/java"},{"stringValue":"-jar"},{"stringValue":"./app.jar"}]}}},{"key":"process.executable.path","value":{"stringValue":"/opt/java/openjdk/bin/java"}},{"key":"process.pid","value":{"intValue":"1"}},{"key":"process.runtime.description","value":{"stringValue":"Eclipse Adoptium OpenJDK 64-Bit Server VM 21.0.3+9-LTS"}},{"key":"process.runtime.name","value":{"stringValue":"OpenJDK Runtime Environment"}},{"key":"process.runtime.version","value":{"stringValue":"21.0.3+9-LTS"}},{"key":"service.instance.id","value":{"stringValue":"7e31966b-5668-4338-913a-5e2601d75e25"}},{"key":"service.name","value":{"stringValue":"anti-fraud"}},{"key":"service.namespace","value":{"stringValue":"shop"}},{"key":"service.version","value":{"stringValue":"1.1"}},{"key":"telemetry.distro.name","value":{"stringValue":"grafana-opentelemetry-java"}},{"key":"telemetry.distro.version","value":{"stringValue":"2.4.0-beta.1"}},{"key":"telemetry.sdk.language","value":{"stringValue":"java"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.38.0"}}]},"scopeLogs":[{"scope":{"name":"com.mycompany.antifraud.FraudDetectionController","attributes":[]},"logRecords":[{"timeUnixNano":"1719915066488000000","observedTimeUnixNano":"1719915066488267425","severityNumber":13,"severityText":"WARN","body":{"stringValue":"checkOrder(totalPrice=300, shippingCountry=, customerIpAddress=127.0.0.1) fraudScore=15, status=REJECTED"},"attributes":[{"key":"thread.id","value":{"intValue":"44"}},{"key":"thread.name","value":{"stringValue":"http-nio-8080-exec-1"}}],"flags":1,"traceId":"336f93f9f72b9fec3e4e01e38cb6a99c","spanId":"de97c85b1ee0669a"}]}]}
One note here that even that part cannot be parsed by the
? Just wanted to verify this because trying out the original example brought me some confusion. ps: Happy to help with the implementation if you are still looking for someone. |
I didn't check that the original format was valid otlpjson, but I agree that the connector should accept otlpjson only, or at least by default. If there are other well established text formats then maybe those can be supported later too but then I think we're talking about a more generalized connector. |
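As a rough illustration of "accept otlpjson only", the following Go sketch classifies a payload by its top-level key. OTLP/JSON documents carry `resourceLogs`, `resourceSpans`, or `resourceMetrics` at the top level, whereas the Java agent sample above is a single bare entry (`resource` plus `scopeLogs`) and would be rejected. `signalOf` is a hypothetical helper; how a real connector decides this is an assumption here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// signalOf reports which OTLP signal a JSON document carries, judged only
// by its top-level key. Hypothetical helper for illustration.
func signalOf(doc []byte) (string, error) {
	var top map[string]json.RawMessage
	if err := json.Unmarshal(doc, &top); err != nil {
		return "", fmt.Errorf("not JSON: %w", err)
	}
	// Ordered list so the result does not depend on map iteration order.
	known := []struct{ key, signal string }{
		{"resourceLogs", "logs"},
		{"resourceSpans", "traces"},
		{"resourceMetrics", "metrics"},
	}
	for _, k := range known {
		if _, ok := top[k.key]; ok {
			return k.signal, nil
		}
	}
	return "", fmt.Errorf("no OTLP top-level key found")
}

func main() {
	wrapped := []byte(`{"resourceLogs":[{"resource":{},"scopeLogs":[]}]}`)
	bare := []byte(`{"resource":{},"scopeLogs":[]}`)

	fmt.Println(signalOf(wrapped))
	fmt.Println(signalOf(bare))
}
```

The bare form can be turned into the accepted form by wrapping it in a `resourceLogs` array, which is what the transform statements later in the thread do.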
I now see that the So correct me if I'm wrong here but the issue is that even if we parse out the @zeitlinger if you could provide more details about the specific use-case that would be helpful. edit: I have worked on something simple to illustrate the point for this connector: https://github.com/ChrsMark/otlpjsonconnector. If we agree to ship this component, I'm happy to take it through the proper "New Component" process and make it part of the contrib repo. |
yes, that's the idea |
correct - so that would have to be fixed in a previous operator (maybe regex with an additional feature to reference capture groups) or by a change to the Java app
basically https://opentelemetry.io/docs/specs/otel/protocol/file-exporter/
that would be awesome 😄 |
Thanks for clarifying @zeitlinger! Just for the record (and in case we need this for further testing), this kind of log can be produced by the
2024-07-19T10:19:10.666154668Z stderr F [otel.javaagent 2024-07-19 10:19:10:665 +0000] [BatchLogRecordProcessor_WorkerThread-1] INFO io.opentelemetry.exporter.logging.otlp.OtlpJsonLoggingLogRecordExporter - {"resource":{"attributes":[{"key":"container.id","value":{"stringValue":"74f29844c933d5844860485a10c830d3a0bd26b4493bd8f0f07fbe6238e8f0b6"}},{"key":"host.arch","value":{"stringValue":"amd64"}},{"key":"host.name","value":{"stringValue":"my-otel-demo-adservice-5c5f6df74b-bjvr5"}},{"key":"os.description","value":{"stringValue":"Linux 5.15.0-113-generic"}},{"key":"os.type","value":{"stringValue":"linux"}},{"key":"process.command_line","value":{"stringValue":"/opt/java/openjdk/bin/java -javaagent:/usr/src/app/opentelemetry-javaagent.jar oteldemo.AdService"}},{"key":"process.executable.path","value":{"stringValue":"/opt/java/openjdk/bin/java"}},{"key":"process.pid","value":{"intValue":"1"}},{"key":"process.runtime.description","value":{"stringValue":"Eclipse Adoptium OpenJDK 64-Bit Server VM 21.0.3+9-LTS"}},{"key":"process.runtime.name","value":{"stringValue":"OpenJDK Runtime Environment"}},{"key":"process.runtime.version","value":{"stringValue":"21.0.3+9-LTS"}},{"key":"service.instance.id","value":{"stringValue":"80efb175-53a0-4bc4-b3f2-60bbaf0e2713"}},{"key":"service.name","value":{"stringValue":"adservice"}},{"key":"service.namespace","value":{"stringValue":"opentelemetry-demo"}},{"key":"service.version","value":{"stringValue":"1.11.0"}},{"key":"telemetry.distro.name","value":{"stringValue":"elastic"}},{"key":"telemetry.distro.version","value":{"stringValue":"0.4.0"}},{"key":"telemetry.sdk.language","value":{"stringValue":"java"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.38.0"}}]},"scopeLogs":[{"scope":{"name":"oteldemo.AdService","attributes":[]},"logRecords":[{"timeUnixNano":"1721384350240315569","observedTimeUnixNano":"1721384350240331398","severityNumber":9,"severityText":"INFO","body":{"stringValue":"Targeted ad request received for [binoculars]"},"attributes":[],"flags":1,"traceId":"951843d689a86e3336ea6b872516c1ad","spanId":"548686bd37a34d7f"}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.24.0"}
I'm not sure if |
I think it should - or a new exporter should. I can take care of figuring this out. |
|
@zeitlinger using the new connector and having some Using the following config:

receivers:
  filelog:
    include:
      - /var/log/pods/prod_my-target-pod_49cc7c1fd3702c40b2686ea7486091d3/my-target-pod/1.log
    include_file_path: true
    operators:
      - id: container-parser
        type: container
exporters:
  debug:
    verbosity: detailed
processors:
  transform:
    log_statements:
      - context: log
        statements:
          - merge_maps(cache,ExtractPatterns(body,"io.opentelemetry.exporter.logging.otlp.OtlpJsonLoggingLogRecordExporter - (?P<log>.*)"), "upsert") where body != nil
          - set(body,cache["log"])
          - merge_maps(cache,ParseJSON(body), "upsert") where body!= nil
          - delete_key(cache, "schemaUrl")
          - set(body,Concat(["{\"resourceLogs\":[",cache,"]}"], ""))
connectors:
  otlpjson:
service:
  pipelines:
    logs/raw:
      receivers: [filelog]
      processors: [transform]
      exporters: [otlpjson]
    metrics/otlp:
      receivers: [ otlpjson ]
      exporters: [ debug ]
    logs/otlp:
      receivers: [ otlpjson ]
      exporters: [ debug ]
    traces/otlp:
      receivers: [ otlpjson ]
      exporters: [ debug ]

Then write some sample logs in container format:

echo '2024-07-19T10:19:10.666154668Z stderr F [otel.javaagent 2024-07-19 10:19:10:665 +0000] [BatchLogRecordProcessor_WorkerThread-1] INFO io.opentelemetry.exporter.logging.otlp.OtlpJsonLoggingLogRecordExporter - {"resource":{"attributes":[{"key":"container.id","value":{"stringValue":"74f29844c933d5844860485a10c830d3a0bd26b4493bd8f0f07fbe6238e8f0b6"}},{"key":"host.arch","value":{"stringValue":"amd64"}},{"key":"host.name","value":{"stringValue":"my-otel-demo-adservice-5c5f6df74b-bjvr5"}},{"key":"os.description","value":{"stringValue":"Linux 5.15.0-113-generic"}},{"key":"os.type","value":{"stringValue":"linux"}},{"key":"process.command_line","value":{"stringValue":"/opt/java/openjdk/bin/java -javaagent:/usr/src/app/opentelemetry-javaagent.jar oteldemo.AdService"}},{"key":"process.executable.path","value":{"stringValue":"/opt/java/openjdk/bin/java"}},{"key":"process.pid","value":{"intValue":"1"}},{"key":"process.runtime.description","value":{"stringValue":"Eclipse Adoptium OpenJDK 64-Bit Server VM 21.0.3+9-LTS"}},{"key":"process.runtime.name","value":{"stringValue":"OpenJDK Runtime Environment"}},{"key":"process.runtime.version","value":{"stringValue":"21.0.3+9-LTS"}},{"key":"service.instance.id","value":{"stringValue":"80efb175-53a0-4bc4-b3f2-60bbaf0e2713"}},{"key":"service.name","value":{"stringValue":"adservice"}},{"key":"service.namespace","value":{"stringValue":"opentelemetry-demo"}},{"key":"service.version","value":{"stringValue":"1.11.0"}},{"key":"telemetry.distro.name","value":{"stringValue":"elastic"}},{"key":"telemetry.distro.version","value":{"stringValue":"0.4.0"}},{"key":"telemetry.sdk.language","value":{"stringValue":"java"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.38.0"}}]},"scopeLogs":[{"scope":{"name":"oteldemo.AdService","attributes":[]},"logRecords":[{"timeUnixNano":"1721384350240315569","observedTimeUnixNano":"1721384350240331398","severityNumber":9,"severityText":"INFO","body":{"stringValue":"Targeted ad request received for [binoculars]"},"attributes":[],"flags":1,"traceId":"951843d689a86e3336ea6b872516c1ad","spanId":"548686bd37a34d7f"}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.24.0"}' >> /var/log/pods/prod_my-target-pod_49cc7c1fd3702c40b2686ea7486091d3/my-target-pod/1.log

I see:

2024-07-26T16:08:39.327+0300 info ResourceLog #0
Resource SchemaURL:
Resource attributes:
-> container.id: Str(74f29844c933d5844860485a10c830d3a0bd26b4493bd8f0f07fbe6238e8f0b6)
-> host.arch: Str(amd64)
-> host.name: Str(my-otel-demo-adservice-5c5f6df74b-bjvr5)
-> os.description: Str(Linux 5.15.0-113-generic)
-> os.type: Str(linux)
-> process.command_line: Str(/opt/java/openjdk/bin/java -javaagent:/usr/src/app/opentelemetry-javaagent.jar oteldemo.AdService)
-> process.executable.path: Str(/opt/java/openjdk/bin/java)
-> process.pid: Int(1)
-> process.runtime.description: Str(Eclipse Adoptium OpenJDK 64-Bit Server VM 21.0.3+9-LTS)
-> process.runtime.name: Str(OpenJDK Runtime Environment)
-> process.runtime.version: Str(21.0.3+9-LTS)
-> service.instance.id: Str(80efb175-53a0-4bc4-b3f2-60bbaf0e2713)
-> service.name: Str(adservice)
-> service.namespace: Str(opentelemetry-demo)
-> service.version: Str(1.11.0)
-> telemetry.distro.name: Str(elastic)
-> telemetry.distro.version: Str(0.4.0)
-> telemetry.sdk.language: Str(java)
-> telemetry.sdk.name: Str(opentelemetry)
-> telemetry.sdk.version: Str(1.38.0)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope oteldemo.AdService
LogRecord #0
ObservedTimestamp: 2024-07-19 10:19:10.240331398 +0000 UTC
Timestamp: 2024-07-19 10:19:10.240315569 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str(Targeted ad request received for [binoculars])
Trace ID: 951843d689a86e3336ea6b872516c1ad
Span ID: 548686bd37a34d7f
Flags: 1
{"kind": "exporter", "data_type": "logs", "name": "debug"}

There might be corner cases to handle through configuration, especially for the ottl/transform part, but the point is that this specific case can now be supported. |
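For readers less familiar with OTTL, the transform statements in the config above can be approximated in plain Go as follows. This is only a sketch of the same steps (extract the payload after the exporter's logger name, parse it, drop `schemaUrl`, wrap it in a `resourceLogs` array); `wrapAsResourceLogs` is a hypothetical name, and the transform processor itself works on the log body through OTTL rather than through code like this.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// exporterLine uses the same pattern as the ExtractPatterns statement: the
// named group "log" captures everything after the logger name and " - ".
var exporterLine = regexp.MustCompile(
	`io\.opentelemetry\.exporter\.logging\.otlp\.OtlpJsonLoggingLogRecordExporter - (?P<log>.*)`)

// wrapAsResourceLogs turns an unwrapped container log body into an
// OTLP/JSON logs document. Hypothetical helper for illustration.
func wrapAsResourceLogs(body string) ([]byte, error) {
	m := exporterLine.FindStringSubmatch(body)
	if m == nil {
		return nil, fmt.Errorf("no exporter payload found in body")
	}
	var entry map[string]any
	if err := json.Unmarshal([]byte(m[exporterLine.SubexpIndex("log")]), &entry); err != nil {
		return nil, fmt.Errorf("payload is not JSON: %w", err)
	}
	// Mirrors delete_key(cache, "schemaUrl") in the statements above.
	delete(entry, "schemaUrl")
	// Mirrors the Concat that builds {"resourceLogs":[ ... ]}.
	return json.Marshal(map[string]any{"resourceLogs": []any{entry}})
}

func main() {
	body := `[otel.javaagent] INFO io.opentelemetry.exporter.logging.otlp.OtlpJsonLoggingLogRecordExporter - {"resource":{},"scopeLogs":[],"schemaUrl":"https://opentelemetry.io/schemas/1.24.0"}`
	out, err := wrapAsResourceLogs(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"resourceLogs":[{"resource":{},"scopeLogs":[]}]}
}
```

Lines that do not match the pattern return an error here; in a pipeline they would need to be dropped or routed elsewhere, which is one of the corner cases mentioned above.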
@ChrsMark great - I didn't know that this is possible. For reference, I've added a working example here that also ignores other lines: https://github.com/zeitlinger/otelcol-cookbook/tree/main/otlp-json |
Description:
Add support finding JSON after the line start for otlpjsonfilereceiver
Example OTLP/JSON produced by the OTel Java Agent using https://opentelemetry.io/docs/languages/java/configuration/#logging-otlp-json-exporter
Testing:
unit tests
Documentation:
not needed - it just works in more cases, e.g. with OTel Java Agent
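The lenient behaviour described here, finding JSON after the line start, could look roughly like the Go sketch below. It scans for each `{` in the line and returns the first suffix that is valid JSON. `jsonAfterPrefix` is a hypothetical helper written for illustration and is not taken from this PR's diff.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// jsonAfterPrefix returns the first suffix of line that starts with '{' and
// is valid JSON. ok is false when no such suffix exists.
// Hypothetical helper; not the receiver's actual implementation.
func jsonAfterPrefix(line []byte) (payload []byte, ok bool) {
	for off := 0; off < len(line); {
		i := bytes.IndexByte(line[off:], '{')
		if i < 0 {
			return nil, false
		}
		candidate := line[off+i:]
		if json.Valid(candidate) {
			return candidate, true
		}
		// This '{' belonged to the non-JSON prefix; keep scanning.
		off += i + 1
	}
	return nil, false
}

func main() {
	line := []byte(`[otel.javaagent] INFO Exporter - {"resource":{},"scopeLogs":[]}`)
	if p, ok := jsonAfterPrefix(line); ok {
		fmt.Println(string(p))
	}
}
```

Validating every candidate suffix is quadratic in the worst case, so a production version would likely bound the prefix length or anchor on a known separator instead.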