fix(proto): Remove error log when source_event_id is not present #21257

ArunPiduguDD · 2024-09-10T15:05:57Z

#21074 added a new field source_event_id which uniquely identifies an event as it passes through different components.

I initially assumed that source_event_id would always be present in the proto, and that we should log an error if it was missing. However, this did not account for cases such as the source being a previous version of Vector (#21252), which resulted in noisy error logs. This PR removes the log and makes source_event_id optional in EventMetadata.
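A minimal sketch of the decoding behavior described above, using only the standard library. The name `decode_source_event_id` and the `EventId` alias are hypothetical stand-ins (the diff in this PR uses `Uuid::from_slice`), and logging for malformed non-empty values is an assumption based on the commit message "re-add error logic for certain cases", not a copy of the merged code.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a 16-byte event identifier.
type EventId = [u8; 16];

/// Decode the raw `source_event_id` bytes received on the wire.
///
/// Empty bytes are treated as "field not set" (for example, when the sender
/// is an older Vector), so `None` is returned without logging anything.
/// Non-empty bytes with the wrong length are reported and also yield `None`.
fn decode_source_event_id(raw: &[u8]) -> Option<EventId> {
    if raw.is_empty() {
        return None;
    }
    match <[u8; 16]>::try_from(raw) {
        Ok(id) => Some(id),
        Err(_) => {
            // Assumption: only malformed, non-empty values are worth an error.
            eprintln!(
                "invalid source_event_id: expected 16 bytes, got {}",
                raw.len()
            );
            None
        }
    }
}
```

With this shape, an absent identifier and a malformed one both map to `None`, but only the malformed case produces log output.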

lib/vector-core/src/event/proto.rs

jszwedko · 2024-09-10T15:34:47Z

lib/vector-core/src/event/proto.rs

@@ -677,8 +677,6 @@ impl From<Metadata> for EventMetadata {

        if let Ok(uuid) = Uuid::from_slice(&value.source_event_id) {


Should source_event_id have been an option like the other fields here so that we can detect when it is present?

Yea makes sense, can convert to optional

EDIT: Following up from offline discussion, will keep the proto definition as is and change the value in EventMetadata to optional
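The outcome of that discussion can be sketched roughly as follows: the wire representation stays a plain bytes field, while the in-memory metadata holds an `Option`. The struct and function names here (`ProtoMetadata`, `to_proto`, `from_proto`) are simplified, hypothetical stand-ins rather than the actual vector-core types, and error reporting for malformed non-empty values is omitted for brevity.

```rust
use std::convert::TryFrom;

/// Hypothetical wire-level message: the field stays plain bytes,
/// and an empty value means "not set".
struct ProtoMetadata {
    source_event_id: Vec<u8>,
}

/// Hypothetical in-memory metadata: absence is explicit via `Option`.
struct EventMetadata {
    source_event_id: Option<[u8; 16]>,
}

/// Encode: `None` becomes an empty byte vector, so the wire format is unchanged.
fn to_proto(meta: &EventMetadata) -> ProtoMetadata {
    ProtoMetadata {
        source_event_id: meta
            .source_event_id
            .map(|id| id.to_vec())
            .unwrap_or_default(),
    }
}

/// Decode: anything that is not exactly 16 bytes (including empty) becomes `None`.
fn from_proto(proto: &ProtoMetadata) -> EventMetadata {
    EventMetadata {
        source_event_id: <[u8; 16]>::try_from(proto.source_event_id.as_slice()).ok(),
    }
}
```

Keeping the proto definition untouched means older and newer senders remain wire-compatible; only the in-memory type changes.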

datadog-vectordotdev · 2024-09-10T15:36:40Z

Datadog Report

Branch report: remove_error_log_absent_source_event_id
Commit report: 418bcbd
Test service: vector

✅ 0 Failed, 7 Passed, 0 Skipped, 25.49s Total Time

pront

Thanks, this was a tricky one!

github-actions · 2024-09-10T20:39:17Z

Regression Detector Results

Run ID: 4d606864-f607-4779-9e8a-a2d0f429b5f5

Baseline: f94c28c
Comparison: fd87fb6

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI
✅	file_to_blackhole	egress throughput	+16.30	[+8.77, +23.82]

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI
✅	file_to_blackhole	egress throughput	+16.30	[+8.77, +23.82]
➖	otlp_http_to_blackhole	ingress throughput	+4.20	[+4.08, +4.33]
➖	socket_to_socket_blackhole	ingress throughput	+3.83	[+3.77, +3.89]
➖	syslog_regex_logs2metric_ddmetrics	ingress throughput	+3.63	[+3.47, +3.80]
➖	http_to_http_acks	ingress throughput	+2.73	[+1.46, +3.99]
➖	splunk_hec_route_s3	ingress throughput	+2.24	[+1.93, +2.55]
➖	syslog_loki	ingress throughput	+2.07	[+1.98, +2.16]
➖	syslog_log2metric_splunk_hec_metrics	ingress throughput	+1.37	[+1.27, +1.48]
➖	datadog_agent_remap_datadog_logs_acks	ingress throughput	+0.84	[+0.66, +1.02]
➖	otlp_grpc_to_blackhole	ingress throughput	+0.63	[+0.52, +0.74]
➖	fluent_elasticsearch	ingress throughput	+0.36	[-0.13, +0.85]
➖	http_elasticsearch	ingress throughput	+0.18	[+0.01, +0.35]
➖	http_text_to_http_json	ingress throughput	+0.18	[+0.06, +0.30]
➖	http_to_http_noack	ingress throughput	+0.17	[+0.08, +0.25]
➖	http_to_s3	ingress throughput	+0.13	[-0.14, +0.41]
➖	syslog_log2metric_tag_cardinality_limit_blackhole	ingress throughput	+0.06	[-0.04, +0.15]
➖	splunk_hec_indexer_ack_blackhole	ingress throughput	+0.04	[-0.04, +0.12]
➖	http_to_http_json	ingress throughput	+0.03	[-0.00, +0.07]
➖	splunk_hec_to_splunk_hec_logs_noack	ingress throughput	+0.02	[-0.07, +0.11]
➖	splunk_hec_to_splunk_hec_logs_acks	ingress throughput	-0.00	[-0.11, +0.10]
➖	datadog_agent_remap_blackhole_acks	ingress throughput	-0.17	[-0.27, -0.06]
➖	datadog_agent_remap_blackhole	ingress throughput	-0.39	[-0.51, -0.27]
➖	syslog_humio_logs	ingress throughput	-0.41	[-0.54, -0.29]
➖	syslog_splunk_hec_logs	ingress throughput	-0.88	[-1.00, -0.77]
➖	datadog_agent_remap_datadog_logs	ingress throughput	-1.46	[-1.68, -1.24]
➖	syslog_log2metric_humio_metrics	ingress throughput	-2.00	[-2.12, -1.88]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we flag a change in performance as a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
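The three criteria above can be sketched as a small predicate. This is an illustration of the stated rule only, not the regression detector's actual implementation; the struct, its field names, and the function name are hypothetical.

```rust
/// Hypothetical summary of one experiment's result.
struct ExperimentResult {
    /// Estimated change in the optimization goal, in percent.
    delta_mean_pct: f64,
    /// Lower and upper bounds of the 90% confidence interval, in percent.
    ci_low: f64,
    ci_high: f64,
    /// Whether the experiment's configuration marks it as erratic.
    erratic: bool,
}

/// Effect size tolerance quoted in the report: |Δ mean %| ≥ 5.00%.
const EFFECT_SIZE_TOLERANCE_PCT: f64 = 5.0;

/// Returns true only when all three criteria from the explanation hold.
fn is_flagged(r: &ExperimentResult) -> bool {
    let big_enough = r.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE_PCT;
    let ci_excludes_zero = r.ci_low > 0.0 || r.ci_high < 0.0;
    big_enough && ci_excludes_zero && !r.erratic
}
```

For example, the file_to_blackhole row (+16.30, CI [+8.77, +23.82]) satisfies the first two criteria but appears under "Experiments ignored for regressions", so assuming it is marked erratic it would not be flagged. The otlp_http_to_blackhole row (+4.20) falls below the 5.00% tolerance and is not flagged either.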

github-actions · 2024-09-10T20:49:16Z

Regression Detector Results

Run ID: e0014fed-c091-4ab0-8b01-241c53006c62

Baseline: 81fa4e8
Comparison: 70fc515

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI
➖	file_to_blackhole	egress throughput	+1.72	[-4.85, +8.30]

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI
➖	socket_to_socket_blackhole	ingress throughput	+4.53	[+4.46, +4.59]
➖	otlp_http_to_blackhole	ingress throughput	+4.11	[+3.96, +4.26]
➖	http_to_http_acks	ingress throughput	+2.74	[+1.48, +4.00]
➖	fluent_elasticsearch	ingress throughput	+2.51	[+2.01, +3.00]
➖	syslog_log2metric_splunk_hec_metrics	ingress throughput	+2.33	[+2.24, +2.43]
➖	file_to_blackhole	egress throughput	+1.72	[-4.85, +8.30]
➖	syslog_humio_logs	ingress throughput	+1.36	[+1.24, +1.48]
➖	syslog_loki	ingress throughput	+1.25	[+1.16, +1.33]
➖	splunk_hec_route_s3	ingress throughput	+1.14	[+0.83, +1.45]
➖	otlp_grpc_to_blackhole	ingress throughput	+0.57	[+0.46, +0.67]
➖	syslog_log2metric_humio_metrics	ingress throughput	+0.31	[+0.18, +0.44]
➖	datadog_agent_remap_blackhole_acks	ingress throughput	+0.27	[+0.16, +0.38]
➖	http_to_http_noack	ingress throughput	+0.12	[+0.05, +0.19]
➖	http_to_http_json	ingress throughput	+0.03	[-0.02, +0.09]
➖	splunk_hec_to_splunk_hec_logs_noack	ingress throughput	-0.00	[-0.10, +0.09]
➖	splunk_hec_indexer_ack_blackhole	ingress throughput	-0.00	[-0.09, +0.08]
➖	splunk_hec_to_splunk_hec_logs_acks	ingress throughput	-0.01	[-0.12, +0.11]
➖	http_to_s3	ingress throughput	-0.07	[-0.34, +0.20]
➖	datadog_agent_remap_datadog_logs_acks	ingress throughput	-0.25	[-0.38, -0.11]
➖	http_text_to_http_json	ingress throughput	-0.33	[-0.47, -0.20]
➖	syslog_splunk_hec_logs	ingress throughput	-0.35	[-0.44, -0.27]
➖	http_elasticsearch	ingress throughput	-0.36	[-0.51, -0.21]
➖	syslog_log2metric_tag_cardinality_limit_blackhole	ingress throughput	-0.48	[-0.56, -0.40]
➖	datadog_agent_remap_datadog_logs	ingress throughput	-1.03	[-1.22, -0.84]
➖	datadog_agent_remap_blackhole	ingress throughput	-1.34	[-1.44, -1.23]
➖	syslog_regex_logs2metric_ddmetrics	ingress throughput	-2.01	[-2.18, -1.83]

* remove error log
* make source_event_id optional + add error log when failing to parse empty field
* update tag number
* undo proto changes, remove error log, and make source_event_id optional in EventMetadata
* cargo fmt
* fix test + re-add error logic for certain cases

remove error log

ea3f679

ArunPiduguDD requested a review from a team as a code owner September 10, 2024 15:05

ArunPiduguDD requested review from pront and jszwedko September 10, 2024 15:06

github-actions bot added the "domain: core" label (anything related to core crates, i.e. vector-core, core-common, etc.) Sep 10, 2024

ArunPiduguDD requested a review from bruceg September 10, 2024 15:06

ArunPiduguDD added the "no-changelog" label (changes in this PR do not need user-facing explanations in the release changelog) Sep 10, 2024

pront approved these changes Sep 10, 2024

bruceg reviewed Sep 10, 2024

lib/vector-core/src/event/proto.rs

jszwedko reviewed Sep 10, 2024

make source_event_id optional + add error log when failing to parse empty field

997df36

ArunPiduguDD requested review from pront and bruceg September 10, 2024 16:35

update tag number

b9831c0

ArunPiduguDD requested a review from jszwedko September 10, 2024 17:02

pront reviewed Sep 10, 2024

lib/vector-core/proto/event.proto (outdated)

pront self-requested a review September 10, 2024 17:51

ArunPiduguDD added 2 commits September 10, 2024 14:01

undo proto changes, remove error log, and make source_event_id optional in EventMetadata

3471d9b

cargo fmt

9563290

jszwedko reviewed Sep 10, 2024

lib/vector-core/src/event/proto.rs (outdated)

fix test + re-add error logic for certain cases

bdd82f7

pront approved these changes Sep 10, 2024

ArunPiduguDD enabled auto-merge September 10, 2024 19:22

ArunPiduguDD added this pull request to the merge queue Sep 10, 2024

ArunPiduguDD removed this pull request from the merge queue due to a manual request Sep 10, 2024

jszwedko approved these changes Sep 10, 2024

ArunPiduguDD added this pull request to the merge queue Sep 10, 2024

Merged via the queue into master with commit 70fc515 Sep 10, 2024
80 checks passed

ArunPiduguDD deleted the remove_error_log_absent_source_event_id branch September 10, 2024 21:10

jszwedko mentioned this pull request Sep 10, 2024

Vector source: Invalid source_event_id in metadata #21252

Closed
