[APR-248] rearrange things to make sure we intern metadata strings #337
Conversation
Regression Detector (DogStatsD)

Regression Detector Results

Run ID: 2e655b81-0bd2-4256-bd38-7b6830f3d22a
Baseline: 7.59.0
Optimization Goals: ✅ No significant changes detected

| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | dsd_uds_100mb_3k_contexts_distributions_only | memory utilization | +0.69 | [+0.51, +0.87] | 1 | |
| ➖ | dsd_uds_500mb_3k_contexts | ingress throughput | +0.03 | [+0.02, +0.05] | 1 | |
| ➖ | dsd_uds_10mb_3k_contexts | ingress throughput | +0.01 | [-0.00, +0.03] | 1 | |
| ➖ | dsd_uds_1mb_50k_contexts_memlimit | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
| ➖ | dsd_uds_1mb_3k_contexts_dualship | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
| ➖ | dsd_uds_100mb_250k_contexts | ingress throughput | +0.00 | [-0.00, +0.00] | 1 | |
| ➖ | dsd_uds_1mb_50k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
| ➖ | dsd_uds_512kb_3k_contexts | ingress throughput | -0.00 | [-0.01, +0.01] | 1 | |
| ➖ | dsd_uds_1mb_3k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 | |
| ➖ | dsd_uds_100mb_3k_contexts | ingress throughput | -0.00 | [-0.05, +0.04] | 1 | |
| ➖ | quality_gates_idle_rss | memory utilization | -1.33 | [-1.46, -1.21] | 1 | |
Bounds Checks: ❌ Failed

| perf | experiment | bounds_check_name | replicates_passed | links |
|---|---|---|---|---|
| ❌ | quality_gates_idle_rss | memory_usage | 0/10 | |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true (a code sketch of this rule follows the list):

- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
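To make the decision rule concrete, here is a minimal illustrative sketch in Rust. The struct shape, field names, and function are assumptions for illustration only, not the Regression Detector's actual implementation:

```rust
/// Illustrative-only summary of one experiment's A/B statistics.
struct ExperimentSummary {
    delta_mean_pct: f64,  // estimated Δ mean %
    ci_low_pct: f64,      // lower bound of the 90% CI on Δ mean %
    ci_high_pct: f64,     // upper bound of the 90% CI on Δ mean %
    marked_erratic: bool, // whether the experiment's config marks it "erratic"
}

/// Flags a regression only when all three criteria hold: the effect size
/// meets the tolerance, the confidence interval excludes zero, and the
/// experiment is not marked erratic.
fn is_regression(s: &ExperimentSummary, tolerance_pct: f64) -> bool {
    let big_enough = s.delta_mean_pct.abs() >= tolerance_pct;
    let ci_excludes_zero = s.ci_low_pct > 0.0 || s.ci_high_pct < 0.0;
    big_enough && ci_excludes_zero && !s.marked_erratic
}
```

For example, the `quality_gates_idle_rss` row above (-1.33%, CI [-1.46, -1.21]) has a CI that excludes zero but misses the 5.00% effect-size tolerance, so it is reported as ➖ rather than flagged.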
Regression Detector (Saluki)

Regression Detector Results

Run ID: 738c41c5-81e6-4240-b7f2-b6b756f7abdf
Baseline: ec30b3d
Optimization Goals: ❌ Significant changes detected

| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | dsd_uds_50mb_10k_contexts_no_inlining | ingress throughput | +0.01 | [-0.07, +0.08] | 1 | |
| ➖ | dsd_uds_512kb_3k_contexts | ingress throughput | +0.00 | [-0.01, +0.01] | 1 | |
| ➖ | dsd_uds_1mb_50k_contexts | ingress throughput | +0.00 | [-0.02, +0.02] | 1 | |
| ➖ | dsd_uds_50mb_10k_contexts_no_inlining_no_allocs | ingress throughput | -0.00 | [-0.06, +0.06] | 1 | |
| ➖ | dsd_uds_100mb_3k_contexts | ingress throughput | -0.00 | [-0.06, +0.05] | 1 | |
| ➖ | dsd_uds_1mb_3k_contexts_dualship | ingress throughput | -0.00 | [-0.01, +0.00] | 1 | |
| ➖ | dsd_uds_100mb_250k_contexts | ingress throughput | -0.01 | [-0.04, +0.02] | 1 | |
| ➖ | dsd_uds_1mb_3k_contexts | ingress throughput | -0.02 | [-0.03, +0.00] | 1 | |
| ➖ | dsd_uds_10mb_3k_contexts | ingress throughput | -0.03 | [-0.06, +0.01] | 1 | |
| ➖ | dsd_uds_500mb_3k_contexts | ingress throughput | -0.25 | [-0.36, -0.14] | 1 | |
| ➖ | dsd_uds_1mb_50k_contexts_memlimit | ingress throughput | -2.57 | [-4.41, -0.73] | 1 | |
| ➖ | dsd_uds_100mb_3k_contexts_distributions_only | memory utilization | -4.17 | [-4.43, -3.90] | 1 | |
| ✅ | quality_gates_idle_rss | memory utilization | -19.33 | [-19.69, -18.96] | 1 | |
Bounds Checks: ✅ Passed

| perf | experiment | bounds_check_name | replicates_passed | links |
|---|---|---|---|---|
| ✅ | quality_gates_idle_rss | memory_usage | 10/10 | |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
Regression Detector Links

Experiment Result Links
```diff
         self.0.split_once(':').map(|(_, value)| value)
     }

     /// Gets the name and value of the tag.
     ///
     /// For bare tags (e.g. `production`), this always returns `(Some(...), None)`.
-    pub fn name_and_value(&self) -> (Option<&str>, Option<&str>) {
+    pub fn name_and_value(&self) -> (&'a str, Option<&'a str>) {
```
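For context on the signature change, here is a minimal sketch of how a borrowed tag type with the new return type might be implemented; the `Tag` wrapper shown is an assumption for illustration, not necessarily the actual type in the codebase:

```rust
/// Hypothetical borrowed tag, wrapping a `&str` slice of the raw payload.
/// (Illustrative only; the real type in the codebase may differ.)
pub struct Tag<'a>(&'a str);

impl<'a> Tag<'a> {
    /// Gets the name and value of the tag.
    ///
    /// With the non-optional name, a bare tag (e.g. `production`) returns
    /// the whole tag as the name and `None` for the value.
    pub fn name_and_value(&self) -> (&'a str, Option<&'a str>) {
        match self.0.split_once(':') {
            Some((name, value)) => (name, Some(value)),
            None => (self.0, None),
        }
    }
}
```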
👍🏻
I agree that the parser really shouldn't care except for knowing how to adjust the raw input payload in order to exercise those codepaths. I do think the tests are valuable overall: for example, we have all the various permutations because of a real bug I caught where I broke how optional fields are parsed, which didn't show up unless the optional fields were in a certain order. I'm ambivalent about where you place the tests, but it does feel like we need to keep them in spirit to avoid regressing the codec.
I'm guessing I made this optional trying to futureproof for a scenario where we weren't dealing with Agent metrics. I'm fine with making it required until that need actually arises.
This was a hack to have …
This is a good point. My brain personally enjoys builder APIs, but I'm not strongly tied to them. A middleground could be that we just make them more restricted by requiring …

My inclination would be to have a builder API if we do end up with a good number of optional fields, to avoid massive, multi-line struct initializers... that's it, really. Overall, this approach seems fine to me. 👍🏻
Signed-off-by: Luke Steensen <[email protected]>
force-pushed from 6bf270a to 92668d0 (Compare)
Strongly agree! I wouldn't get rid of any of the tests; I was just trying to figure out a good way to split out the dirty external stuff (i.e. connection address) from the pure parsing concerns and organize the tests accordingly. But that's something we can maybe follow up on.
Not a big deal, this makes sense.
Noticed this regression and backed out the change.
I'm good with that too, I just think the current implementation via setters/getters ends up encouraging/allowing mutation in a way that's limiting for sharing metadata, for example. Just having separate builder structs would fix that.
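As a rough illustration of the "separate builder structs" idea (every type and method name here is hypothetical, not an actual saluki API):

```rust
/// Hypothetical immutable metadata value; once built, it can be shared freely.
#[derive(Clone)]
pub struct MetricMetadata {
    container_id: Option<String>,
    process_id: Option<u32>,
}

/// Hypothetical builder: mutation is confined to construction time, so the
/// finished `MetricMetadata` can be shared without worrying about later edits.
#[derive(Default)]
pub struct MetricMetadataBuilder {
    container_id: Option<String>,
    process_id: Option<u32>,
}

impl MetricMetadataBuilder {
    pub fn container_id(mut self, id: impl Into<String>) -> Self {
        self.container_id = Some(id.into());
        self
    }

    pub fn process_id(mut self, pid: u32) -> Self {
        self.process_id = Some(pid);
        self
    }

    pub fn build(self) -> MetricMetadata {
        MetricMetadata {
            container_id: self.container_id,
            process_id: self.process_id,
        }
    }
}
```

With this shape, `MetricMetadataBuilder::default().container_id("abc123").build()` yields a value that can be cloned and shared, and the only mutation happens inside the builder.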
Looks good to me.
I think what we might want to consider in the future is simply changing the DSD codec tests to deal with `MetricPacket<'a>` instead of `Metric` when parsing. As nice as it would be to just have codecs only return `Event`-compatible types, that's clearly not the way to approach things when we need to obsess over performance/efficiency, so we can probably afford to be more sparse in the tests just to avoid polluting them with too much setup/boilerplate.
Nothing you need to do here unless you really want to.
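A rough sketch of what a test against a borrowed packet might look like; the `MetricPacket` fields and the `parse_packet` entry point below are hypothetical stand-ins, not the real DSD codec API:

```rust
/// Hypothetical borrowed view of a parsed DogStatsD packet; fields borrow
/// directly from the raw input rather than being converted to owned types.
struct MetricPacket<'a> {
    name: &'a str,
    tags: Vec<&'a str>,
}

/// Hypothetical parser entry point used only for this sketch.
fn parse_packet(input: &str) -> Option<MetricPacket<'_>> {
    let (name, rest) = input.split_once(':')?;
    let tags = rest
        .split('|')
        .find_map(|part| part.strip_prefix('#'))
        .map(|t| t.split(',').collect())
        .unwrap_or_default();
    Some(MetricPacket { name, tags })
}

#[test]
fn parses_name_and_tags() {
    // Asserting against the borrowed packet keeps the test focused on
    // parsing, with no owned `Metric` construction or context boilerplate.
    let packet = parse_packet("page.views:1|c|#env:dev,service:web").unwrap();
    assert_eq!(packet.name, "page.views");
    assert_eq!(packet.tags, vec!["env:dev", "service:web"]);
}
```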
This is a bit rough around the edges still, but opening to get feedback. Things that are currently awkward:

- The tests need data like `process_id`, `origin`, etc. This means we're either making up data to fill the tests or duplicating logic from the source.
- … `None` (maybe an optimization? It can be put back).
- The setters (e.g. `set_container_id`) very effectively hid places where we should have been interning strings via `Into<MetaString>`, which is a bit of a performance footgun (arguably correctness for bounding as well); see the sketch below.
- Using `pub` fields isn't ideal, but it was the easiest way to get this working for now.

None of this should be terribly difficult to sort out, and there's room for some nice simplification once we do, but wanted to make sure folks are on the same page before making any real design decisions.
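To illustrate the interning footgun mentioned above: `MetaString` below is a simplified stand-in (the real interned string type lives elsewhere in the codebase), and the `From` impl shown is illustrative, not the actual API:

```rust
use std::sync::Arc;

/// Illustrative stand-in for an interned string type like `MetaString`.
#[derive(Clone)]
pub struct MetaString(Arc<str>);

impl From<&str> for MetaString {
    fn from(s: &str) -> Self {
        // Real code would go through an interner to dedupe and bound
        // allocations; this just copies the bytes, which is exactly the
        // kind of silent non-interned conversion being described.
        MetaString(Arc::from(s))
    }
}

pub struct Metadata {
    container_id: Option<MetaString>,
}

impl Metadata {
    /// A setter taking `impl Into<MetaString>` compiles fine whether or not
    /// the caller passes an already-interned value, which is how accidental
    /// non-interned conversions can slip through unnoticed.
    pub fn set_container_id(&mut self, id: impl Into<MetaString>) {
        self.container_id = Some(id.into());
    }
}
```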