
[APR-248] rearrange things to make sure we intern metadata strings #337

Merged
merged 9 commits into main from intern-metadata-strings
Dec 3, 2024

Conversation

lukesteensen
Contributor

This is a bit rough around the edges still, but opening to get feedback. Things that are currently awkward:

  1. We test metric metadata in the deserializer (parser) tests, but there are fields somewhat outside the scope of the parser (imo) like process_id, origin, etc. This means we're either making up data to fill the tests or duplicating logic from the source.
  2. Origin is an optional field in the metadata but we set it unconditionally (this seems like it was for convenience in tests)
  3. A few metadata fields were using empty strings to signal None (maybe an optimization? It can be put back).
  4. Some of the builder-style APIs on metadata (e.g. set_container_id) very effectively hid places where we should have been interning strings via Into<MetaString>, which is a bit of a performance footgun (arguably correctness for bounding as well).
  5. I probably overcorrected with pub fields, but it was the easiest way to get this working for now.

None of this should be terribly difficult to sort out, and there's room for some nice simplification once we do, but wanted to make sure folks are on the same page before making any real design decisions.
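The footgun in point 4 can be sketched with a toy interner (all names here are illustrative stand-ins, not the real `MetaString`/metadata API): a setter that accepts `impl Into<...>` silently allocates a fresh, unshared string at every call site, while one that demands an already-interned value makes the interning decision visible.

```rust
use std::collections::HashSet;
use std::rc::Rc;

// Hypothetical stand-in for `MetaString`: a cheaply clonable interned string.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct Interned(Rc<str>);

// Hypothetical interner: deduplicates strings so repeated values share storage.
#[derive(Default)]
struct Interner {
    strings: HashSet<Rc<str>>,
}

impl Interner {
    fn intern(&mut self, s: &str) -> Interned {
        if let Some(existing) = self.strings.get(s) {
            return Interned(Rc::clone(existing));
        }
        let rc: Rc<str> = Rc::from(s);
        self.strings.insert(Rc::clone(&rc));
        Interned(rc)
    }
}

#[derive(Default)]
struct Metadata {
    container_id: Option<Interned>,
}

impl Metadata {
    // Footgun version: `impl Into<...>` happily accepts a `&str`/`String`
    // and allocates an unshared copy, bypassing the interner entirely.
    fn set_container_id_footgun(&mut self, id: impl Into<Rc<str>>) {
        self.container_id = Some(Interned(id.into()));
    }

    // Restricted version: the caller must already hold an interned value,
    // so interning (and its bounding guarantees) happens at the call site.
    fn set_container_id(&mut self, id: Interned) {
        self.container_id = Some(id);
    }
}
```

With the restricted setter, two metadata values built from the same input share one allocation; the footgun setter produces a distinct allocation per call.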

@lukesteensen lukesteensen requested review from a team as code owners November 19, 2024 02:26
@github-actions github-actions bot added area/core Core functionality, event model, etc. area/io General I/O and networking. area/components Sources, transforms, and destinations. source/dogstatsd DogStatsD source. labels Nov 19, 2024

pr-commenter bot commented Nov 19, 2024

Regression Detector (DogStatsD)

Regression Detector Results

Run ID: 2e655b81-0bd2-4256-bd38-7b6830f3d22a

Baseline: 7.59.0
Comparison: 7.59.0

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment

| experiment | goal | Δ mean % | Δ mean % CI | trials |
| --- | --- | --- | --- | --- |
| dsd_uds_100mb_3k_contexts_distributions_only | memory utilization | +0.69 | [+0.51, +0.87] | 1 |
| dsd_uds_500mb_3k_contexts | ingress throughput | +0.03 | [+0.02, +0.05] | 1 |
| dsd_uds_10mb_3k_contexts | ingress throughput | +0.01 | [-0.00, +0.03] | 1 |
| dsd_uds_1mb_50k_contexts_memlimit | ingress throughput | +0.00 | [-0.00, +0.00] | 1 |
| dsd_uds_1mb_3k_contexts_dualship | ingress throughput | +0.00 | [-0.00, +0.00] | 1 |
| dsd_uds_100mb_250k_contexts | ingress throughput | +0.00 | [-0.00, +0.00] | 1 |
| dsd_uds_1mb_50k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 |
| dsd_uds_512kb_3k_contexts | ingress throughput | -0.00 | [-0.01, +0.01] | 1 |
| dsd_uds_1mb_3k_contexts | ingress throughput | -0.00 | [-0.00, +0.00] | 1 |
| dsd_uds_100mb_3k_contexts | ingress throughput | -0.00 | [-0.05, +0.04] | 1 |
| quality_gates_idle_rss | memory utilization | -1.33 | [-1.46, -1.21] | 1 |

Bounds Checks: ❌ Failed

| experiment | bounds_check_name | replicates_passed |
| --- | --- | --- |
| quality_gates_idle_rss | memory_usage | 0/10 |

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".
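The three criteria above can be summarized as a small predicate (the struct and field names are illustrative, not part of the detector's actual code):

```rust
// Illustrative encoding of the regression criteria described above.
struct Experiment {
    delta_mean_pct: f64, // estimated Δ mean %
    ci: (f64, f64),      // 90.00% confidence interval on Δ mean %
    erratic: bool,       // marked "erratic" in configuration
}

fn is_regression(e: &Experiment) -> bool {
    // 1. Big enough to merit a closer look.
    let big_enough = e.delta_mean_pct.abs() >= 5.0;
    // 2. Confidence interval does not contain zero.
    let ci_excludes_zero = e.ci.0 > 0.0 || e.ci.1 < 0.0;
    // 3. Not marked erratic.
    big_enough && ci_excludes_zero && !e.erratic
}
```

For example, the quality_gates_idle_rss result below (-19.33, CI [-19.69, -18.96]) satisfies all three criteria, while a small change like -0.25 with CI [-0.36, -0.14] does not.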


pr-commenter bot commented Nov 19, 2024

Regression Detector (Saluki)

Regression Detector Results

Run ID: 738c41c5-81e6-4240-b7f2-b6b756f7abdf

Baseline: ec30b3d
Comparison: b4c096c
Diff

Optimization Goals: ❌ Significant changes detected

| experiment | goal | Δ mean % | Δ mean % CI | trials |
| --- | --- | --- | --- | --- |
| quality_gates_idle_rss | memory utilization | -19.33 | [-19.69, -18.96] | 1 |

Fine details of change detection per experiment

| experiment | goal | Δ mean % | Δ mean % CI | trials |
| --- | --- | --- | --- | --- |
| dsd_uds_50mb_10k_contexts_no_inlining | ingress throughput | +0.01 | [-0.07, +0.08] | 1 |
| dsd_uds_512kb_3k_contexts | ingress throughput | +0.00 | [-0.01, +0.01] | 1 |
| dsd_uds_1mb_50k_contexts | ingress throughput | +0.00 | [-0.02, +0.02] | 1 |
| dsd_uds_50mb_10k_contexts_no_inlining_no_allocs | ingress throughput | -0.00 | [-0.06, +0.06] | 1 |
| dsd_uds_100mb_3k_contexts | ingress throughput | -0.00 | [-0.06, +0.05] | 1 |
| dsd_uds_1mb_3k_contexts_dualship | ingress throughput | -0.00 | [-0.01, +0.00] | 1 |
| dsd_uds_100mb_250k_contexts | ingress throughput | -0.01 | [-0.04, +0.02] | 1 |
| dsd_uds_1mb_3k_contexts | ingress throughput | -0.02 | [-0.03, +0.00] | 1 |
| dsd_uds_10mb_3k_contexts | ingress throughput | -0.03 | [-0.06, +0.01] | 1 |
| dsd_uds_500mb_3k_contexts | ingress throughput | -0.25 | [-0.36, -0.14] | 1 |
| dsd_uds_1mb_50k_contexts_memlimit | ingress throughput | -2.57 | [-4.41, -0.73] | 1 |
| dsd_uds_100mb_3k_contexts_distributions_only | memory utilization | -4.17 | [-4.43, -3.90] | 1 |
| quality_gates_idle_rss | memory utilization | -19.33 | [-19.69, -18.96] | 1 |

Bounds Checks: ✅ Passed

| experiment | bounds_check_name | replicates_passed |
| --- | --- | --- |
| quality_gates_idle_rss | memory_usage | 10/10 |



pr-commenter bot commented Nov 19, 2024

Regression Detector Links

Experiment Result Links

| experiment | link(s) |
| --- | --- |
| dsd_uds_100mb_250k_contexts | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_100mb_3k_contexts | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_100mb_3k_contexts_distributions_only | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_10mb_3k_contexts | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_1mb_3k_contexts | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_1mb_3k_contexts_dualship | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_1mb_50k_contexts | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_1mb_50k_contexts_memlimit | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_500mb_3k_contexts | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_512kb_3k_contexts | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| quality_gates_idle_rss | [Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard] |
| dsd_uds_50mb_10k_contexts_no_inlining (ADP only) | [Profiling (ADP)] [SMP Dashboard] |
| dsd_uds_50mb_10k_contexts_no_inlining_no_allocs (ADP only) | [Profiling (ADP)] [SMP Dashboard] |

```diff
     self.0.split_once(':').map(|(_, value)| value)
 }

 /// Gets the name and value of the tag.
 ///
 /// For bare tags (e.g. `production`), this always returns `(Some(...), None)`.
-pub fn name_and_value(&self) -> (Option<&str>, Option<&str>) {
+pub fn name_and_value(&self) -> (&'a str, Option<&'a str>) {
```
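A standalone sketch of how the revised signature behaves, written as a free function over the raw tag text (the real method operates on the tag's inner string via `self.0`):

```rust
// Sketch of the revised `name_and_value` contract: `name:value` tags yield
// `(name, Some(value))`, while bare tags yield `(tag, None)` -- the name is
// always present, so it no longer needs to be wrapped in `Option`.
fn name_and_value(tag: &str) -> (&str, Option<&str>) {
    match tag.split_once(':') {
        Some((name, value)) => (name, Some(value)),
        None => (tag, None),
    }
}
```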
👍🏻

Member

tobz commented Nov 19, 2024

  1. We test metric metadata in the deserializer (parser) tests, but there are fields somewhat outside the scope of the parser (imo) like process_id, origin, etc. This means we're either making up data to fill the tests or duplicating logic from the source.

I agree that the parser really shouldn't care except for knowing how to adjust the raw input payload in order to exercise those codepaths. I do think the tests are valuable overall: for example, we have all the various permutations because of a real bug I caught where I broke how optional fields are parsed which didn't show up unless the optional fields were in a certain order.

I'm ambivalent to where you place the tests, but it does feel like we need to keep them in spirit to avoid regressing the codec.

  2. Origin is an optional field in the metadata but we set it unconditionally (this seems like it was for convenience in tests)

I'm guessing I made this optional trying to futureproof for a scenario where we weren't dealing with Agent metrics. I'm fine with making it required until that need actually arises.

  3. A few metadata fields were using empty strings to signal None (maybe an optimization? It can be put back).

This was a hack to have Option<T> semantics without the size increase of actually having the fields be Option<MetaString>. I'm not strongly tied to having accessors that return Option<&str>, though, so fine to eliminate these for now.
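The size tradeoff here can be sketched with a hypothetical inline string type (the real `MetaString` layout may differ): a type that stores bytes in place uses every bit pattern, so `Option` around it needs an extra discriminant byte, whereas `&str`'s non-null pointer gives `Option<&str>` a free niche. The empty-string sentinel keeps the field at its bare size while the accessor still presents `Option<&str>` semantics:

```rust
use std::mem::size_of;

// Hypothetical stand-in for an inline small-string type: all bit patterns
// are valid values, so `Option<InlineString>` must grow to hold a tag.
#[allow(dead_code)]
struct InlineString {
    len: u8,
    buf: [u8; 23],
}

// Empty-string-as-None hack: the field stays at its bare size, and the
// accessor recovers `Option<&str>` semantics.
struct Metadata {
    container_id: String, // "" means "not set"
}

impl Metadata {
    fn container_id(&self) -> Option<&str> {
        if self.container_id.is_empty() {
            None
        } else {
            Some(&self.container_id)
        }
    }
}
```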

  4. Some of the builder-style APIs on metadata (e.g. set_container_id) very effectively hid places where we should have been interning strings via Into<MetaString>, which is a bit of a performance footgun (arguably correctness for bounding as well).
  5. I probably overcorrected with pub fields, but it was the easiest way to get this working for now.

This is a good point. My brain personally enjoys builder APIs, but I'm not strongly tied to them. A middleground could be that we just make them more restricted by requiring MetaString instead of transparently converting... but again, I'm not strongly tied to the builder API.

My inclination would be to have a builder API if we do end up with a good number of optional fields to avoid massive, multi-line struct initializers... that's it, really.

Overall, this approach seems fine to me. 👍🏻

@lukesteensen lukesteensen force-pushed the intern-metadata-strings branch from 6bf270a to 92668d0 Compare November 19, 2024 18:21
@github-actions github-actions bot added the area/ci CI/CD, automated testing, etc. label Nov 20, 2024
@lukesteensen
Contributor Author

@tobz

I do think the tests are valuable overall: for example, we have all the various permutations because of a real bug I caught where I broke how optional fields are parsed which didn't show up unless the optional fields were in a certain order.

I'm ambivalent to where you place the tests, but it does feel like we need to keep them in spirit to avoid regressing the codec.

Strongly agree! I wouldn't get rid of any of the tests; I was just trying to figure out a good way to split out the dirty external stuff (i.e. connection address) from the pure parsing concerns and organize the tests accordingly. But that's something we can maybe follow up on.

I'm guessing I made this optional trying to futureproof for a scenario where we weren't dealing with Agent metrics. I'm fine with making it required until that need actually arises.

Not a big deal, this makes sense.

This was a hack to have Option<T> semantics without the size increase of actually having the fields be Option<MetaString>. I'm not strongly tied to having accessors that return Option<&str>, though, so fine to eliminate these for now.

Noticed this regression and backed out the change.

This is a good point. My brain personally enjoys builder APIs, but I'm not strongly tied to them. A middleground could be that we just make them more restricted by requiring MetaString instead of transparently converting them... but again, I'm not strongly tied to the builder API.

My inclination would be to have a builder API if we do end up with a good number of optional fields to avoid massive, multi-line struct initializers... that's it, really.

I'm good with that too, I just think the current implementation via setters/getters ends up encouraging/allowing mutation in a way that's limiting for sharing metadata, for example. Just having separate builder structs would fix that.
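The "separate builder structs" idea might look like this (all names are hypothetical): mutation is confined to the builder, and `build` freezes the result into an immutable value that can be shared cheaply behind `Arc`, with no setters left to mutate it afterward.

```rust
use std::sync::Arc;

// Hypothetical sketch: all mutation lives on the builder...
#[derive(Default)]
struct MetadataBuilder {
    hostname: Option<String>,
    container_id: Option<String>,
}

// ...and the finished metadata has no setters, so a shared copy can't be
// mutated out from under other holders.
#[derive(Debug)]
struct Metadata {
    hostname: Option<String>,
    container_id: Option<String>,
}

impl MetadataBuilder {
    fn hostname(mut self, hostname: String) -> Self {
        self.hostname = Some(hostname);
        self
    }

    fn container_id(mut self, id: String) -> Self {
        self.container_id = Some(id);
        self
    }

    fn build(self) -> Arc<Metadata> {
        Arc::new(Metadata {
            hostname: self.hostname,
            container_id: self.container_id,
        })
    }
}
```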

Member

@tobz tobz left a comment


Looks good to me.

I think what we might want to consider in the future is simply changing the DSD codec tests to deal with MetricPacket<'a> instead of Metric when parsing.

As nice as it would be to just have codecs only return Event/Event-compatible types, that's clearly not the way to approach things when we need to obsess over performance/efficiency, so we can probably afford to be more sparse in the tests just to avoid polluting them with too much setup/boilerplate.

Nothing you need to do here unless you really want to.

@tobz tobz changed the title rearrange things to make sure we intern metadata strings [APR-248] rearrange things to make sure we intern metadata strings Dec 3, 2024
@lukesteensen lukesteensen merged commit ae38092 into main Dec 3, 2024
18 checks passed
@tobz tobz deleted the intern-metadata-strings branch December 13, 2024 01:20