ValueMap interface change #2117

fraillt · 2024-09-14T14:59:45Z

Changes

The absolute minimum of changes to ValueMap needed to provide an interface that can be applied to all kinds of metrics.
I've tried to make this revision as small as possible (leaving out optimization opportunities and bugs that I noticed along the way) so that it is easier to review.
The benefit of this new interface is that it can be applied efficiently to all histograms. #2114, which uses the same interface, is proof that it can be applied elegantly to ExpoHistogram as well.

A few more notes that might be helpful for reviewers:

Histogram might be a bit harder to review because it required the most changes, but essentially the configuration is extracted into a new BucketsConfig type and Aggregator is implemented for Mutex<Buckets<T>>; the rest is basically compiler-driven development.
LastValue, Sum, and PrecomputedSum should be trivial to review: I just had to implement the Aggregator interface for the Increment and Assign functionality.
Changes to AtomicTracker and AtomicallyUpdate had to be made as well, either to get rid of "unused code" warnings or because Increment and Assign required them. These interfaces also got simpler, as they no longer contain histogram-only functionality.
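
To make the shape of this change easier to picture, here is a minimal sketch of one aggregator interface with Increment-style and Assign-style implementations. The trait methods (`create`, `update`, `value`), the `record_all` helper, and the use of `AtomicU64` are assumptions made for illustration; the actual definitions are the ones in the diff.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical, simplified aggregator interface (not the SDK's actual trait).
trait Aggregator {
    fn create() -> Self;
    fn update(&self, value: u64);
    fn value(&self) -> u64;
}

/// Adds every measurement to the stored value (Sum-like behaviour).
struct Increment(AtomicU64);

/// Overwrites the stored value with the latest measurement (LastValue-like behaviour).
struct Assign(AtomicU64);

impl Aggregator for Increment {
    fn create() -> Self {
        Increment(AtomicU64::new(0))
    }
    fn update(&self, value: u64) {
        self.0.fetch_add(value, Ordering::Relaxed);
    }
    fn value(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

impl Aggregator for Assign {
    fn create() -> Self {
        Assign(AtomicU64::new(0))
    }
    fn update(&self, value: u64) {
        self.0.store(value, Ordering::Relaxed);
    }
    fn value(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Generic caller: the same code path drives any aggregator.
fn record_all<A: Aggregator>(measurements: &[u64]) -> u64 {
    let aggregator = A::create();
    for &m in measurements {
        aggregator.update(m);
    }
    aggregator.value()
}
```

With this shape, `record_all::<Increment>(&[1, 2, 3])` accumulates the measurements while `record_all::<Assign>(&[1, 2, 3])` keeps only the last one, and the calling code is identical in both cases.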

I really want to unify common/important code so all metrics could benefit from it.
And I really hope this revision is small enough to review efficiently :) Happy reviewing :)

Merge requirement checklist

CONTRIBUTING guidelines followed
Unit tests added/updated (if applicable)
Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
Changes in public API reviewed (if applicable)

codecov · 2024-09-14T15:03:03Z

Codecov Report

Attention: Patch coverage is 97.75281% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.4%. Comparing base (e1860c7) to head (133f317).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
opentelemetry-sdk/src/metrics/internal/mod.rs	93.7%	2 Missing ⚠️

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #2117   +/-   ##
=====================================
  Coverage   79.3%   79.4%           
=====================================
  Files        121     121           
  Lines      20968   20968           
=====================================
+ Hits       16646   16660   +14     
+ Misses      4322    4308   -14


opentelemetry-sdk/src/metrics/internal/histogram.rs

opentelemetry-sdk/src/metrics/internal/sum.rs

fraillt · 2024-09-18T10:28:17Z

Thanks for the review!
Here are the stress test results (I ran them for a while to let them stabilize):

main branch - metrics_counter ~4.8M iter/sec

Throughput: 4,784,400 iterations/sec
Throughput: 4,781,200 iterations/sec
Throughput: 4,814,800 iterations/sec
Throughput: 4,815,000 iterations/sec
Throughput: 4,829,400 iterations/sec
Throughput: 4,833,400 iterations/sec
Throughput: 4,827,400 iterations/sec
Throughput: 4,836,600 iterations/sec
Throughput: 4,838,600 iterations/sec
Throughput: 4,822,400 iterations/sec
Throughput: 4,830,000 iterations/sec
Throughput: 4,832,400 iterations/sec

this branch - metrics_counter ~5.2M iter/sec

Throughput: 5,175,800 iterations/sec
Throughput: 5,176,200 iterations/sec
Throughput: 5,176,800 iterations/sec
Throughput: 5,182,800 iterations/sec
Throughput: 5,189,200 iterations/sec
Throughput: 5,193,600 iterations/sec
Throughput: 5,195,800 iterations/sec
Throughput: 5,189,400 iterations/sec
Throughput: 5,188,200 iterations/sec
Throughput: 5,184,800 iterations/sec
Throughput: 5,178,200 iterations/sec
Throughput: 5,184,800 iterations/sec

main branch - metrics_histogram ~4.8M iter/sec

Throughput: 4,726,400 iterations/sec
Throughput: 4,754,200 iterations/sec
Throughput: 4,773,600 iterations/sec
Throughput: 4,790,800 iterations/sec
Throughput: 4,789,200 iterations/sec
Throughput: 4,802,400 iterations/sec
Throughput: 4,788,200 iterations/sec
Throughput: 4,797,600 iterations/sec
Throughput: 4,794,800 iterations/sec
Throughput: 4,784,400 iterations/sec
Throughput: 4,786,200 iterations/sec
Throughput: 4,798,400 iterations/sec
Throughput: 4,780,000 iterations/sec

this branch - metrics_histogram ~4.7M iter/sec

Throughput: 4,820,400 iterations/sec
Throughput: 4,676,000 iterations/sec
Throughput: 4,718,800 iterations/sec
Throughput: 4,749,200 iterations/sec
Throughput: 4,730,000 iterations/sec
Throughput: 4,741,800 iterations/sec
Throughput: 4,744,000 iterations/sec
Throughput: 4,744,800 iterations/sec
Throughput: 4,739,000 iterations/sec
Throughput: 4,709,000 iterations/sec
Throughput: 4,741,200 iterations/sec

In general the results seem to fluctuate between stress test runs, and I'm not sure why. I didn't look too deeply, but there are at least two sources of randomness (1. generating attribute sets, 2. initializing the HashMap), which might be the cause.
I ran these tests multiple times, and my general feeling is that histogram performance is probably the same; I didn't notice a difference.
But this branch really is faster with metrics_counter, no matter how many times I measure.

fraillt · 2024-09-18T10:39:41Z

Removed a redundant check from histogram.rs and exponential_histogram.rs.
I think it's not possible to get infinity or NaN when converting from an integer.

let f_value = measurement.into_float();
if f_value.is_infinite() || f_value.is_nan() {
    return;
}

fraillt · 2024-09-18T15:31:33Z

I was wrong regarding f64 NaN and infinity: the actual type can be f64, and the user can pass in anything they like :)
It also looks like this check is missing for histograms on the main branch, so I guess this might be considered a bug fix?
I also added a histogram test for this specific case.
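
A small sketch of why the check is needed only for floating-point input. The `ToFloat` trait and `is_recordable` function are hypothetical stand-ins for the generic measurement type used in the snippet above: integer conversions always yield a finite f64, while an f64 measurement is passed through unchanged and can therefore be NaN or infinite.

```rust
/// Hypothetical stand-in for the generic measurement type in the snippet above.
trait ToFloat: Copy {
    fn into_float(self) -> f64;
}

impl ToFloat for u64 {
    fn into_float(self) -> f64 {
        self as f64
    }
}

impl ToFloat for i64 {
    fn into_float(self) -> f64 {
        self as f64
    }
}

impl ToFloat for f64 {
    fn into_float(self) -> f64 {
        self
    }
}

/// Mirrors the guard from the snippet: reject values that cannot be bucketed.
fn is_recordable<T: ToFloat>(measurement: T) -> bool {
    let f_value = measurement.into_float();
    !(f_value.is_infinite() || f_value.is_nan())
}
```

Even the extreme integer values (`u64::MAX`, `i64::MIN`) convert to finite floats, so the guard can only ever trigger when the instrument's value type is f64.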

cijothomas · 2024-09-20T01:38:49Z

Thanks for the review! Here are the stress test results (I ran them for a while to let them stabilize):

main branch - metrics_counter ~4.8M iter/sec
this branch - metrics_counter ~5.2M iter/sec
main branch - metrics_histogram ~4.8M iter/sec
this branch - metrics_histogram ~4.7M iter/sec

In general the results seem to fluctuate between stress test runs, and I'm not sure why. I didn't look too deeply, but there are at least two sources of randomness (1. generating attribute sets, 2. initializing the HashMap), which might be the cause. I ran these tests multiple times, and my general feeling is that histogram performance is probably the same; I didn't notice a difference. But this branch really is faster with metrics_counter, no matter how many times I measure.

I am seeing different results. For metrics_histogram, my stress test throughput drops significantly with this PR branch!! The benchmarks are not much different. This usually indicates that some contention is being introduced. Maybe #2117 (comment) is the reason? Could you move the index calculation completely outside the lock?

fraillt · 2024-09-20T09:29:17Z

Thanks for measuring it and sharing your results!
I did more stress tests, and indeed it looks like histograms are a bit slower.
Anyway, I finally understood where the issue was. There are two contention points: one for acquiring the specific aggregator for an attribute set (RwLock) and another for updating it (Mutex).
So I updated the Aggregator interface to allow precomputing a value, which should reduce contention for histograms.
In terms of contention it should now be exactly the same as on the main branch, so performance should be identical (even though on the main branch there are more function calls between measurement and update, and Counter metrics have an extra unused parameter, but this should be optimized away anyway).
I tried measuring with the stress test, but it really is hard to say: for Counter metrics it sometimes stabilizes at 5.2M iter/s and other times at 4.1M iter/s (and I closed all the windows on my Ubuntu machine to reduce noise).
My gut feeling is that it should be the same (if the optimizer is smart enough to optimize the main branch), but you're welcome to test; maybe you'll find something more :)
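
The two contention points and the precomputation step described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `Histogram` and `Buckets` types, the `String` attribute key, and the method names are all assumptions. The point is only that the bucket index is computed before either lock is taken, so the Mutex is held for a single increment.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical per-attribute-set state.
struct Buckets {
    counts: Vec<u64>,
}

/// Hypothetical histogram keyed by a plain string instead of an attribute set.
struct Histogram {
    bounds: Vec<f64>,
    trackers: RwLock<HashMap<String, Arc<Mutex<Buckets>>>>,
}

impl Histogram {
    fn new(bounds: Vec<f64>) -> Self {
        Histogram { bounds, trackers: RwLock::new(HashMap::new()) }
    }

    /// Pure computation: index of the first bound that is >= value.
    fn bucket_index(&self, value: f64) -> usize {
        self.bounds.partition_point(|&bound| bound < value)
    }

    fn measure(&self, attrs: &str, value: f64) {
        // Precompute outside of any lock.
        let index = self.bucket_index(value);

        // Contention point 1: look up the aggregator under the read lock.
        // The read guard is a temporary and is released at the end of this statement.
        let existing = self.trackers.read().unwrap().get(attrs).cloned();
        let tracker = match existing {
            Some(tracker) => tracker,
            None => {
                // Slow path: first measurement for this attribute set.
                let mut map = self.trackers.write().unwrap();
                let bucket_count = self.bounds.len() + 1;
                let slot = map
                    .entry(attrs.to_owned())
                    .or_insert_with(|| Arc::new(Mutex::new(Buckets { counts: vec![0; bucket_count] })));
                Arc::clone(slot)
            }
        };

        // Contention point 2: the mutex is held only for the increment itself.
        let mut buckets = tracker.lock().unwrap();
        buckets.counts[index] += 1;
    }

    /// Copy of the bucket counts for one attribute set, if it exists.
    fn counts(&self, attrs: &str) -> Option<Vec<u64>> {
        let map = self.trackers.read().unwrap();
        let tracker = map.get(attrs)?;
        let buckets = tracker.lock().unwrap();
        Some(buckets.counts.clone())
    }
}
```

If `bucket_index` were called after `tracker.lock()`, every thread updating the same attribute set would serialize on the search over `bounds` as well as on the increment.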

Regarding performance in general, I see at least a few performance killers with fixes that would be easy to implement (and I would love to contribute):

Change the hashing function (maybe using the rustc_hash crate). I really don't see a reason to have a cryptographically secure hashing function for attribute sets.
Wrap attributes in some sort of HashOnce<T> wrapper that would preserve the hashing result.
Implement some sort of sharding to reduce contention in multithreaded environments; even something trivial like this might bring big improvements.

let hash = attribs.get_hash();
// The shard count (8) is a trade-off between more memory and less contention.
let trackers = &self.trackers[(hash % 8) as usize];
// do the usual stuff: read_lock, get, write_lock, etc.
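
A slightly fuller sketch of the sharding idea, under stated assumptions: `ShardedCounters`, the `String` key, the `AtomicU64` value, and the use of the standard library's `DefaultHasher` are all illustrative choices, not the SDK's types. Each shard has its own RwLock, so writers that hash to different shards do not block each other.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

/// Shard count: a trade-off between more memory and less contention.
const SHARDS: usize = 8;

/// Hypothetical sharded map from a string key to a counter.
struct ShardedCounters {
    shards: [RwLock<HashMap<String, AtomicU64>>; SHARDS],
}

impl ShardedCounters {
    fn new() -> Self {
        ShardedCounters {
            shards: std::array::from_fn(|_| RwLock::new(HashMap::new())),
        }
    }

    /// Pick a shard from the key's hash.
    fn shard_for(key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() % SHARDS as u64) as usize
    }

    fn add(&self, key: &str, delta: u64) {
        let shard = &self.shards[Self::shard_for(key)];
        {
            // Fast path: the key already exists, so a read lock is enough.
            let map = shard.read().unwrap();
            if let Some(counter) = map.get(key) {
                counter.fetch_add(delta, Ordering::Relaxed);
                return;
            }
        } // read lock released here
        // Slow path: take the write lock on this one shard only.
        shard
            .write()
            .unwrap()
            .entry(key.to_owned())
            .or_insert_with(|| AtomicU64::new(0))
            .fetch_add(delta, Ordering::Relaxed);
    }

    fn get(&self, key: &str) -> u64 {
        let map = self.shards[Self::shard_for(key)].read().unwrap();
        map.get(key).map(|c| c.load(Ordering::Relaxed)).unwrap_or(0)
    }

    /// Collection has to visit every shard.
    fn total(&self) -> u64 {
        let mut total = 0;
        for shard in &self.shards {
            let map = shard.read().unwrap();
            for counter in map.values() {
                total += counter.load(Ordering::Relaxed);
            }
        }
        total
    }
}
```

The cost shows up on the collection side, where every shard must be locked and merged, which is one reason the shard count is a tuning knob rather than "as many as possible".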

The collection phase needs some love too... but that is another story :)

cijothomas · 2024-09-27T16:34:29Z

Regarding performance in general, I see at least a few performance killers with fixes that would be easy to implement (and I would love to contribute):

Change the hashing function (maybe using the rustc_hash crate). I really don't see a reason to have a cryptographically secure hashing function for attribute sets.
Wrap attributes in some sort of HashOnce wrapper that would preserve the hashing result.
Implement some sort of sharding to reduce contention in multithreaded environments; even something trivial like this might bring big improvements.

Can you open separate issues so these can be tracked separately?

Hashing - if there is a faster hash, we should explore it. If it makes it possible to trigger collisions with controlled input, then it should be behind a feature flag, so users can opt in for higher perf.
I am not sure how to achieve that. In the hot path, the hash is calculated only once today. Are you referring to optimizing the non-hot path?
Agree that the existing contention has room for improvement, but sharding is generally hard to implement. Here's a link to a previous attempt: "Adding two level hashing in metrics hashmap" #1564, which was abandoned due to challenges getting it correct, and because we had already reduced contention significantly with other changes. Happy to explore this further! (For comparison, OTel Rust's throughput is far lower than OTel .NET's, though the latency is the same, indicating that OTel Rust has more contention. The key reason is that .NET has a built-in ConcurrentDictionary, whereas implementing such a thing in Rust would likely need unsafe code and/or rely on external crates. By default, we'd like to avoid unsafe and external crates, but we are very open to adding them behind opt-in feature flags. The linked PR shows that simply replacing the hash with hashbrown boosted perf.)
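
For reference while discussing the second point, one possible reading of the HashOnce idea is sketched below. Everything here is an assumption for illustration (the `HashOnce` struct, `cached_hash`, and the `count_occurrences` helper are not SDK types): the wrapper hashes its value once at construction and feeds only the cached 64-bit result to any later hasher, so repeated map lookups with the same key do not re-hash the full contents.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper that computes the hash of `value` once and caches it.
#[derive(Debug, Clone, PartialEq, Eq)]
struct HashOnce<T> {
    value: T,
    hash: u64,
}

impl<T: Hash> HashOnce<T> {
    fn new(value: T) -> Self {
        // DefaultHasher::new() uses fixed keys, so equal values get equal cached hashes.
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let hash = hasher.finish();
        HashOnce { value, hash }
    }

    fn cached_hash(&self) -> u64 {
        self.hash
    }
}

impl<T> Hash for HashOnce<T> {
    /// Feed only the cached hash to the map's hasher instead of re-hashing `value`.
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.hash);
    }
}

/// Example use as a map key: count how often each wrapped key occurs.
fn count_occurrences<T: Hash + Eq>(keys: Vec<HashOnce<T>>) -> HashMap<HashOnce<T>, usize> {
    let mut counts = HashMap::new();
    for key in keys {
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
}
```

Whether this helps depends on how often the same attribute set is hashed more than once per measurement, which is exactly the question raised above about the hot path.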

fraillt · 2024-09-29T18:52:45Z

Sure, I'll definitely create separate tickets for these optimizations, but before I do that, I want to make sure ValueMap can be applied to all metrics (measure and collection phases). This is important for at least a few reasons:

It makes sure that optimizations can be applied to all metrics (this revision is good proof that this is possible).
It is easier to review and iterate/experiment, as the optimization code will be localized to one type (ValueMap) instead of 5+ places (all metrics).

I have one PR (this one) and two issues that I want to implement before experimenting with optimizations, so I don't want to create an issue now, because I won't be able to work on it anyway.

linux-foundation-easycla · 2024-09-29T19:28:25Z

The committers listed above are authorized under a signed CLA.

✅ login: cijothomas / name: Cijo Thomas (1f54f90, 133f317)
✅ login: fraillt / name: Mindaugas Vinkelis (606d126)
✅ login: lalitb / name: Lalit Kumar Bhasin (5ff18ca, ffe0d4a)

cijothomas · 2024-10-03T17:03:37Z

Just ran perf tests on my box:

Benchmarks for counter and histogram show no difference in perf (within noise level).
Stress tests:
counter 12.4 (main) --> 13.2 (PR)
histogram 12.7 (main) --> 8.1 (PR)

lalitb · 2024-10-08T06:49:49Z

On my dev machine:

counter: main=11.7 pr=13.4
histogram: main=12.7 pr=8.1

fraillt · 2024-10-08T18:10:40Z

histogram: main=12.7 pr=8.1

This is an insane difference... I wonder why it is so different on your platform compared to mine.
I wrote a script, run_stress.sh:

#!/usr/bin/env bash
for i in $(seq 1 10);
do
  git switch -
  timeout 60s cargo run --release --package stress --bin metrics_histogram
done

Then I closed all applications (to avoid any external noise) and went for a coffee for 10 minutes :))

Here are the results:

mv@mv-t14:~/Projects/opentelemetry-rust$ ./run_stress.sh
Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished `release` profile [optimized] target(s) in 3.02s
Running `target/release/metrics_histogram`
Number of threads: 12

Throughput: 5,077,200 iterations/sec

Throughput: 5,085,400 iterations/sec

Throughput: 5,081,000 iterations/sec

Throughput: 5,081,200 iterations/sec

Throughput: 4,509,600 iterations/sec

Throughput: 4,172,000 iterations/sec

Throughput: 4,189,000 iterations/sec

Throughput: 4,204,800 iterations/sec

Throughput: 4,220,800 iterations/sec

Throughput: 4,221,200 iterations/sec

Throughput: 4,230,400 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 3.93s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,423,000 iterations/sec

Throughput: 4,401,000 iterations/sec

Throughput: 4,396,800 iterations/sec

Throughput: 4,409,800 iterations/sec

Throughput: 4,406,000 iterations/sec

Throughput: 4,432,800 iterations/sec

Throughput: 4,409,600 iterations/sec

Throughput: 4,451,600 iterations/sec

Throughput: 4,396,800 iterations/sec

Throughput: 4,402,200 iterations/sec

Throughput: 4,395,600 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.09s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,510,000 iterations/sec

Throughput: 4,483,200 iterations/sec

Throughput: 4,492,400 iterations/sec

Throughput: 4,488,400 iterations/sec

Throughput: 4,490,800 iterations/sec

Throughput: 4,489,800 iterations/sec

Throughput: 4,499,200 iterations/sec

Throughput: 4,508,800 iterations/sec

Throughput: 4,518,000 iterations/sec

Throughput: 4,510,200 iterations/sec

Throughput: 4,514,000 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 3.91s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,811,400 iterations/sec

Throughput: 4,780,400 iterations/sec

Throughput: 4,790,800 iterations/sec

Throughput: 4,782,800 iterations/sec

Throughput: 4,781,200 iterations/sec

Throughput: 4,754,600 iterations/sec

Throughput: 4,777,600 iterations/sec

Throughput: 4,785,800 iterations/sec

Throughput: 4,772,800 iterations/sec

Throughput: 4,786,000 iterations/sec

Throughput: 4,735,000 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.02s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,493,200 iterations/sec

Throughput: 4,463,600 iterations/sec

Throughput: 4,472,000 iterations/sec

Throughput: 4,466,200 iterations/sec

Throughput: 4,470,000 iterations/sec

Throughput: 4,485,600 iterations/sec

Throughput: 4,482,200 iterations/sec

Throughput: 4,469,600 iterations/sec

Throughput: 4,498,200 iterations/sec

Throughput: 4,509,000 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 3.90s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 5,150,600 iterations/sec

Throughput: 5,133,000 iterations/sec

Throughput: 5,132,400 iterations/sec

Throughput: 5,138,200 iterations/sec

Throughput: 5,127,600 iterations/sec

Throughput: 5,133,200 iterations/sec

Throughput: 5,133,800 iterations/sec

Throughput: 5,127,600 iterations/sec

Throughput: 5,138,600 iterations/sec

Throughput: 5,140,000 iterations/sec

Throughput: 5,147,000 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.02s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,757,800 iterations/sec

Throughput: 4,736,200 iterations/sec

Throughput: 4,725,000 iterations/sec

Throughput: 4,736,400 iterations/sec

Throughput: 4,737,200 iterations/sec

Throughput: 4,737,400 iterations/sec

Throughput: 4,744,000 iterations/sec

Throughput: 4,702,000 iterations/sec

Throughput: 4,732,800 iterations/sec

Throughput: 4,737,800 iterations/sec

Throughput: 4,736,200 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 3.90s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,414,800 iterations/sec

Throughput: 4,410,400 iterations/sec

Throughput: 4,375,800 iterations/sec

Throughput: 4,382,800 iterations/sec

Throughput: 4,379,200 iterations/sec

Throughput: 4,383,000 iterations/sec

Throughput: 4,422,800 iterations/sec

Throughput: 4,405,200 iterations/sec

Throughput: 4,385,000 iterations/sec

Throughput: 4,381,800 iterations/sec

Throughput: 4,378,800 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.04s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,756,200 iterations/sec

Throughput: 4,727,400 iterations/sec

Throughput: 4,734,200 iterations/sec

Throughput: 4,729,000 iterations/sec

Throughput: 4,732,200 iterations/sec

Throughput: 4,732,000 iterations/sec

Throughput: 4,735,400 iterations/sec

Throughput: 4,736,000 iterations/sec

Throughput: 4,720,200 iterations/sec

Throughput: 4,721,600 iterations/sec

Throughput: 4,736,000 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Compiling opentelemetry_sdk v0.26.0 (/home/mv/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/home/mv/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 3.90s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 4,688,600 iterations/sec

Throughput: 4,658,400 iterations/sec

Throughput: 4,676,400 iterations/sec

Throughput: 4,695,200 iterations/sec

Throughput: 4,672,600 iterations/sec

Throughput: 4,656,400 iterations/sec

Throughput: 4,657,000 iterations/sec

Throughput: 4,659,400 iterations/sec

Throughput: 4,660,000 iterations/sec

Throughput: 4,656,000 iterations/sec

Throughput: 4,674,400 iterations/sec

I would say that the results are very similar: sometimes my branch is faster, sometimes main. Maybe main is a bit faster on average, but nothing like 12.7 vs 8.1...
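
To compare runs like the ones above by more than eyeballing, the per-run mean can be computed from the captured output. This is a small helper sketch, not part of the PR: the function names are made up, and it assumes the `Throughput: N iterations/sec` line format shown in the logs, with commas as thousands separators.

```rust
/// Extract the number from a line like "Throughput: 4,784,400 iterations/sec".
/// Returns None for any other line (compiler output, blank lines, etc.).
fn parse_throughput(line: &str) -> Option<u64> {
    let rest = line.trim().strip_prefix("Throughput:")?;
    let number = rest.split_whitespace().next()?;
    number.replace(',', "").parse().ok()
}

/// Mean throughput over all "Throughput:" lines in a captured log, if any.
fn mean_throughput(log: &str) -> Option<f64> {
    let values: Vec<u64> = log.lines().filter_map(parse_throughput).collect();
    if values.is_empty() {
        return None;
    }
    let sum: u64 = values.iter().sum();
    Some(sum as f64 / values.len() as f64)
}
```

Feeding each branch's section of the log to `mean_throughput` gives one number per run, which makes it easier to see whether the spread between runs of the same branch is larger than the difference between branches.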

fraillt · 2024-10-08T20:09:34Z

@cijothomas and @lalitb thanks for sharing your results!

I updated the code to make it look as close as possible to what it was before (preserving existing bugs).
Unfortunately I cannot reproduce these results (see the comment above), and I'm really confused about why there is a slowdown of roughly 36% (12.7 vs 8.1) for histograms :/
Maybe you have some insights?
Could you run these tests again (even though I didn't change much; I mostly moved code around and restored a few minor bugs)?

fraillt · 2024-10-11T14:47:28Z

I just ran the tests on another computer.
Stress tests (M iterations/sec):
counter 12.7 (main) --> 18.3 (PR)
histogram 12.6 (main) --> 11.4 (PR)

Here's the script that I used:

#!/usr/bin/env bash
echo "CPU $(cat /proc/cpuinfo | grep 'name' | uniq)"
for i in $(seq 1 2);
do
  echo "Current branch: $(git branch --show-current)"
  echo "stress-test metrics_histogram"
  timeout 30s cargo run --release --package stress --bin metrics_histogram
  echo "stress-test metrics"
  timeout 30s cargo run --release --package stress --bin metrics
  git switch -
done

Here's the full console output

fraillt@Fraillt-PC:~/HostProjects/opentelemetry-rust$ ./run_tests.sh
CPU model name : AMD Ryzen 5 3600 6-Core Processor
Current branch: main
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 7.95s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 12,594,800 iterations/sec

Throughput: 12,642,800 iterations/sec

Throughput: 12,645,600 iterations/sec

Throughput: 12,571,800 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.21s
Running target/release/metrics
Number of threads: 12

Throughput: 13,018,000 iterations/sec

Throughput: 13,011,600 iterations/sec

Throughput: 12,922,800 iterations/sec

Throughput: 12,231,200 iterations/sec

Throughput: 11,879,200 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
Current branch: value-map-interface-change
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 8.37s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 11,383,000 iterations/sec

Throughput: 11,422,600 iterations/sec

Throughput: 11,432,800 iterations/sec

Throughput: 11,428,800 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.11s
Running target/release/metrics
Number of threads: 12

Throughput: 18,337,200 iterations/sec

Throughput: 18,394,400 iterations/sec

Throughput: 18,224,400 iterations/sec

Throughput: 18,356,200 iterations/sec

Throughput: 18,334,400 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
fraillt@Fraillt-PC:~/HostProjects/opentelemetry-rust$

fraillt · 2024-10-13T16:39:26Z

I think I know the reason for these results.
It looks like different compiler versions produce really different results.
I ran these tests with several different compiler versions, and here's the summary:

Rust version: 1.70

counter 17.3 (main) --> 12.7 (PR)
histogram 16.1 (main) --> 16.5 (PR)

Rust version: 1.72

counter 12.2 (main) --> 12.4 (PR)
histogram 12.0 (main) --> 12.0 (PR)

Rust version: 1.75

counter 12.9 (main) --> 12.4 (PR)
histogram 12.5 (main) --> 12.4 (PR)

Rust version: 1.78

counter 12.9 (main) --> 17.7 (PR)
histogram 12.5 (main) --> 11.3 (PR)

Rust version: 1.81

counter 12.5 (main) --> 18.0 (PR)
histogram 12.4 (main) --> 11.2 (PR)

script outcome

fraillt@Fraillt-PC:~/HostProjects/opentelemetry-rust$ ./run_tests.sh
CPU model name : AMD Ryzen 5 3600 6-Core Processor
info: using existing install for '1.70-x86_64-unknown-linux-gnu'
info: default toolchain set to '1.70-x86_64-unknown-linux-gnu'

1.70-x86_64-unknown-linux-gnu unchanged - rustc 1.70.0 (90c541806 2023-05-31)

Rust version: 1.70
Current branch: value-map-interface-change
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 10.72s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 16,570,200 iterations/sec
Throughput: 16,536,400 iterations/sec
Throughput: 16,557,400 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 5.39s
Running target/release/metrics
Number of threads: 12

Throughput: 12,781,200 iterations/sec
Throughput: 12,685,600 iterations/sec
Throughput: 12,754,200 iterations/sec
Throughput: 12,713,400 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Current branch: main
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 10.00s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 16,122,600 iterations/sec
Throughput: 16,096,600 iterations/sec
Throughput: 16,156,000 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 5.51s
Running target/release/metrics
Number of threads: 12

Throughput: 17,320,600 iterations/sec
Throughput: 17,350,600 iterations/sec
Throughput: 17,322,600 iterations/sec
Throughput: 17,328,600 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
info: using existing install for '1.72-x86_64-unknown-linux-gnu'
info: default toolchain set to '1.72-x86_64-unknown-linux-gnu'

1.72-x86_64-unknown-linux-gnu unchanged - rustc 1.72.1 (d5c2e9c34 2023-09-13)

Rust version: 1.72
Current branch: value-map-interface-change
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 10.46s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 12,094,400 iterations/sec
Throughput: 12,040,400 iterations/sec
Throughput: 12,075,600 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 5.01s
Running target/release/metrics
Number of threads: 12

Throughput: 12,431,200 iterations/sec
Throughput: 12,466,600 iterations/sec
Throughput: 12,448,400 iterations/sec
Throughput: 12,450,000 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Current branch: main
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 9.70s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 12,051,000 iterations/sec
Throughput: 11,986,400 iterations/sec
Throughput: 12,063,000 iterations/sec
Throughput: 12,068,000 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 5.12s
Running target/release/metrics
Number of threads: 12

Throughput: 12,259,200 iterations/sec
Throughput: 12,269,800 iterations/sec
Throughput: 12,273,200 iterations/sec
Throughput: 12,266,200 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
info: using existing install for '1.75-x86_64-unknown-linux-gnu'
info: default toolchain set to '1.75-x86_64-unknown-linux-gnu'

1.75-x86_64-unknown-linux-gnu unchanged - rustc 1.75.0 (82e1608df 2023-12-21)

Rust version: 1.75
Current branch: value-map-interface-change
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 9.93s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 12,312,000 iterations/sec
Throughput: 12,450,000 iterations/sec
Throughput: 12,433,400 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 4.80s
Running target/release/metrics
Number of threads: 12

Throughput: 12,505,000 iterations/sec
Throughput: 12,544,400 iterations/sec
Throughput: 12,517,000 iterations/sec
Throughput: 12,484,200 iterations/sec
Throughput: 12,327,400 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Current branch: main
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 9.27s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 12,621,400 iterations/sec
Throughput: 12,229,400 iterations/sec
Throughput: 12,538,400 iterations/sec
Throughput: 12,670,000 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release [optimized] target(s) in 4.74s
Running target/release/metrics
Number of threads: 12

Throughput: 12,907,600 iterations/sec
Throughput: 12,931,200 iterations/sec
Throughput: 12,910,000 iterations/sec
Throughput: 12,938,000 iterations/sec
Throughput: 12,957,200 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
info: using existing install for '1.78-x86_64-unknown-linux-gnu'
info: default toolchain set to '1.78-x86_64-unknown-linux-gnu'

1.78-x86_64-unknown-linux-gnu unchanged - rustc 1.78.0 (9b00956e5 2024-04-29)

Rust version: 1.78
Current branch: value-map-interface-change
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 9.19s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 11,197,800 iterations/sec
Throughput: 11,416,000 iterations/sec
Throughput: 11,394,800 iterations/sec
Throughput: 9,392,800 iterations/sec <--- open browser

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.96s
Running target/release/metrics
Number of threads: 12

Throughput: 17,282,200 iterations/sec
Throughput: 16,250,200 iterations/sec
Throughput: 17,872,600 iterations/sec
Throughput: 17,867,400 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Current branch: main
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 8.87s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 12,465,200 iterations/sec
Throughput: 12,610,600 iterations/sec
Throughput: 12,596,200 iterations/sec
Throughput: 12,080,200 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.79s
Running target/release/metrics
Number of threads: 12

Throughput: 12,887,200 iterations/sec
Throughput: 12,956,800 iterations/sec
Throughput: 12,624,000 iterations/sec
Throughput: 12,252,000 iterations/sec
Throughput: 12,381,400 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
info: using existing install for '1.81-x86_64-unknown-linux-gnu'
info: default toolchain set to '1.81-x86_64-unknown-linux-gnu'

1.81-x86_64-unknown-linux-gnu unchanged - rustc 1.81.0 (eeb90cda1 2024-09-04)

Rust version: 1.81
Current branch: value-map-interface-change
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 8.93s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 11,088,800 iterations/sec
Throughput: 11,227,200 iterations/sec
Throughput: 11,198,600 iterations/sec
Throughput: 11,189,600 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 4.64s
Running target/release/metrics
Number of threads: 12

Throughput: 18,036,200 iterations/sec
Throughput: 18,069,200 iterations/sec
Throughput: 18,066,200 iterations/sec
Throughput: 18,009,000 iterations/sec
Throughput: 18,037,600 iterations/sec

Switched to branch 'main'
Your branch is up to date with 'upstream/main'.
Current branch: main
stress-test metrics_histogram
Compiling opentelemetry_sdk v0.26.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/opentelemetry-sdk)
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 8.67s
Running target/release/metrics_histogram
Number of threads: 12

Throughput: 12,169,800 iterations/sec
Throughput: 12,409,200 iterations/sec
Throughput: 12,436,000 iterations/sec
Throughput: 11,985,000 iterations/sec

stress-test metrics
Compiling stress v0.1.0 (/mnt/c/Users/frail/Projects/opentelemetry-rust/stress)
Finished release profile [optimized] target(s) in 5.12s
Running target/release/metrics
Number of threads: 12

Throughput: 12,572,400 iterations/sec
Throughput: 12,559,000 iterations/sec
Throughput: 12,581,800 iterations/sec
Throughput: 12,460,600 iterations/sec

Switched to branch 'value-map-interface-change'
Your branch is up to date with 'origin/value-map-interface-change'.
fraillt@Fraillt-PC:~/HostProjects/opentelemetry-rust$
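
The per-run `Throughput:` lines in the log above are easier to compare as one mean per run. A minimal sketch for that, assuming the line format shown in the log; `mean_throughput` is a hypothetical helper and not part of the stress tool:

```rust
/// Hypothetical helper: averages every `Throughput: N iterations/sec` line
/// found in a captured stress-test log. Returns `None` if no line matches.
fn mean_throughput(log: &str) -> Option<f64> {
    let values: Vec<f64> = log
        .lines()
        // Keep only lines that start with the throughput prefix.
        .filter_map(|line| line.trim().strip_prefix("Throughput: "))
        // The number is the first whitespace-separated token after the prefix,
        // so trailing notes such as "<--- open browser" are ignored.
        .filter_map(|rest| rest.split_whitespace().next())
        // Drop the thousands separators before parsing.
        .filter_map(|number| number.replace(',', "").parse::<f64>().ok())
        .collect();

    if values.is_empty() {
        None
    } else {
        Some(values.iter().sum::<f64>() / values.len() as f64)
    }
}
```

A mean hides outliers such as the run annotated with "open browser", so the individual lines are still worth a look.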

So how should we proceed?

opentelemetry-sdk/src/metrics/internal/histogram.rs

lalitb · 2024-10-15T22:25:33Z

This is weird: I now see different results on the same machine; at least I don't see the degradation observed earlier with the histogram. I would wait for someone else to confirm too. @fraillt, can you also gather the stats without the script (running manually) for these two Rust versions, in both release and debug (default) mode?

CPU model name: AMD EPYC 7763 64-Core Processor, 8 cores
Memory: 62G

rustc version: 1.70.0

histogram:
main: ~10.3K (release), ~5.2K (debug)
PR: ~10.8K (release), ~5.2K (debug)

counter:
main: ~10.6K (release) , ~5.1K (debug)
PR: ~11.2K (release), ~5.1K (debug)

rustc version: 1.81

histogram:
main: ~10.5K (release), ~5.2K (debug)
PR: ~10.5K (release), ~5.2K (debug)

counter:
main: ~10.6K (release) , ~5.1K (debug)
PR: ~11.1K (release), ~5.1K (debug)

fraillt · 2024-10-16T07:05:38Z

CPU model name : AMD Ryzen 5 3600 6-Core Processor

rustc version: 1.70.0

histogram:
main: ~15.9 (release), ~3.3 (debug)
PR: ~16.3 (release), ~3.6 (debug)

counter:
main: ~17.3 (release) , ~4.1 (debug)
PR: ~12.6 (release), ~3.7 (debug)

rustc version: 1.81

histogram:
main: ~12.6 (release), ~3.1 (debug)
PR: ~11.6 (release), ~3.2 (debug)

counter:
main: ~12.3 (release) , ~3.4 (debug)
PR: ~18.2 (release), ~4.0 (debug)

Rust version 1.70 with counter metrics is super weird... :/
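
To put the release numbers above in relative terms, a small helper such as the following can be used. It is only a sketch (`relative_change_pct` is a hypothetical name, not something from the stress tool). Applied to the figures in this comment, it gives roughly -27% for the counter on 1.70 and roughly +48% on 1.81, while the histogram moves by roughly +2.5% and -7.9%.

```rust
/// Relative change of the PR branch versus `main`, in percent.
/// Positive means the PR branch is faster.
fn relative_change_pct(main: f64, pr: f64) -> f64 {
    (pr - main) / main * 100.0
}
```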

fraillt · 2024-10-16T07:19:23Z

Thanks @lalitb for your tests

This is weird: I now see different results on the same machine; at least I don't see the degradation observed earlier with the histogram.

I have mentioned that

I updated code by making it look as close as possible to what it was before (preserving existing bugs).

Basically, the only functional change is that I removed

        if f_value.is_infinite() || f_value.is_nan() {
            return;
        }

Which is probably ok, since histograms are unbounded anyway...
It's weird, because on my machine I didn't observe any difference (whether or not I check for infinity and NaN), but maybe your machine is different...

Maybe you can test this explicitly (by adding this check in fn measure) and see if this will cause performance degradation?
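
For anyone who wants to try this, here is a minimal sketch of what toggling the check could look like. The `Histogram` type, its fields, and the `skip_non_finite` flag are hypothetical stand-ins for illustration, not the SDK's actual code. In this sketch, without the check a NaN falls into the first bucket and positive infinity into the overflow bucket.

```rust
/// Hypothetical stand-in for the histogram measure path; not the SDK's code.
struct Histogram {
    /// Sorted upper bounds of the finite buckets.
    bounds: Vec<f64>,
    /// One counter per bound, plus a final overflow bucket.
    counts: Vec<u64>,
    /// Assumed switch so both variants can be compared in a stress run.
    skip_non_finite: bool,
}

impl Histogram {
    fn new(bounds: Vec<f64>, skip_non_finite: bool) -> Self {
        let counts = vec![0; bounds.len() + 1];
        Self { bounds, counts, skip_non_finite }
    }

    fn measure(&mut self, value: f64) {
        // The check under discussion: ignore NaN and infinity.
        if self.skip_non_finite && (value.is_infinite() || value.is_nan()) {
            return;
        }
        // Index of the first bound that is >= value; bounds.len() if none is.
        let index = self.bounds.partition_point(|&bound| bound < value);
        self.counts[index] += 1;
    }
}
```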

cijothomas · 2024-10-30T02:54:14Z

@fraillt @utpilla I think we can merge this PR, irrespective of the unexplainable stress test result variation. The changes are mostly internal, and we can keep optimizing this further.
I'll do another review, and mark approval today.

utpilla · 2024-10-30T20:15:06Z

opentelemetry-sdk/src/metrics/internal/mod.rs

+    /// Some aggregators can do some computations before updating aggregator.
+    /// This helps to reduce contention for aggregators because it makes
+    /// [`Aggregator::update`] as short as possible.
+    type PreComputedValue;


nit: PreComputedValue is not the best name, as no precomputation happens for counters and gauges.

I think we could use something like MeasurementData instead.

Suggested change

-    type PreComputedValue;
+    type MeasurementData;
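
To illustrate the naming point, here is a simplified sketch of an aggregator trait with such an associated type. It is an assumption-laden illustration, not the SDK's actual trait: the `Counter` and `Buckets` types and the `bucket_index` helper are hypothetical. The histogram-like type does real work before `update` (finding the bucket index without holding the lock), while the counter passes the raw value straight through, which is why a neutral name may fit both cases better.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Hypothetical, simplified shape of the trait under discussion.
trait Aggregator {
    /// Whatever `update` needs; computed before touching shared state.
    type PreComputedValue;

    /// Kept as short as possible to reduce contention.
    fn update(&self, value: Self::PreComputedValue);
}

/// Counter: nothing to precompute, the raw value is passed through.
struct Counter(AtomicU64);

impl Aggregator for Counter {
    type PreComputedValue = u64;

    fn update(&self, value: u64) {
        self.0.fetch_add(value, Ordering::Relaxed);
    }
}

/// Histogram-like aggregator: the bucket index is found outside the lock.
struct Buckets {
    bounds: Vec<f64>,
    counts: Mutex<Vec<u64>>,
}

impl Buckets {
    /// Lock-free step: `bounds` is immutable, so no synchronization is needed.
    fn bucket_index(&self, value: f64) -> usize {
        self.bounds.partition_point(|&bound| bound < value)
    }
}

impl Aggregator for Buckets {
    type PreComputedValue = usize;

    fn update(&self, index: usize) {
        // Only the increment happens while the mutex is held.
        let mut counts = self.counts.lock().unwrap();
        counts[index] += 1;
    }
}
```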

utpilla · 2024-10-30T20:30:25Z

opentelemetry-sdk/src/metrics/internal/histogram.rs

+        // Ignore NaN and infinity.
+        // Only makes sense if T is f64, maybe this could be no-op for other cases?
+        // TODO: uncomment once we know the reason for performance degradation
+        // if f.is_infinite() || f.is_nan() {


@cijothomas Let's add this check as well?

cijothomas

Thanks for patiently waiting!

fraillt requested a review from a team September 14, 2024 14:59

fraillt mentioned this pull request Sep 14, 2024

Unify Histogram and ExpHistogram aggregation #2114

Closed

4 tasks

utpilla reviewed Sep 17, 2024

opentelemetry-sdk/src/metrics/internal/histogram.rs (outdated)

utpilla reviewed Sep 17, 2024

opentelemetry-sdk/src/metrics/internal/sum.rs (outdated)

fraillt force-pushed the value-map-interface-change branch from 825d181 to 125c3fe Compare September 18, 2024 09:09

fraillt force-pushed the value-map-interface-change branch from 125c3fe to 9f87703 Compare September 18, 2024 10:37

fraillt force-pushed the value-map-interface-change branch from 9f87703 to 6968110 Compare September 18, 2024 15:28

fraillt force-pushed the value-map-interface-change branch from 6968110 to 43d1878 Compare September 20, 2024 08:30

fraillt requested a review from a team as a code owner September 20, 2024 08:30

fraillt mentioned this pull request Sep 25, 2024

Apply ValueMap for ExpoHistogram metric #2146

Open

fraillt force-pushed the value-map-interface-change branch 2 times, most recently from bd47a96 to bf24bca Compare September 29, 2024 18:59

fraillt force-pushed the value-map-interface-change branch from badd0c6 to c52c479 Compare September 29, 2024 19:29

fraillt force-pushed the value-map-interface-change branch from 713bb20 to a503fef Compare October 8, 2024 14:36

fraillt force-pushed the value-map-interface-change branch 2 times, most recently from a45d153 to c1b4d15 Compare October 8, 2024 20:03

fraillt force-pushed the value-map-interface-change branch from c1b4d15 to 6420f8d Compare October 8, 2024 20:19

ValueMap interface change

606d126

fraillt force-pushed the value-map-interface-change branch from 6420f8d to 606d126 Compare October 11, 2024 14:31

Merge branch 'main' into value-map-interface-change

5ff18ca

ThomsonTan reviewed Oct 15, 2024

opentelemetry-sdk/src/metrics/internal/histogram.rs

Merge branch 'main' into value-map-interface-change

1f54f90

fraillt mentioned this pull request Oct 25, 2024

Metrics collect stress test #2247

Open

4 tasks

Merge branch 'main' into value-map-interface-change

ffe0d4a

utpilla reviewed Oct 30, 2024

utpilla approved these changes Oct 30, 2024

Merge branch 'main' into value-map-interface-change

133f317

cijothomas approved these changes Nov 1, 2024

cijothomas merged commit 706a067 into open-telemetry:main Nov 1, 2024
23 of 25 checks passed

fraillt deleted the value-map-interface-change branch November 2, 2024 13:21
