-
Notifications
You must be signed in to change notification settings - Fork 82
240105 code cleanup #47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There seems to be no use for the numlist object
percentiles in CSV file were incorrectly divided by MILLION, resulting in mostly 0 values. Remove the divisor. Probably the feature was never used, otherwise it would have been noticed.
histogram methods were implemented as virtual functions, but since there is only one possible implementation this was overkill. Simplify the code by exposing the actual methods. The implementation still remains opaque. No functional changes. Tested with ./tcp_rr -c -H 127.0.0.1 -p1,2,10,50,90,999.9999,100 -A/tmp/x.csv -l 4 and verified that the csv file has the correct data. (histograms are only exercised in rr tests)
histograms store samples in buckets with pseudo logarithmic size. The previous implementation used a table of thresholds, and binary search to locate the correct bucket. This patch replaces the thresholds with the fast pseudo-logarithm algorithm used in lr-cstats and bpftrace so we can locate the bucket in a handful of instructions. This gives memory savings, reduced cache trashing, and better performance. Tests show that with a hot cache a lookup now takes less than 2us compared to 20-25us with the previous approach. Also, we can remove the now-useless neper_histo_factory. The actual resolution of the buckets is approximately the same as in the previous implementation (about 1.5%). On passing, correct a few bugs in the previous implementation: - resolution was supposed to be 0.25% but due to an implementation bug it was around 1% or even bigger at low values, and cause the thresholds to become negative - conversion from double to int for the sample could have unchecked overflows. Tested with tcp_rr and verifying that the distribution and csv files contain correct values.
Allow arbitrary percentiles to be specificed, instead of just integer and p99.9 and p99.99 This also makes the code faster because we can just compute the values requested instead of all 103 entries. Any floating point number between 0 and 100 is now accepted, with 999 and 9999 mapped to 99.9 and 99.99 for backward compatibility. Tested as usual with ./tcp_rr -c -H 127.0.0.1 -p1,2,10,50,90,999,9999,100 -A/tmp/x.csv and verifying the correct content of the csv file.
Computing percentiles is expensive, as it requires scanning all the 4k-8k buckets used to store samples, and is done for each flow. Benchmarks show the original code took an average of 20us per flow, with frequent peaks in the 60-80us range. This patch eliminates the cost by not storing samples in buckets if no percentiles are requested, and otherwise achieves a ~5x reduction by tracking the range of buckets that contain values in each epoch. Also change the precision to 6 bits, which halves the cost without much impact on the results. This value may become a command line flag. Tested, as usual, by running tcp_rr and verifying the logs and csv
neper_snaps methods were implemented as virtual functions, but since there is only one possible implementation this was overkill. Simplify the code by exposing the actual methods. The implementation still remains opaque. No functional changes. Tested with ./tcp_rr -c -H 127.0.0.1 -p1,2,10,50,90,999.9999,100 -A/tmp/x.csv -l 4 and verified that the csv file has the correct data. (histograms are only exercised in rr tests)
Short description of the patches:
|
No description provided.