Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read 128 bit trace ID in End Invocation #30398

Merged

Conversation

agocs
Copy link
Contributor

@agocs agocs commented Oct 23, 2024

What does this PR do?

@duncanista pointed out we also need to read the upper 64 bits of a trace ID in the End Invocation flow. This is a quick hack to add that. The right way, of course, would be using tracer.Inject and tracer.Extract.

Motivation

Describe how to test/QA your changes

I created an AWS Step Function and Java, .Net, and Go lambda functions. The step function invokes the lambda functions. I instrumented the Lambda functions with the Lambda Extension and Universal Instrumentation. I then invoked the step function and verified that the Lambda functions correctly read the distributed trace context from the step function and passed that trace context back and forth between the Extension and the tracer.

image

Possible Drawbacks / Trade-offs

Additional Notes

@agocs agocs requested review from a team as code owners October 23, 2024 05:24
@agent-platform-auto-pr
Copy link
Contributor

agent-platform-auto-pr bot commented Oct 23, 2024

Test changes on VM

Use this command from test-infra-definitions to manually test this PR changes on a VM:

inv create-vm --pipeline-id=48492969 --os-family=ubuntu

Note: This applies to commit 78cb994

Copy link

cit-pr-commenter bot commented Oct 23, 2024

Regression Detector

@agocs
Copy link
Contributor Author

agocs commented Nov 4, 2024

/merge

@dd-devflow
Copy link

dd-devflow bot commented Nov 4, 2024

🚂 MergeQueue: waiting for PR to be ready

This merge request is not mergeable yet, because of pending checks/missing approvals. It will be added to the queue as soon as checks pass and/or get approvals.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.

Use /merge -c to cancel this operation!

@agocs
Copy link
Contributor Author

agocs commented Nov 4, 2024

/remove

@dd-devflow
Copy link

dd-devflow bot commented Nov 4, 2024

🚂 Devflow: /remove

@dd-devflow
Copy link

dd-devflow bot commented Nov 4, 2024

⚠️ MergeQueue: This merge request was unqueued

This merge request was unqueued

If you need support, contact us on Slack #devflow!

@agocs agocs requested a review from a team as a code owner November 6, 2024 19:10
@github-actions github-actions bot added the short review PR is simple enough to be reviewed quickly label Nov 6, 2024
Copy link
Contributor

@buraizu buraizu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with a minor update requested

@github-actions github-actions bot added medium review PR review might take time and removed short review PR is simple enough to be reviewed quickly labels Nov 6, 2024
Copy link
Contributor

github-actions bot commented Nov 7, 2024

Serverless Benchmark Results

BenchmarkStartEndInvocation comparison between 0c2f3b0 and bacb560.

tl;dr

Use these benchmarks as an insight tool during development.

  1. Skim down the vs base column in each chart. If there is a ~, then there was no statistically significant change to the benchmark. Otherwise, ensure the estimated percent change is either negative or very small.

  2. The last row of each chart is the geomean. Ensure this percentage is either negative or very small.

What is this benchmarking?

The BenchmarkStartEndInvocation compares the amount of time it takes to call the start-invocation and end-invocation endpoints. For universal instrumentation languages (Dotnet, Golang, Java, Ruby), this represents the majority of the duration overhead added by our tracing layer.

The benchmark is run using a large variety of lambda request payloads. In the charts below, there is one row for each event payload type.

How do I interpret these charts?

The charts below comes from benchstat. They represent the statistical change in duration (sec/op), memory overhead (B/op), and allocations (allocs/op).

The benchstat docs explain how to interpret these charts.

Before the comparison table, we see common file-level configuration. If there are benchmarks with different configuration (for example, from different packages), benchstat will print separate tables for each configuration.

The table then compares the two input files for each benchmark. It shows the median and 95% confidence interval summaries for each benchmark before and after the change, and an A/B comparison under "vs base". ... The p-value measures how likely it is that any differences were due to random chance (i.e., noise). The "~" means benchstat did not detect a statistically significant difference between the two inputs. ...

Note that "statistically significant" is not the same as "large": with enough low-noise data, even very small changes can be distinguished from noise and considered statistically significant. It is, of course, generally easier to distinguish large changes from noise.

Finally, the last row of the table shows the geometric mean of each column, giving an overall picture of how the benchmarks changed. Proportional changes in the geomean reflect proportional changes in the benchmarks. For example, given n benchmarks, if sec/op for one of them increases by a factor of 2, then the sec/op geomean will increase by a factor of ⁿ√2.

I need more help

First off, do not worry if the benchmarks are failing. They are not tests. The intention is for them to be a tool for you to use during development.

If you would like a hand interpreting the results come chat with us in #serverless-agent in the internal DataDog slack or in #serverless in the public DataDog slack. We're happy to help!

Benchmark stats
goos: linux
goarch: amd64
pkg: github.com/DataDog/datadog-agent/pkg/serverless/daemon
cpu: AMD EPYC 7763 64-Core Processor                
                                      │ baseline/benchmark.log │       current/benchmark.log        │
                                      │         sec/op         │   sec/op     vs base               │
api-gateway-appsec.json                            86.89µ ± 3%   85.73µ ± 2%       ~ (p=0.247 n=10)
api-gateway-kong-appsec.json                       68.02µ ± 9%   66.61µ ± 2%  -2.06% (p=0.029 n=10)
api-gateway-kong.json                              65.80µ ± 4%   66.27µ ± 3%       ~ (p=0.529 n=10)
api-gateway-non-proxy-async.json                   103.6µ ± 2%   105.2µ ± 1%  +1.52% (p=0.011 n=10)
api-gateway-non-proxy.json                         105.3µ ± 1%   105.3µ ± 1%       ~ (p=0.971 n=10)
api-gateway-websocket-connect.json                 69.04µ ± 1%   70.32µ ± 1%  +1.85% (p=0.002 n=10)
api-gateway-websocket-default.json                 62.12µ ± 1%   63.66µ ± 1%  +2.48% (p=0.000 n=10)
api-gateway-websocket-disconnect.json              62.11µ ± 1%   64.10µ ± 2%  +3.20% (p=0.000 n=10)
api-gateway.json                                   112.5µ ± 3%   115.2µ ± 1%       ~ (p=0.052 n=10)
application-load-balancer.json                     62.81µ ± 1%   65.09µ ± 1%  +3.62% (p=0.000 n=10)
cloudfront.json                                    47.40µ ± 3%   48.84µ ± 2%  +3.05% (p=0.001 n=10)
cloudwatch-events.json                             37.82µ ± 3%   39.87µ ± 1%  +5.42% (p=0.000 n=10)
cloudwatch-logs.json                               65.06µ ± 2%   67.35µ ± 1%  +3.52% (p=0.000 n=10)
custom.json                                        30.42µ ± 2%   32.20µ ± 2%  +5.84% (p=0.000 n=10)
dynamodb.json                                      91.71µ ± 2%   93.29µ ± 1%  +1.72% (p=0.007 n=10)
empty.json                                         28.81µ ± 3%   30.33µ ± 2%  +5.26% (p=0.000 n=10)
eventbridge-custom.json                            46.88µ ± 2%   47.82µ ± 5%  +2.02% (p=0.002 n=10)
eventbridge-no-bus.json                            45.94µ ± 2%   47.49µ ± 2%  +3.38% (p=0.001 n=10)
eventbridge-no-timestamp.json                      46.12µ ± 1%   47.27µ ± 1%  +2.50% (p=0.000 n=10)
eventbridgesns.json                                62.69µ ± 2%   65.20µ ± 2%  +4.01% (p=0.000 n=10)
eventbridgesqs.json                                71.96µ ± 3%   72.92µ ± 1%       ~ (p=0.165 n=10)
http-api.json                                      73.93µ ± 3%   73.74µ ± 2%       ~ (p=0.190 n=10)
kinesis-batch.json                                 72.32µ ± 3%   71.48µ ± 2%       ~ (p=0.315 n=10)
kinesis.json                                       55.63µ ± 2%   54.83µ ± 2%  -1.43% (p=0.043 n=10)
s3.json                                            61.13µ ± 2%   60.91µ ± 1%       ~ (p=0.739 n=10)
sns-batch.json                                     91.54µ ± 2%   92.69µ ± 1%       ~ (p=0.105 n=10)
sns.json                                           68.12µ ± 2%   68.92µ ± 2%       ~ (p=0.165 n=10)
snssqs.json                                        119.2µ ± 2%   119.9µ ± 2%       ~ (p=0.247 n=10)
snssqs_no_dd_context.json                          107.5µ ± 2%   107.9µ ± 2%       ~ (p=0.631 n=10)
sqs-aws-header.json                                60.68µ ± 2%   59.70µ ± 2%       ~ (p=0.063 n=10)
sqs-batch.json                                     98.98µ ± 2%   96.31µ ± 2%  -2.69% (p=0.003 n=10)
sqs.json                                           73.02µ ± 4%   72.77µ ± 2%       ~ (p=0.579 n=10)
sqs_no_dd_context.json                             69.56µ ± 2%   68.85µ ± 3%       ~ (p=0.315 n=10)
stepfunction.json                                  47.78µ ± 3%   49.19µ ± 4%       ~ (p=0.143 n=10)
geomean                                            65.94µ        66.86µ       +1.39%

                                      │ baseline/benchmark.log │        current/benchmark.log        │
                                      │          B/op          │     B/op      vs base               │
api-gateway-appsec.json                           37.34Ki ± 0%   37.35Ki ± 0%  +0.04% (p=0.022 n=10)
api-gateway-kong-appsec.json                      26.93Ki ± 0%   26.95Ki ± 0%  +0.06% (p=0.006 n=10)
api-gateway-kong.json                             24.43Ki ± 0%   24.45Ki ± 0%  +0.08% (p=0.000 n=10)
api-gateway-non-proxy-async.json                  48.13Ki ± 0%   48.13Ki ± 0%       ~ (p=1.000 n=10)
api-gateway-non-proxy.json                        47.35Ki ± 0%   47.35Ki ± 0%       ~ (p=0.542 n=10)
api-gateway-websocket-connect.json                25.53Ki ± 0%   25.54Ki ± 0%       ~ (p=0.565 n=10)
api-gateway-websocket-default.json                21.44Ki ± 0%   21.44Ki ± 0%       ~ (p=0.158 n=10)
api-gateway-websocket-disconnect.json             21.22Ki ± 0%   21.22Ki ± 0%       ~ (p=0.127 n=10)
api-gateway.json                                  49.59Ki ± 0%   49.60Ki ± 0%       ~ (p=0.780 n=10)
application-load-balancer.json                    23.31Ki ± 0%   23.32Ki ± 0%       ~ (p=0.323 n=10)
cloudfront.json                                   17.67Ki ± 0%   17.69Ki ± 0%  +0.11% (p=0.006 n=10)
cloudwatch-events.json                            11.73Ki ± 0%   11.76Ki ± 0%  +0.23% (p=0.000 n=10)
cloudwatch-logs.json                              53.38Ki ± 0%   53.39Ki ± 0%  +0.02% (p=0.022 n=10)
custom.json                                       9.746Ki ± 0%   9.775Ki ± 0%  +0.30% (p=0.000 n=10)
dynamodb.json                                     40.81Ki ± 0%   40.82Ki ± 0%       ~ (p=0.075 n=10)
empty.json                                        9.294Ki ± 0%   9.339Ki ± 0%  +0.49% (p=0.000 n=10)
eventbridge-custom.json                           15.01Ki ± 0%   15.02Ki ± 0%       ~ (p=0.838 n=10)
eventbridge-no-bus.json                           13.99Ki ± 0%   14.00Ki ± 0%       ~ (p=0.197 n=10)
eventbridge-no-timestamp.json                     14.04Ki ± 0%   14.03Ki ± 0%       ~ (p=1.000 n=10)
eventbridgesns.json                               20.92Ki ± 0%   21.00Ki ± 0%  +0.40% (p=0.023 n=10)
eventbridgesqs.json                               25.17Ki ± 0%   25.17Ki ± 0%       ~ (p=0.516 n=10)
http-api.json                                     23.93Ki ± 0%   23.94Ki ± 0%       ~ (p=0.956 n=10)
kinesis-batch.json                                27.16Ki ± 0%   27.15Ki ± 0%       ~ (p=0.516 n=10)
kinesis.json                                      17.91Ki ± 0%   17.92Ki ± 0%       ~ (p=0.469 n=10)
s3.json                                           20.49Ki ± 0%   20.50Ki ± 1%       ~ (p=0.796 n=10)
sns-batch.json                                    39.87Ki ± 0%   39.99Ki ± 0%       ~ (p=0.143 n=10)
sns.json                                          25.15Ki ± 1%   25.19Ki ± 0%       ~ (p=0.218 n=10)
snssqs.json                                       53.87Ki ± 0%   53.94Ki ± 0%       ~ (p=0.353 n=10)
snssqs_no_dd_context.json                         47.68Ki ± 0%   47.64Ki ± 0%       ~ (p=0.160 n=10)
sqs-aws-header.json                               19.44Ki ± 1%   19.41Ki ± 1%       ~ (p=0.955 n=10)
sqs-batch.json                                    42.35Ki ± 1%   42.25Ki ± 0%       ~ (p=0.225 n=10)
sqs.json                                          26.30Ki ± 1%   26.25Ki ± 1%       ~ (p=0.280 n=10)
sqs_no_dd_context.json                            21.84Ki ± 1%   21.88Ki ± 1%       ~ (p=0.853 n=10)
stepfunction.json                                 14.29Ki ± 1%   14.29Ki ± 1%       ~ (p=0.543 n=10)
geomean                                           24.61Ki        24.62Ki       +0.07%

                                      │ baseline/benchmark.log │        current/benchmark.log        │
                                      │       allocs/op        │ allocs/op   vs base                 │
api-gateway-appsec.json                             629.0 ± 0%   630.5 ± 0%  +0.24% (p=0.001 n=10)
api-gateway-kong-appsec.json                        488.0 ± 0%   489.0 ± 0%  +0.20% (p=0.000 n=10)
api-gateway-kong.json                               466.0 ± 0%   467.0 ± 0%  +0.21% (p=0.000 n=10)
api-gateway-non-proxy-async.json                    725.0 ± 0%   725.0 ± 0%       ~ (p=1.000 n=10)
api-gateway-non-proxy.json                          716.0 ± 0%   716.0 ± 0%       ~ (p=1.000 n=10)
api-gateway-websocket-connect.json                  453.0 ± 0%   453.0 ± 0%       ~ (p=1.000 n=10) ¹
api-gateway-websocket-default.json                  379.0 ± 0%   379.0 ± 0%       ~ (p=1.000 n=10)
api-gateway-websocket-disconnect.json               370.0 ± 0%   370.0 ± 0%       ~ (p=0.582 n=10)
api-gateway.json                                    791.0 ± 0%   791.0 ± 0%       ~ (p=1.000 n=10)
application-load-balancer.json                      353.0 ± 0%   353.0 ± 0%       ~ (p=1.000 n=10) ¹
cloudfront.json                                     284.0 ± 0%   285.0 ± 0%  +0.35% (p=0.000 n=10)
cloudwatch-events.json                              220.0 ± 0%   221.0 ± 0%  +0.45% (p=0.000 n=10)
cloudwatch-logs.json                                215.0 ± 0%   216.0 ± 0%  +0.47% (p=0.002 n=10)
custom.json                                         168.0 ± 0%   169.0 ± 1%  +0.60% (p=0.000 n=10)
dynamodb.json                                       589.0 ± 0%   590.0 ± 0%  +0.17% (p=0.001 n=10)
empty.json                                          159.0 ± 1%   161.0 ± 0%  +1.26% (p=0.000 n=10)
eventbridge-custom.json                             266.0 ± 0%   266.0 ± 0%       ~ (p=0.303 n=10)
eventbridge-no-bus.json                             257.0 ± 0%   257.5 ± 0%       ~ (p=0.450 n=10)
eventbridge-no-timestamp.json                       258.0 ± 0%   258.0 ± 0%       ~ (p=1.000 n=10)
eventbridgesns.json                                 325.0 ± 0%   326.0 ± 0%       ~ (p=0.082 n=10)
eventbridgesqs.json                                 367.0 ± 0%   367.0 ± 0%       ~ (p=0.966 n=10)
http-api.json                                       434.0 ± 0%   434.0 ± 0%       ~ (p=0.746 n=10)
kinesis-batch.json                                  392.0 ± 0%   393.0 ± 0%       ~ (p=0.082 n=10)
kinesis.json                                        286.0 ± 0%   287.0 ± 0%  +0.35% (p=0.003 n=10)
s3.json                                             359.0 ± 1%   360.0 ± 1%       ~ (p=0.337 n=10)
sns-batch.json                                      477.5 ± 1%   479.5 ± 0%       ~ (p=0.123 n=10)
sns.json                                            346.5 ± 1%   347.0 ± 1%       ~ (p=0.287 n=10)
snssqs.json                                         478.0 ± 1%   479.0 ± 0%       ~ (p=0.288 n=10)
snssqs_no_dd_context.json                           438.0 ± 1%   438.0 ± 0%       ~ (p=0.673 n=10)
sqs-aws-header.json                                 286.0 ± 1%   285.5 ± 1%       ~ (p=0.922 n=10)
sqs-batch.json                                      517.0 ± 1%   516.0 ± 1%       ~ (p=0.320 n=10)
sqs.json                                            365.5 ± 1%   364.0 ± 1%       ~ (p=0.314 n=10)
sqs_no_dd_context.json                              348.5 ± 1%   350.0 ± 1%       ~ (p=0.534 n=10)
stepfunction.json                                   237.5 ± 1%   237.0 ± 0%       ~ (p=0.408 n=10)
geomean                                             367.0        367.6       +0.16%
¹ all samples are equal

@agocs
Copy link
Contributor Author

agocs commented Nov 7, 2024

/merge

@dd-devflow
Copy link

dd-devflow bot commented Nov 7, 2024

Devflow running: /merge

View all feedbacks in Devflow UI.


2024-11-07 20:57:42 UTC ℹ️ MergeQueue: pull request added to the queue

The median merge time in main is 24m.

@dd-mergequeue dd-mergequeue bot merged commit 5cfbd26 into main Nov 7, 2024
222 of 223 checks passed
@dd-mergequeue dd-mergequeue bot deleted the chris.agocs/read-128-bit-trace-id-in-end-invocation branch November 7, 2024 21:17
@github-actions github-actions bot added this to the 7.61.0 milestone Nov 7, 2024
@agocs agocs added the qa/done QA done before merge and regressions are covered by tests label Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
medium review PR review might take time qa/done QA done before merge and regressions are covered by tests team/serverless
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants