Support Lambda PassThrough trace header propagation #409

Merged
4 commits merged into aws:master on Aug 7, 2024

Conversation

majanjua-amzn
Contributor

@majanjua-amzn majanjua-amzn commented Aug 1, 2024

Issue

Lambda is implementing PassThrough mode, which lets customers send trace context through instrumented Lambda functions without those functions themselves being traced. Lambda signals this by changing the trace header from Root=...;Parent=...;Sampled=1 (active tracing) to Root=... alone (PassThrough). Our SDK needs to be updated to handle the shorter header.
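To make the header difference concrete, here is a minimal sketch of how a PassThrough header could be told apart from an active-tracing one. It is an illustration only: TraceHeaderSketch, parse, and isPassThrough are hypothetical names and are not part of the X-Ray SDK, which has its own TraceHeader class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the SDK's TraceHeader implementation.
final class TraceHeaderSketch {

    // Splits "Key=Value;Key=Value" into an ordered map; parts without '=' are skipped.
    static Map<String, String> parse(String header) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (header == null) {
            return fields;
        }
        for (String part : header.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                fields.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    // Assumption taken from the description above: PassThrough means Root is
    // present while Parent and Sampled are both absent.
    static boolean isPassThrough(String header) {
        Map<String, String> fields = parse(header);
        return fields.containsKey("Root")
                && !fields.containsKey("Parent")
                && !fields.containsKey("Sampled");
    }
}
```

Under these assumptions, a header such as Root=...;Lineage=... is classified as PassThrough, while any header that carries Parent or Sampled is not.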

Changes

  • Update logic to use a no-op segment if Parent and Sampled are missing (i.e. PassThrough mode)
    • Had to use @SuppressWarnings("nullness") because TraceHeader.getRootTraceId() returns a @Nullable TraceId but Segment.noOp() expects a @NonNull TraceId
  • Update all trace header propagations to propagate only Root if we're not actively sampling
    • Not propagating Parent and Sampled is fine because Parent will be regenerated in the next traced context, and a missing Sampled will be treated as not actively sampling anyway
    • We can't propagate only one of these; it must be either both or neither
  • Updated unit tests
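The two rules in the list above can be sketched as follows. This is not the code from the PR: PropagationSketch and both method names are hypothetical, and treating a missing Parent as "propagate neither" is an assumption made here to honour the both-or-neither rule.

```java
// Hypothetical sketch for illustration; not the SDK's actual propagation code.
final class PropagationSketch {

    // A header with neither Parent nor Sampled is treated as PassThrough,
    // so the caller would use a no-op segment instead of a real one.
    static boolean shouldUseNoOpSegment(String parentId, String sampledFlag) {
        return parentId == null && sampledFlag == null;
    }

    // Builds the header to send downstream: Root plus Parent and Sampled when
    // actively sampling, Root alone otherwise. Parent and Sampled are always
    // emitted together or not at all.
    static String downstreamHeader(String rootTraceId, String parentId, boolean sampled) {
        StringBuilder header = new StringBuilder("Root=").append(rootTraceId);
        if (sampled && parentId != null) {
            header.append(";Parent=").append(parentId).append(";Sampled=1");
        }
        return header.toString();
    }
}
```

Keeping the Parent and Sampled fields behind a single condition is one simple way to guarantee that a downstream consumer never sees one without the other.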

Testing

I ran extensive tests against v2.16.0, v2.17.0, and these changes to compare the outcomes. For these tests I configured three Lambda functions that call each other, all instrumented with the new changes:

  • Lambda A: Calls Lambda B
  • Lambda B: Calls Lambda C
  • Lambda C: Calls GetAccountSettings

The trace headers in the cases below come from logging System.getenv("_X_AMZN_TRACE_ID") inside the Lambda functions themselves.

The following cases demonstrate that the code change works as intended.
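For readers who want to reproduce the logging, a minimal sketch is shown here. The class and method names are hypothetical; only the environment variable name comes from the description. The environment is passed in as a map so the method can also be exercised outside Lambda.

```java
import java.util.Map;

// Hypothetical logging helper for illustration.
final class TraceEnvLogger {
    static final String TRACE_ENV_KEY = "_X_AMZN_TRACE_ID";

    // Formats the trace header found in the given environment, if any.
    static String describeTraceHeader(Map<String, String> env) {
        String header = env.get(TRACE_ENV_KEY);
        return header == null ? "trace header: <not set>" : "trace header: " + header;
    }

    public static void main(String[] args) {
        // Inside a Lambda function this would typically be called from the handler.
        System.out.println(describeTraceHeader(System.getenv()));
    }
}
```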

Case 1.a: Lambda A Active → Lambda B PassThrough → Lambda C Active

  • Lineage is NOT propagated or received by any of these functions
  • Lambda B has an inferred node due to the invocation in Lambda A, which is actively traced; this is expected behaviour and not controlled by the SDK
  • Lambda B function is being appropriately suppressed aside from the previous note
  • Lambda C invocation is appropriately connected to the Lambda A function
A: Root=1-66abd797-167a78c02dcfeb4551890716;Parent=ad075ec25766d3f9;Sampled=1;Lineage=456f6a97:0
B: Root=1-66abd797-167a78c02dcfeb4551890716;Parent=57696313aea6f4a2;Sampled=1;Lineage=8408dcb0:0
C: Root=1-66abd797-167a78c02dcfeb4551890716;Parent=710d6393cf68bccc;Sampled=1;Lineage=7d2b6fd0:0

Image

Case 1.b: Same as 1.a, but Lambda A (Active) sets Sampled=0

  • Lineage is NOT propagated or received by any of the functions
  • No trace is generated
  • Lambda A, B, and C all have Sampled=0
  • Lambda C appropriately changes Parent as expected, as it is actively traced, but still maintains the Sampled=0
A: Root=1-66b32340-6a4114ff44ca3e5240b1ac5b;Parent=63033870537134cd;Sampled=0;Lineage=456f6a97:0
B: Root=1-66b32340-6a4114ff44ca3e5240b1ac5b;Parent=63033870537134cd;Sampled=0;Lineage=8408dcb0:0
C: Root=1-66b32340-6a4114ff44ca3e5240b1ac5b;Parent=71d11ee442af5530;Sampled=0;Lineage=7d2b6fd0:0

Case 2: Lambda A PassThrough → Lambda B Active → Lambda C PassThrough

  • Lineage is NOT propagated or received by any of these functions
  • Lambda B is the start of the trace as expected
  • Lambda C invocation is still attached to Lambda B as an inferred node (similar to case 1)
  • Lambda C's GetAccountSettings call IS TRACED. This is expected behaviour: we entered Active tracing mode in Lambda B, and that is meant to propagate to downstream services. If a downstream service has its own version of passive tracing or X-Ray integration, it should be configured accordingly. PassThrough should apply only to Lambda functions, not to the downstream calls they trigger, unless the entire path has been in PassThrough
A: Root=1-66b3b030-7155e4f41eac6ff601dfaf63;Lineage=456f6a97:0
B: Root=1-66b3b030-7155e4f41eac6ff601dfaf63;Parent=ad139ed6effb18d2;Sampled=1;Lineage=8408dcb0:0
C: Root=1-66b3b030-7155e4f41eac6ff601dfaf63;Parent=50262dfb251d8754;Sampled=1;Lineage=7d2b6fd0:0

Image (1)

Case 3.a: Calling Lambda B PassThrough → Lambda C PassThrough

  • Lineage not propagated
  • No trace is generated. This confirms the statement above: a path that is entirely in PassThrough mode does not trace the final GetAccountSettings call, since we never entered Active tracing mode.
B: Root=1-66abfc06-3cbd74727bb2f86a6a772af2;Lineage=8408dcb0:0
C: Root=1-66abfc06-3cbd74727bb2f86a6a772af2;Lineage=7d2b6fd0:0

Case 3.b: Calling Lambda B Active → Lambda C Active

  • Lineage not propagated
  • Complete and correct trace
B: Root=1-66abfd06-5e95193a3dcfa6a25cec5b30;Parent=11c07dc8ed123be7;Sampled=1;Lineage=8408dcb0:0
C: Root=1-66abfd06-5e95193a3dcfa6a25cec5b30;Parent=613a4ff907527f30;Sampled=1;Lineage=7d2b6fd0:0
Screenshot 2024-08-01 at 2 25 21 PM

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@majanjua-amzn majanjua-amzn requested a review from a team as a code owner August 1, 2024 20:54

@alexey-katranov alexey-katranov left a comment

LGTM; however, I am not sure whether Lineage is handled properly.

@wangzlei wangzlei merged commit 3c15d40 into aws:master Aug 7, 2024
6 checks passed
3 participants