add `http.request.synthetic` attribute to server spans and metrics #1523

JacksonWeber · 2024-10-28T20:56:03Z

Changes

Adds the `http.request.synthetic` attribute to server spans and metrics to indicate whether telemetry was generated by a synthetic testing source, a bot, or a real user. This lets users filter out telemetry from synthetic sources that they may not want to include in their analysis.
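As a rough illustration of the filtering use case described above, here is a minimal Python sketch. It assumes spans have already been exported as plain dictionaries with an `attributes` mapping, and that the attribute is a boolean; the `drop_synthetic` helper and the span shape are hypothetical and not part of this PR or of any OpenTelemetry SDK.

```python
from typing import Any, Dict, List

# Hypothetical span shape: {"name": str, "attributes": {str: Any}}.
Span = Dict[str, Any]


def drop_synthetic(spans: List[Span],
                   attribute: str = "http.request.synthetic") -> List[Span]:
    """Return only the spans that are not flagged as synthetic.

    A span counts as synthetic when the given attribute is present and truthy.
    Spans without the attribute are kept (assumed to be real user traffic).
    """
    return [
        span for span in spans
        if not span.get("attributes", {}).get(attribute, False)
    ]


spans = [
    {"name": "GET /", "attributes": {"http.request.synthetic": True}},
    {"name": "GET /cart", "attributes": {}},
]
real_traffic = drop_synthetic(spans)
```

The attribute name is a parameter because the name was still under discussion in this PR.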

Note: if the PR is touching an area that is not listed in the existing areas, or the area does not have sufficient domain experts coverage, the PR might be tagged as experts needed and move slowly until experts are identified.

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
schema-next.yaml updated with changes to existing conventions.

model/http/registry.yaml

Co-authored-by: Trask Stalnaker <[email protected]>

.chloggen/add-synthetic-source.yaml

model/user-agent/registry.yaml

lmolkova · 2024-11-01T03:43:10Z

model/user-agent/registry.yaml

+        brief: >
+          A flag indicating that the user agent represents a synthetic source and did not originate from genuine client traffic.
+        note: >
+          This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic,


Is there any prior art we can refer to? E.g. a well-known database of crawlers/bots?

There are a number of sites that maintain lists of the most popular crawlers. So far, the best I've found is DataDome's list of the most popular ones in 2024: https://datadome.co/bot-management-protection/crawlers-list/.
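To make the idea concrete, here is a minimal sketch of how an instrumentation might derive the flag from `user_agent.original` using a small pattern list. The patterns (`googlebot`, `bingbot`, `alwayson`), the returned values (`"bot"`, `"test"`), and the `classify_user_agent` helper are illustrative assumptions, not something this PR prescribes; a real implementation would likely draw on a maintained crawler list like the one linked above.

```python
import re
from typing import Optional

# Illustrative patterns only (assumptions, not an authoritative list).
BOT_PATTERN = re.compile(r"googlebot|bingbot", re.IGNORECASE)
TEST_PATTERN = re.compile(r"alwayson", re.IGNORECASE)


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Classify a raw User-Agent header value.

    Returns "bot" for known crawlers, "test" for synthetic test traffic,
    and None when the user agent is not recognized as synthetic.
    """
    if BOT_PATTERN.search(user_agent):
        return "bot"
    if TEST_PATTERN.search(user_agent):
        return "test"
    return None


crawler = classify_user_agent(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
browser = classify_user_agent(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/131.0"
)
```

Returning `None` for unrecognized agents means the attribute can simply be omitted for traffic that is presumed genuine.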

Co-authored-by: Liudmila Molkova <[email protected]>


jsuereth · 2024-11-04T19:28:29Z

model/user-agent/registry.yaml

@@ -35,3 +35,20 @@ groups:
          using a user-agent for non-browser products, such as microservices with multiple names/versions inside the
          `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align
          with `user_agent.name`
+      - id: user_agent.synthetic.type


Asked this in chat, but also asking here.

Should this (also) be added to client spans so synthetic agents can self-identify in a trace?

e.g. https://opentelemetry.io/blog/2023/synthetic-testing/

Makes sense to me to allow agents to self-identify and allow for propagation of user_agent.synthetic.type to any server spans created in response to the remote client span from the synthetic agent.

these sound like two different (though potentially both useful) things:

allow agents to self-identify

and

allow for propagation of user_agent.synthetic.type to any server spans created in response to the remote client span from the synthetic agent

I'd suggest sticking to just the first in this PR

Add synthetic source.

e579479

JacksonWeber requested review from a team as code owners October 28, 2024 20:56

trask reviewed Oct 28, 2024


JacksonWeber and others added 4 commits October 29, 2024 10:23

Update model/http/registry.yaml

984fbee

Co-authored-by: Trask Stalnaker <[email protected]>

Update model/http/registry.yaml

4d1b110

Co-authored-by: Trask Stalnaker <[email protected]>

Begin refactoring.

f925b65

Update md files.

6b48773

lmolkova reviewed Nov 1, 2024


JacksonWeber and others added 5 commits November 1, 2024 10:04

Clarify synthetic meaning.

5837106

Co-authored-by: Liudmila Molkova <[email protected]>

Shorten id value.

b4686ff

Co-authored-by: Liudmila Molkova <[email protected]>

Update user_agent synthetic value.

568cefc

Merge branch 'jacksonweber-sythetic-source' of https://github.com/Jac…

d52bde9

…ksonWeber/semantic-conventions into jacksonweber-sythetic-source

Update docs.

40a7d3b

jsuereth reviewed Nov 4, 2024


JacksonWeber commented Oct 28, 2024 (edited by lmolkova)

lmolkova Nov 1, 2024

JacksonWeber Nov 4, 2024

jsuereth Nov 4, 2024

JacksonWeber Nov 4, 2024

trask Nov 5, 2024
