add http.request.synthetic attribute to server spans and metrics #1523

Open
wants to merge 10 commits into
base: main

@JacksonWeber JacksonWeber commented Oct 28, 2024

Fixes #1127

Changes

Adds the http.request.synthetic attribute to server spans and metrics to indicate whether telemetry was generated by a synthetic testing source, a bot, or a real user. This lets users filter out telemetry from synthetic sources that they may not want to include in their analysis.
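
As a rough illustration of the filtering use case (not part of this PR), the sketch below drops spans that carry the attribute. It assumes spans are available as plain dictionaries with an `attributes` mapping, and that the attribute is absent or falsy on real-user traffic; both are assumptions made for the example, and `drop_synthetic` is a hypothetical helper name.

```python
from typing import Any, Iterable

# Assumption: the attribute is absent (or falsy) on real-user telemetry.
SYNTHETIC_KEY = "http.request.synthetic"


def drop_synthetic(
    spans: Iterable[dict[str, Any]], key: str = SYNTHETIC_KEY
) -> list[dict[str, Any]]:
    """Keep only spans whose attributes do not mark them as synthetic."""
    return [s for s in spans if not s.get("attributes", {}).get(key)]


# Hypothetical exported spans, represented as plain dictionaries.
spans = [
    {"name": "GET /cart", "attributes": {"http.request.method": "GET"}},
    {"name": "GET /health", "attributes": {SYNTHETIC_KEY: True}},
]
real_user_spans = drop_synthetic(spans)
```

The key is a parameter so the same filter still applies if the attribute ends up under a different name.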

Note: if the PR is touching an area that is not listed in the existing areas, or the area does not have sufficient domain experts coverage, the PR might be tagged as experts needed and move slowly until experts are identified.

Merge requirement checklist

@JacksonWeber JacksonWeber requested review from a team as code owners October 28, 2024 20:56
model/user-agent/registry.yaml
  brief: >
    A flag indicating that the user agent represents a synthetic source and did not originate from genuine client traffic.
  note: >
    This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic,
Contributor

Is there any prior art we can refer to? E.g. a well-known database of crawlers/bots?

Author

There are a number of sites that maintain lists of the most popular crawlers and bots. So far, the best I've found is DataDome's list of the most popular ones in 2024: https://datadome.co/bot-management-protection/crawlers-list/
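
To make the idea concrete, here is a minimal sketch of how an instrumentation might derive a value from `user_agent.original` by substring matching. The marker lists, the `"bot"`/`"test"` return values, and the `classify_user_agent` name are illustrative assumptions, not something this PR defines; a real implementation would maintain its own list, possibly seeded from a published crawler list.

```python
from typing import Optional

# Illustrative placeholder markers; real instrumentations would maintain
# their own lists of what they consider bot or synthetic-test traffic.
BOT_MARKERS = ("googlebot", "bingbot")
TEST_MARKERS = ("synthetic-monitor",)  # hypothetical test-agent token


def classify_user_agent(user_agent_original: str) -> Optional[str]:
    """Return "test", "bot", or None (no synthetic marker found)."""
    ua = user_agent_original.lower()
    if any(marker in ua for marker in TEST_MARKERS):
        return "test"
    if any(marker in ua for marker in BOT_MARKERS):
        return "bot"
    return None


crawler = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser = "Mozilla/5.0 (X11; Linux x86_64; rv:131.0) Gecko/20100101 Firefox/131.0"
crawler_type = classify_user_agent(crawler)
browser_type = classify_user_agent(browser)
```

Returning `None` for unmatched agents means the attribute can simply be omitted on traffic that is presumed genuine.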

@@ -35,3 +35,20 @@ groups:
          using a user-agent for non-browser products, such as microservices with multiple names/versions inside the
          `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align
          with `user_agent.name`
      - id: user_agent.synthetic.type
Contributor

Asked this in chat, but also asking here.

Should this (also) be added to client spans so synthetic agents can self-identify in a trace?

e.g. https://opentelemetry.io/blog/2023/synthetic-testing/

Author

Makes sense to me to allow agents to self-identify and allow for propagation of user_agent.synthetic.type to any server spans created in response to the remote client span from the synthetic agent.

Member

these sound like two different (though potentially both useful) things:

allow agents to self-identify

and

allow for propagation of user_agent.synthetic.type to any server spans created in response to the remote client span from the synthetic agent

I'd suggest sticking to just the first in this PR
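
For reference, self-identification alone could look roughly like the sketch below: the synthetic agent attaches the attribute to its own client span, and nothing is propagated to downstream server spans. The `client_span_attributes` helper, the agent name, and the `"test"` value are hypothetical; the sketch only builds the attribute mapping and does not call any OpenTelemetry API.

```python
from typing import Optional


def client_span_attributes(
    user_agent: str, synthetic_type: Optional[str] = None
) -> dict[str, str]:
    """Build client-span attributes; the synthetic type is set only when
    the agent chooses to identify itself."""
    attrs = {"user_agent.original": user_agent}
    if synthetic_type is not None:
        attrs["user_agent.synthetic.type"] = synthetic_type
    return attrs


# A hypothetical synthetic test agent identifying itself on its client span.
self_identified = client_span_attributes("example-synthetic-agent/1.0", "test")
# An ordinary client leaves the attribute unset.
ordinary = client_span_attributes("curl/8.5.0")
```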

Successfully merging this pull request may close these issues.

Introduction of a Synthetic Attribute for Server Span Telemetry
4 participants