-
-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rfc(decision): Revamping the SDK Performance API #101
Conversation
ecb2dbc
to
ed60264
Compare
1. Get the active span from the current scope | ||
2. If the active span is defined, create a child span of that active span based on the provided `spanContext`, otherwise create a transaction based on the provided `spanContext`. | ||
3. Run the provided callback | ||
4. Finish the child span/transaction created in step 2 | ||
5. Remove the child span/transaction from the current scope and if it exists, set the previous active span as the active span in the current scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this solves (or not) the use case that you have an active span (span A), and spawn multiple threads to download something, and each of those threads will start a new span (span B and C), but then the relationship of the spans are messed up because they "race each other", for example, span B and C should be a child of span A, but what I see is that span C is a child of span B.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the caveat here is that the hub/scope needs to be properly forked and correctly follow the execution context.
I guess in single scope instances, this logic breaks down re: parent-child relationships. Let me amend the RFC to reference this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, Mobile is mostly since scope instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defining a root span for mobile/single-scope SDKs sounds good. It doesn't solve the parent-child relationships, but keeps the behaviour the same as with Transactions and Span.
|
||
| Span Field | Description | Type | Notes | | ||
| ----------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | ||
| `name` | The name of the span | String | Should be low cardinality. Replacing span `description` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe is important to keep description
, how would I set the SQL statement of a DB query if it has to be low cardinality? One can say you can do it under the data
field, so the name
then would be similar to the former operation
? some operations just don't have a unique name, is the grouping of transactions still per name
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the OpenTelemetry conventions, it's recommended db spans be named <db.operation> <db.name>.<db.sql.table> - like SELECT bestDb.orders
, but the only thing they require is that something is low cardinality, so the db query also works here. In most cases, we can just add whatever we were setting in span.description
to span.name
(as long as it's low cardinality).
Grouping of transactions/spans would still be by name, but there is more freedom about what that should be. The product instead relies on span attributes to make this work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, DB queries are high cardinality though, at least for bigger databases.
Most of the time interceptors/hooks just give you a String
, so to build SELECT bestDb.orders
requires parsing the String on the client which is also not ideal.
Alright, this is more an implementation detail anyway, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK removing op
and adding a new low cardinality format of OpenTelemetry as you described @AbhiPrasad, but I think the description we currently have for db queries, HTTP requests, and such is highly useful. Where would that end up now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philipphofmann the span attributes, aka span.data
|
||
Having both the `name` and `op` fields is redundant, but we choose to keep both for backwards compatibility. There are many features in the product that are powered by having a meaningful operation, more details about this can be found in the documentation around [Span operations](https://develop.sentry.dev/sdk/performance/span-operations/). In the future, we can choose to deprecate `op` and remove it from the schema. | ||
|
||
The most notable change here is to formally introduce the `attributes` field, and remove the `span.data` field. This is a breaking change, but worth it in the long-term. If we start accepting `attributes` on transactions as well, we more closely align with the OpenTelemetry schema, and can use the same conventions for both spans and transactions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the data
that is gonna be deprecated and forwarded to attributes
for compatibility?
eg
span.setData(x, y)
just calls span.setAttribute(x, y)
internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup this is a good idea! I will also put more details into migrating between the different span schemas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent that you try to address some critical issues of our performance API @AbhiPrasad 👏. I gave this a first quick pass, but I'll have to take some more time after my vacation next week and think about how the new API will play out on mobile.
|
||
This allows us to de-emphasize the concept of hubs, scopes and transactions from users, and instead have them think about just spans. Under the hood, `Sentry.startActiveSpan` and `Sentry.startSpan` should create transactions/spans as appropriate. `Sentry.setMeasurement` is used to abstract away `transaction.setMeasurement` and similar. | ||
|
||
1. Introduce a new span schema that is aligned with OpenTelemetry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that should be step 2.
1. Introduce a new span schema that is aligned with OpenTelemetry. | |
2. Introduce a new span schema that is aligned with OpenTelemetry. |
|
||
We change the data model that is referenced and used internally inside the SDK to better reflect OpenTelemetry. This involves adding shims for backwards compatibility and removing redundant fields. | ||
|
||
# Background |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice summary of the biggest problems 👍
1. Newly created spans must have the correct trace and parent/child relationship | ||
2. Users shouldn’t be burdened with knowing if something is a span/transaction | ||
3. Spans only need a name to identify themselves, everything else is optional. | ||
4. The new top level APIs should be as similar to the OpenTelemetry SDK public API as possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this already an official decision that Sentry decided to align more with OpenTelemetry? If yes, please link the decision or elaborate somewhere in the document on why we want to achieve this, as I think it might not be 100% clear to everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be interested in that as well, what are the pro x cons of moving closer to opentelemetry rather than our own model?
I know a few of them but it'd be nice to be written down.
// declare function startActiveSpanAsync<T>( | ||
// spanContext: SpanContext, | ||
// callback: (span: Span) => Promise<T>, | ||
// ): Promise<T>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that should not be commented out
// declare function startActiveSpanAsync<T>( | |
// spanContext: SpanContext, | |
// callback: (span: Span) => Promise<T>, | |
// ): Promise<T>; | |
declare function startActiveSpanAsync<T>( | |
spanContext: SpanContext, | |
callback: (span: Span) => Promise<T>, | |
): Promise<T>; |
3. Spans only need a name to identify themselves, everything else is optional. | ||
4. The new top level APIs should be as similar to the OpenTelemetry SDK public API as possible. | ||
|
||
There are two top level methods we'll be introducing to achieve this: `Sentry.startActiveSpan` and `Sentry.startSpan`. `Sentry.startActiveSpan` will take a callback and start/stop a span automatically. In addition, it'll also set the span on the current scope. Under the hood, the SDK will create a transaction or span based on if there is already an existing span on the scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a bit confused by the difference of startActiveSpan
and startSpan
when reading it for the first time. Even after reading the explanation, I'm still unhappy with the name. The name states to start a span, but the method takes a lamba and starts and finishes the span. I think the name should reflect that, so it should be something like executeSpan
, createSpan
, etc.
Furthermore, I'd like to have the flexibility of finishing the span manually and achieving the same behavior you described with startActiveSpan
or will you still be able to do startSpan(bindtoScope: true)
? If yes, then it's a bit confusing to me to achieve the same behavior with either startActiveSpan
plus a lambda and startSpan(bindtoScope: true)
.
I'm sorry I didn't come up with proposals for solutions yet. That is not an easy problem to solve and I have to take some time to think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philipphofmann If I got it right, startActiveSpan
is the equivalent of bindtoScope: true
, and startSpan
the equivalent of bindtoScope: false
, hence the active
naming, because it is the one active in the scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Active for active in the scope
in the scope make sense, but since we try to shield users from the internal scope and hub.
startActiveSpan
looks to me as active
for the automatic finish.
startSpan
as a manual variant of the startAtiveSpan
|
||
To simplify how performance data is consumed and understood, we are proposing a new span schema that the SDKs send to Sentry. The new span schema aims to be a superset of the [OpenTelemetry span schema](https://github.com/open-telemetry/opentelemetry-proto/blob/4967b51b5cb29f725978362b9d413bae1dbb641c/opentelemetry/proto/trace/v1/trace.proto) and have a minimal top level API surface. This also means that spans can be easily converted to OpenTelemetry spans and vice versa. | ||
|
||
The new span schema is as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No tags for spans?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today spans have tags, but they are not indexed, so they can just be span attributes.
But I'd love to see indexed fields from spans as well, I believe this is important.
|
||
| Span Field | Description | Type | Notes | | ||
| ----------------- | ------------------------------------------------------------ | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | ||
| `name` | The name of the span | String | Should be low cardinality. Replacing span `description` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK removing op
and adding a new low cardinality format of OpenTelemetry as you described @AbhiPrasad, but I think the description we currently have for db queries, HTTP requests, and such is highly useful. Where would that end up now?
|
||
To simplify how performance data is consumed and understood, we are proposing a new span schema that the SDKs send to Sentry. The new span schema aims to be a superset of the [OpenTelemetry span schema](https://github.com/open-telemetry/opentelemetry-proto/blob/4967b51b5cb29f725978362b9d413bae1dbb641c/opentelemetry/proto/trace/v1/trace.proto) and have a minimal top level API surface. This also means that spans can be easily converted to OpenTelemetry spans and vice versa. | ||
|
||
The new span schema is as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also keep origin
https://develop.sentry.dev/sdk/performance/trace-origin/? It's handy to know what precisely created a span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
origin
can also be part of the span attributes if needed, but good point.
|
||
# Background | ||
|
||
Right now every SDK has both the concept of transactions and spans - and to a user they both exist as vehicles of performance data. In addition, the transaction exists as the carrier of distributed tracing information ([dynamic sampling context](https://develop.sentry.dev/sdk/performance/dynamic-sampling-context/) and [sentry-trace info](https://develop.sentry.dev/sdk/performance/#header-sentry-trace)), although this is going to change with the advent of tracing without performance support in the SDKs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the plan with transaction-specific information that doesn't belong to dynamic sampling or tracing? Is it going to be duplicated in every span? Tags, for example.
| `attributes` | A set of attributes on the span. | Object | This maps to `span.data` in the current schema for spans. There is no existing mapping for this in the current transaction schema. The keys of attributes are well known values, and defined by a combination of OpenTelemtry's and Sentry's semantic conventions. | | ||
| `measurements` | Measurements which holds observed values such as web vitals. | Object | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a good time to structure the data further and define what Object
looks like for each field? The more we structure the data, the better product experience we can provide on the server side.
|
||
1. Introduce three new top level methods, `Sentry.startActiveSpan` and `Sentry.startSpan` and their language specific variants as well as `Sentry.setMeasurement`. | ||
|
||
This allows us to de-emphasize the concept of hubs, scopes and transactions from users, and instead have them think about just spans. Under the hood, `Sentry.startActiveSpan` and `Sentry.startSpan` should create transactions/spans as appropriate. `Sentry.setMeasurement` is used to abstract away `transaction.setMeasurement` and similar. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sentry.startActiveSpan
should be the default, hence called Sentry.startSpan
.
Introducing more top-level functions (even undocumented) increases the API surface unnecessarily and adds more confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we turn it around - instead of having startActiveSpan
we just have startSpan
that does by default what startActiveSpan
does (meaning putting it on the scope) - I suppose this is what people would expect 99% of the time. We could add an argument that prevents setting it on the scope if someone sets it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what we do on Mobile, startTransaction
with a bindToScope
param, but it's bindToScope=false
by default.
If this flag is enabled by default, because of single Hub/Scope on Mobile, might be overwriting each other transactions as a side-effect, not sure if it's a good idea, at least on the Mobile world, maybe keeping bindToScope=false
by default makes more sense in this case.
The other option is that if it's enabled by default, but there's a living transaction in the scope, you just create a span, but maybe this is not the desired behavior and we'd need a way to opt-out, in this case, people would need to force bindToScope=false
.
Ergonomics isn't great again because they will not know if there's a living transaction in the scope.
|
||
`startActiveSpan` only provides the correct parent-child relationship if your platform has proper support for forking scopes. For platforms have a single hub/scope (like the mobile SDKs), this method will not lead to the correct parent-child relationship. The SDK will have to provide a different method for these platforms. One option is for `startActiveSpan` to always attach the span it creates to the root span (the transaction), which means users don't get exact parent-child relationships, but they do get relative relationships between spans using relative durations. | ||
|
||
`Sentry.startSpan` will create a span, but not set it as the active span in the current scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any use-cases come to mind why this function is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mobile has single Hub/Scope and this is needed in case you want your own transaction instead of the auto-instrumented ones, measuring something side by side.
For example, a auto-instrumented transaction during screen loading, but also a service doing the X job, that has nothing to do with the screen loading transaction.
|
||
The most notable change here is to formally introduce the `attributes` field, and remove the `span.data` field. This is a breaking change, but worth it in the long-term. If we start accepting `attributes` on transactions as well, we more closely align with the OpenTelemetry schema, and can use the same conventions for both spans and transactions. | ||
|
||
### Shimming the old schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we going to do with the extra
stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b760c5d
to
136d5d8
Compare
136d5d8
to
db96510
Compare
After a lot of deliberation finally pushed updates to this RFC. We've moved the span schema changes out of scope of this RFC as this requires further discussion in ingest/product. As a result the RFC now only focuses on introducing the new top level methods for the performance API. The RFC has also been amended to give more freedom to SDK authors to decide what methods to best introduce, although the recommendation is to use // golang
ctx, span := tracer.Start // .NET
using var myActivity = MyActivitySource.StartActivity # ruby
MyAppTracer.in_span("do_work") do |span|
# do some work that the 'do_work' span tracks!
end # Java
Span childSpan = tracer.spanBuilder("child")
.setParent(Context.current().with(parentSpan))
.startSpan(); |
Rendered RFC
closes #28
This RFC proposes to revamp the performance API in the SDKs. The new API aims to accomplish the following:
The RFC proposal comes in two steps.
Introduce three new top level methods,
Sentry.startActiveSpan
andSentry.startSpan
and their language specific variants as well asSentry.setMeasurement
.Introduce a new span schema that is aligned with OpenTelemetry.Out of scope for this RFC, will be addressed later.