ref(eap): Update interface of eap mutations #335
versions in use: The following repositories use one of the schemas you are editing. It is recommended to roll out schema changes in small PRs, meaning that if those used versions lag behind the latest, it is probably best to update those services before rolling out your change.
latest version: 0.1.113
changes considered breaking: schemas/snuba-eap-mutations.v1.schema.json
benign changes: schemas/snuba-eap-mutations.v1.schema.json
@@ -1,7 +1,7 @@
{
  "filter": {
    "organization_id": 1500,
    "_sort_timestamp": 150,
Curious why we had to make this change?
not strictly necessary, but now there is just one place that computes _sort_timestamp (the snuba consumers), instead of whichever users of the mutations platform we have. _sort_timestamp is an attribute derived from start_timestamp_ms and IMO should never have been exposed outside of snuba in the first place
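To illustrate the point about having a single place that computes _sort_timestamp, here is a minimal Python sketch. The function names and the truncation to whole seconds are assumptions made for illustration only; this is not the actual snuba consumer code.

```python
def derive_sort_timestamp(start_timestamp_ms: int) -> int:
    # Assumption: _sort_timestamp is the start time truncated to whole seconds.
    return start_timestamp_ms // 1000


def normalize_filter(mutation_filter: dict) -> dict:
    """Return a copy of the filter with _sort_timestamp filled in.

    Hypothetical helper: callers send only start_timestamp_ms, and the
    consumer is the single place that derives _sort_timestamp from it.
    """
    normalized = dict(mutation_filter)
    normalized["_sort_timestamp"] = derive_sort_timestamp(
        normalized["start_timestamp_ms"]
    )
    return normalized


example = {"organization_id": 1500, "start_timestamp_ms": 150_000}
print(normalize_filter(example))
```

With this shape, users of the mutations platform never need to know how the sort key is derived.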
please check https://github.com/getsentry/snuba/pull/6344/files
makes sense
I think we need to be a little careful here: _sort_timestamp exists to abstract which timestamp values we use to sort the data, so it might not always be the start timestamp.
@evanh I think we have to either expose _sort_timestamp or start_timestamp_ms to users of the mutation interface. But there's no rush to make this interface change.
We should only expose start_timestamp to users of the mutation interface and note that it's meant to be a microsecond-precision timestamp, not milliseconds. Then we can round it in the consumer to whatever precision _sort_timestamp uses (right now it rounds to second precision).
We do this to be able to sort spans by start_timestamp and get the appropriate order, since some spans can be started in quick succession within the same millisecond. That's also why we store it in ClickHouse as a DateTime64.
Also, we use a float64 in the span, as a timestamp in seconds with microsecond precision, so why not keep that? ClickHouse supports ingesting this timestamp into a DateTime64.
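A small Python sketch of why microsecond precision matters for ordering, assuming the float64 seconds representation described above. The helper names are hypothetical, and truncating (rather than rounding) to whole seconds for the sort key is an assumption.

```python
MICROS_PER_SECOND = 1_000_000


def to_micros(start_timestamp: float) -> int:
    """Convert float seconds since the epoch to integer microseconds."""
    return round(start_timestamp * MICROS_PER_SECOND)


def to_sort_timestamp(start_timestamp: float) -> int:
    # Assumption: the sort key is the start time truncated to whole seconds.
    return to_micros(start_timestamp) // MICROS_PER_SECOND


# Two spans started within the same millisecond.
span_a = 150.000100
span_b = 150.000900

# At millisecond precision the two spans are indistinguishable...
same_millisecond = int(span_a * 1000) == int(span_b * 1000)

# ...but at microsecond precision their order is preserved.
ordered = to_micros(span_a) < to_micros(span_b)

print(same_millisecond, ordered, to_sort_timestamp(span_a))
```

Both spans share the same second-precision sort key, so the finer-grained start_timestamp is what keeps them in order within it.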
this is how start timestamp is represented in a span on kafka: https://github.com/getsentry/snuba/blob/a612e70b017a95969a6f6f21332952cf5d386b21/rust_snuba/src/processors/eap_spans.rs#L300
intended to match that
what is the interface supposed to be for EAP then?
> I think we need to be a little careful here: _sort_timestamp exists to abstract which timestamp values we use to sort the data, so it might not always be the start timestamp.
While that's true, I think there's a higher chance we move towards sending incomplete spans from the SDKs and having to update end_timestamp, which makes it not viable for the primary key.
getsentry/snuba#6344