Field-level metrics and SLOs #3602
aryascripts
started this conversation in
Feature request
Replies: 0 comments
Why
Monitoring operations is a great start for finding inefficiencies in our GQL server, but the main culprits for performance issues are individual fields, which can degrade over time and can be individually responsible for breaching the SLOs of particular operations.
The page for an individual field only shows Total calls and RPM, which makes finding inefficiencies in our operations and in our schema time-consuming.
What
GraphQL Hive should report SLO metrics for fields, whether they live in the main schema or in a sub-schema, so we can monitor whether a particular field in a stitched-in schema (called Service in Hive's UI) is the culprit behind a performance issue.
Currently, Hive has great support for metrics on unique operations, which include the following; however, these metrics are NOT available for individual fields.
We want to see the following for fields in our graph schema:
Additionally, if possible, we would also like to know how long each field took when looking at an individual operation.
It seems that Apollo Studio / GraphOS lets you sort fields by latency or error rate.
Is this something possible with supported GraphQL servers and Hive together?
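To make "how long each field took" concrete, here is a minimal sketch of per-field timing done by wrapping resolvers. Everything in it (`withTiming`, `fieldDurations`, the `"Query.cart"` key format) is a hypothetical name for illustration and is not part of Hive's or any GraphQL server's API; it only shows the kind of data that would have to be collected.

```typescript
// Hypothetical helper names; not an existing Hive or GraphQL server API.
type Resolver<TArgs, TResult> = (args: TArgs) => Promise<TResult> | TResult;

// Recorded durations in milliseconds, keyed by "TypeName.fieldName".
const fieldDurations = new Map<string, number[]>();

function recordDuration(field: string, ms: number): void {
  const samples = fieldDurations.get(field) ?? [];
  samples.push(ms);
  fieldDurations.set(field, samples);
}

// Wraps a resolver so that every call records its wall-clock duration,
// whether the resolver returns a value or throws.
function withTiming<TArgs, TResult>(
  field: string,
  resolve: Resolver<TArgs, TResult>,
): (args: TArgs) => Promise<TResult> {
  return async (args: TArgs): Promise<TResult> => {
    const start = Date.now();
    try {
      return await resolve(args);
    } finally {
      recordDuration(field, Date.now() - start);
    }
  };
}
```

A resolver for `cart` would then be registered as `withTiming("Query.cart", originalCartResolver)`, and the samples in `fieldDurations` could be shipped alongside the usage report for the operation.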
Example
Given a page for a particular operation with fields that can come from a different sub-schema.
cart comes from cart-service (stitched)
user comes from user-service (stitched)
It would be helpful to see the following information, which helps us see that cart in particular is taking the longest for this operation.
p50: 50ms, p90: 100ms, p99: 200ms
p50: 10ms, p90: 20ms, p99: 40ms
p50: 5ms, p90: 8ms, p99: 12ms
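For illustration, p50/p90/p99 values like the ones above could be derived from raw per-field duration samples roughly as follows. This is only a sketch using the nearest-rank percentile method; `percentile`, `summarize`, and `slowestFirst` are hypothetical names, and how Hive actually aggregates latencies may be different.

```typescript
// Hypothetical helper names; a sketch, not Hive's actual aggregation.

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("cannot compute a percentile of zero samples");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

interface LatencySummary {
  p50: number;
  p90: number;
  p99: number;
}

// Summarizes the duration samples (in milliseconds) of one field.
function summarize(samples: number[]): LatencySummary {
  return {
    p50: percentile(samples, 50),
    p90: percentile(samples, 90),
    p99: percentile(samples, 99),
  };
}

// Returns field names ordered by p99 latency, slowest first.
function slowestFirst(durations: Map<string, number[]>): string[] {
  return [...durations.entries()]
    .map(([field, samples]) => ({ field, p99: percentile(samples, 99) }))
    .sort((a, b) => b.p99 - a.p99)
    .map((entry) => entry.field);
}
```

Sorting by p99 is just one choice; the same shape would work for sorting by p50, p90, or error rate, which is the kind of view requested here.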