v1.50.0
🚀 Features
Support local persisted query manifests for use with offline licenses (Issue #4587)
Adds experimental support for passing persisted query manifests to use instead of the hosted Uplink version.
For example:
persisted_queries:
  enabled: true
  log_unknown: true
  experimental_local_manifests:
    - ./persisted-query-manifest.json
  safelist:
    enabled: true
    require_id: false
Support conditions on standard telemetry events (Issue #5475)
Enables setting conditions on standard events.
For example:
telemetry:
  instrumentation:
    events:
      router:
        request:
          level: info
          condition: # Only log the router request if you sent `x-log-request` with the value `enabled`
            eq:
              - request_header: x-log-request
              - "enabled"
        response: off
        error: error
        # ...
Not supported for batched requests.
By @bnjjj in #5476
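To make the semantics of the eq condition above concrete, here is a minimal Python sketch of how such a condition could be evaluated against a request's headers. It illustrates the intended behavior only and is not the router's implementation; the function name eq_condition_matches and the plain-dict representation of headers are assumptions made for this example.

```python
def eq_condition_matches(condition, headers):
    """Return True when both operands of an `eq` condition resolve to the same value.

    `headers` is assumed to be a dict keyed by lower-cased header name.
    """
    def resolve(operand):
        # A mapping with `request_header` is treated as a selector;
        # anything else is treated as a literal value.
        if isinstance(operand, dict) and "request_header" in operand:
            return headers.get(operand["request_header"].lower())
        return operand

    left, right = condition["eq"]
    return resolve(left) == resolve(right)


# The condition from the configuration above, written as a Python structure.
condition = {"eq": [{"request_header": "x-log-request"}, "enabled"]}

print(eq_condition_matches(condition, {"x-log-request": "enabled"}))   # True
print(eq_condition_matches(condition, {"x-log-request": "disabled"}))  # False
print(eq_condition_matches(condition, {}))                             # False
```

With this configuration, only requests carrying x-log-request: enabled would produce a router request event.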
Make status_code available for router_service responses in Rhai scripts (Issue #5357)
Adds response.status_code on Rhai router_service responses. Previously, status_code was only available on subgraph_service responses.
For example:
fn router_service(service) {
    let f = |response| {
        if response.is_primary() {
            print(response.status_code);
        }
    };
    service.map_response(f);
}
By @IvanGoncharov in #5358
Add new values for the supergraph query selector (PR #5433)
Adds support for four new values for the supergraph query selector:
- aliases: the number of aliases in the query
- depth: the depth of the query
- height: the height of the query
- root_fields: the number of root fields in the query
You can use this data to understand how your graph is used and to help determine where to set limits.
For example:
telemetry:
  instrumentation:
    instruments:
      supergraph:
        'query.depth':
          description: 'The depth of the query'
          value:
            query: depth
          unit: unit
          type: histogram
Add the ability to drop metrics using otel views (PR #5531)
Using otel views, you can drop specific metrics that you don't want sent to your APM.
For example:
telemetry:
  exporters:
    metrics:
      common:
        service_name: apollo-router
        views:
          - name: apollo_router_http_request_duration_seconds # Instrument name you want to edit. You can use wildcard in names. If you want to target all instruments just use '*'
            aggregation: drop
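Before adding a wildcard view, it can help to preview which instrument names a pattern would select. The Python sketch below assumes simple glob semantics, where * matches any run of characters, and uses the standard library's fnmatch as a stand-in; it is not the router's matcher. Apart from apollo_router_http_request_duration_seconds, the instrument names are hypothetical.

```python
from fnmatch import fnmatchcase


def selected_instruments(view_name, instrument_names):
    """Return the instrument names that a view named `view_name` would select.

    Assumes glob-style matching; the router's own matching rules may differ.
    """
    return [name for name in instrument_names if fnmatchcase(name, view_name)]


# Only the first name comes from the example above; the others are hypothetical.
instruments = [
    "apollo_router_http_request_duration_seconds",
    "apollo_router_example_counter",
    "my_custom_metric",
]

print(selected_instruments("apollo_router_http_request_duration_seconds", instruments))
print(selected_instruments("apollo_router_*", instruments))
print(selected_instruments("*", instruments))
```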
Add operation_name selector for router service in custom telemetry (PR #5392)
Adds an operation_name selector for the router service.
Previously, accessing operation_name was only possible through the response_context router service selector.
For example:
telemetry:
  instrumentation:
    instruments:
      router:
        http.server.request.duration:
          attributes:
            graphql.operation.name:
              operation_name: string
🐛 Fixes
Fix Cache-Control aggregation and age calculation in entity caching (PR #5463)
Enhances the reliability of caching behaviors in the entity cache feature by:
- Ensuring the proper calculation of max-age and s-maxage fields in the Cache-Control header sent to clients.
- Setting appropriate default values if a subgraph does not provide a Cache-Control header.
- Guaranteeing that the Cache-Control header is aggregated consistently, even if the plugin is disabled entirely or on specific subgraphs.
Fix telemetry events when trace isn't sampled and preserve attribute types (PR #5464)
Improves accuracy and performance of event telemetry by:
- Displaying custom event attributes even if the trace is not sampled
- Preserving original attribute type instead of converting it to string
- Ensuring http.response.body.size and http.request.body.size attributes are treated as numbers, not strings
⚠️ Exercise caution if you have monitoring enabled on your logs, as attribute types may have changed. For example, attributes like http.response.status_code are now numbers (200) instead of strings ("200").
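If a log consumer has to handle events emitted both before and after this change, it can normalize the attribute rather than assume one type. The Python sketch below assumes, purely for illustration, that each log line is a flat JSON object containing the attribute; the real log layout depends on your logging configuration, and the function name status_code_of is hypothetical.

```python
import json


def status_code_of(log_line):
    """Return http.response.status_code as an int, whether it was logged as 200 or "200".

    Returns None when the attribute is absent.
    """
    record = json.loads(log_line)
    value = record.get("http.response.status_code")
    if value is None:
        return None
    return int(value)


# Hypothetical log lines: the string form (older) and the numeric form (newer).
old_style = '{"http.response.status_code": "200"}'
new_style = '{"http.response.status_code": 200}'

print(status_code_of(old_style) == status_code_of(new_style))  # True
```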
Enable coprocessors for subscriptions (PR #5542)
Ensures that coprocessors correctly handle subscriptions by preventing skipped data from being overwritten.
Improve accuracy of query_planning.plan.duration (PR #5)
Previously, the apollo.router.query_planning.plan.duration metric inaccurately included additional processing time beyond query planning. The additional time included pooling time, which is already accounted for in the metric. After this update, apollo.router.query_planning.plan.duration now accurately reflects only the query planning duration without additional processing time.
For example, before the change, metrics reported:
2024-06-21T13:37:27.744592Z WARN apollo.router.query_planning.plan.duration 0.002475708
2024-06-21T13:37:27.744651Z WARN apollo.router.query_planning.total.duration 0.002553958
2024-06-21T13:37:27.748831Z WARN apollo.router.query_planning.plan.duration 0.001635833
2024-06-21T13:37:27.748860Z WARN apollo.router.query_planning.total.duration 0.001677167
Post-change metrics now accurately reflect:
2024-06-21T13:37:27.743465Z WARN apollo.router.query_planning.plan.duration 0.00107725
2024-06-21T13:37:27.744651Z WARN apollo.router.query_planning.total.duration 0.002553958
2024-06-21T13:37:27.748299Z WARN apollo.router.query_planning.plan.duration 0.000827
2024-06-21T13:37:27.748860Z WARN apollo.router.query_planning.total.duration 0.001677167
By @xuorig and @lrlna in #5530
Remove deno_crypto package due to security vulnerability (Issue #5484)
Removes deno_crypto due to the vulnerability reported in curve25519-dalek.
Since the router exclusively used deno_crypto for generating UUIDs using the package's random number generator, this vulnerability had no impact on the router.
Add content-type header to failed auth checks (Issue #5496)
Adds a content-type header when the authentication service returns AUTH_ERROR.
By @andrewmcgivery in #5497
Implement manual caching for AWS Security Token Service credentials (PR #5508)
In the AWS Security Token Service (STS), the CredentialsProvider chain includes caching, but this functionality was missing for AssumeRoleProvider.
This change introduces a custom CredentialsProvider that functions as a caching layer with these rules:
- Cache Expiry: Credentials retrieved are stored in the cache based on their credentials.expiry() time if specified, or indefinitely (ever) if not.
- Automatic Refresh: Five minutes before cached credentials expire, an attempt is made to fetch updated credentials.
- Retry Mechanism: If credential retrieval fails, another attempt is scheduled after a one-minute interval.
- (Coming soon, not included in this change) Manual Refresh: The CredentialsProvider will expose a refresh_credentials() function. This can be manually invoked, for instance, upon receiving a 401 error during a subgraph call.
By @o0Ignition0o in #5508
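The caching rules above can be sketched in a few lines of Python. This is an illustrative model, not the router's Rust implementation: the class name, the fetch callable (returning a credentials value and an optional expiry as a Unix timestamp), and the injectable clock are all assumptions. The sketch also refreshes lazily when credentials are requested, whereas the change describes scheduled refresh attempts.

```python
import time

REFRESH_MARGIN_SECS = 5 * 60  # refresh five minutes before expiry
RETRY_DELAY_SECS = 60         # retry one minute after a failed fetch


class CachingCredentialsProvider:
    """Caches the result of `fetch`, a hypothetical callable returning
    (credentials, expiry_unix_timestamp_or_None)."""

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch
        self._clock = clock
        self._credentials = None
        self._next_attempt_at = 0.0  # when to call `fetch` next; None means never

    def credentials(self):
        now = self._clock()
        if self._next_attempt_at is not None and now >= self._next_attempt_at:
            try:
                self._credentials, expiry = self._fetch()
            except Exception:
                # Retry rule: keep whatever is cached and try again in one minute.
                self._next_attempt_at = now + RETRY_DELAY_SECS
            else:
                # Expiry rule: without an expiry, cache indefinitely;
                # otherwise refresh five minutes before the expiry time.
                self._next_attempt_at = (
                    None if expiry is None else expiry - REFRESH_MARGIN_SECS
                )
        return self._credentials


# Usage with a fake clock and a fake fetcher whose credentials last one hour.
now = [1000.0]
calls = []


def fake_fetch():
    calls.append(now[0])
    return ("creds-%d" % len(calls), now[0] + 3600)


provider = CachingCredentialsProvider(fake_fetch, clock=lambda: now[0])
provider.credentials()  # first access fetches
now[0] += 3000          # more than five minutes before expiry: served from cache
provider.credentials()
now[0] += 400           # inside the five-minute window: fetches again
provider.credentials()
print(len(calls))  # 2
```

Injecting the clock keeps the sketch testable without waiting for real time to pass.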
📃 Configuration
Align entity caching configuration structure for subgraph overrides (PR #5474)
Aligns the entity cache configuration structure to the same all/subgraphs override pattern found in other parts of the router configuration. For example, see the header propagation configuration.
An automated configuration migration is provided so existing usage is unaffected.
Restrict custom instrument values to relevant stages (PR #5472)
Previously, custom instruments at each request lifecycle stage could specify unrelated values, like using event_unit for a router instrument. Now, only relevant values for each stage are allowed.
Additionally, GraphQL instruments no longer need to specify field_event. There is no automatic migration for this change since GraphQL instruments are still experimental.
telemetry:
  instrumentation:
    instruments:
      graphql:
        # OLD definition of a custom instrument that measures the number of fields
        my.unit.instrument:
          value: field_unit # Changes to unit
        # NEW definition
        my.unit.instrument:
          value: unit
        # OLD
        my.custom.instrument:
          value: # Changes to not require `field_custom`
            field_custom:
              list_length: value
        # NEW
        my.custom.instrument:
          value:
            list_length: value
The following misconfiguration is no longer possible:
router_instrument:
  value:
    event_custom:
      request_header: foo
By @BrynCooke in #5472
🛠 Maintenance
Add cost information to protobuf traces (PR #5430)
Exports query cost information on Apollo protobuf traces if experimental_demand_control is enabled. Also displays exported information in GraphOS Studio.
By @BrynCooke in #5430
Improve xtask release process (PR #5275)
Introduces a new xtask command to automate the release process by:
- Following the commands defined in our RELEASE_CHECKLIST.md file
- Storing the current state of the process in the .release-state.json file
- Prompting the user regularly for new info.
These changes remove a lot of the manual environment variable setup and command copying previously required.
Execute the new command by running cargo xtask release start, then calling cargo xtask release continue at each step.
Isolate usage of hyper v0.14 types for future compatibility (PR #5175)
Isolates usage of hyper types in response to the recent release of hyper v1.0. The new major version introduced improvements along with breaking changes. The goal is to reduce the impact of these breaking changes for the upcoming Router upgrade to the new hyper, and ensure that future upgrades are straightforward.
This change only affects internal code and doesn't affect the router's public API or execution.
Introduce fuzz testing comparison between the router and monolithic subgraph (PR #5302)
Implements a fuzzer that can run on any router configuration to enhance router robustness and battle test new features.
Adds a router fuzzing target that compares the result of a query sent to the router with the result of the same query sent to a monolithic subgraph, using a supergraph schema that points all subgraphs to that same monolith.
The monolithic subgraph consolidates code from typical subgraphs like accounts, products, reviews, and inventory (taken from the starstuff repository).
This setup allows the subgraph to directly handle queries traditionally handled by individual subgraphs.
The invariant we check is that we should get the same result by sending the query to the subgraph directly or through a router that will artificially cut up the query into multiple subgraph requests, according to the supergraph schema.
To execute it:
- Start a router using the schema fuzz/subgraph/supergraph.graphql
- Start the subgraph with cargo run --release in fuzz/subgraph. It will start a subgraph on port 4005.
- Start the fuzzer from the repo root with cargo +nightly fuzz run router
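The invariant the fuzzer checks can be reproduced by hand for a single query. The Python sketch below is an assumption-laden illustration rather than the fuzz target itself: the router URL, the subgraph URL path, and the choice to compare only the data field are all hypothetical; only port 4005 comes from the steps above.

```python
import json
import urllib.request

ROUTER_URL = "http://localhost:4000/"    # assumed router address
SUBGRAPH_URL = "http://localhost:4005/"  # port from the steps above; path assumed


def post_query(url, query):
    """POST a GraphQL query to `url` and return the decoded JSON response."""
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"content-type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def same_result(router_response, subgraph_response):
    """Invariant: the router and the monolith return the same `data`."""
    return router_response.get("data") == subgraph_response.get("data")


def check_invariant(query):
    """Send `query` down both paths and report whether the results agree."""
    return same_result(post_query(ROUTER_URL, query), post_query(SUBGRAPH_URL, query))
```

With the router and the subgraph running, calling check_invariant with any query the schema accepts returns False when the two paths disagree.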
📚 Documentation
Add telemetry docs pages for Dynatrace (PR #5533)
Adds telemetry documentation for Dynatrace metrics and trace exporters.
By @andrewmcgivery in #5533
Fix docs for 'exists' condition (PR #5446)
Fixes documentation example for the exists condition.
The condition expects a single selector instead of an array.
For example:
telemetry:
  instrumentation:
    instruments:
      router:
        my.instrument:
          value: duration
          type: counter
          unit: s
          description: "my description"
          # ...
          # This instrument will only be mutated if the condition evaluates to true
          condition:
            exists:
              request_header: x-req-header
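The shape difference is easy to miss, so here is a small Python sketch that models it: exists takes one selector mapping, and an array is rejected. The function name and the plain-dict header representation are assumptions for illustration; this does not reproduce the router's configuration validation.

```python
def exists_condition_matches(condition, headers):
    """Return True when the header named by the `exists` selector is present.

    `headers` is assumed to be a dict keyed by lower-cased header name.
    """
    selector = condition["exists"]
    if isinstance(selector, list):
        # The pre-fix documentation showed an array here, which is not accepted.
        raise TypeError("`exists` takes a single selector, not an array")
    return selector["request_header"].lower() in headers


condition = {"exists": {"request_header": "x-req-header"}}

print(exists_condition_matches(condition, {"x-req-header": "anything"}))  # True
print(exists_condition_matches(condition, {}))                            # False
```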
🧪 Experimental
Add experimental extended reference reporting configuration (PR #5331)
Adds an experimental configuration to turn on extended references in Apollo usage reports, including references to input object fields and enum values.
This new configuration (telemetry.apollo.experimental_apollo_metrics_reference_mode: extended) only works when experimental_apollo_metrics_generation_mode: new is configured.
Apollo doesn't yet recommend these configurations in production while we continue to verify that the new functionality works as expected.
Add experimental field metric reporting configuration (PR #5443)
Adds an experimental configuration to report field usage metrics to GraphOS Studio without requiring subgraphs to support federated tracing (ftv1).
The reported field usage data doesn't currently appear in GraphOS Studio.
telemetry:
  apollo:
    experimental_local_field_metrics: true
There is currently a small performance impact from enabling this feature.
By @tninesling, @Geal, @bryn in #5443
Add experimental h2c communication capability for communicating with coprocessor (Issue #5299)
Allows HTTP/2 Cleartext (h2c) communication with coprocessors for scenarios where the networking architecture/mesh connections don't support or require TLS for outbound communications from the router.
Introduces a new coprocessor.client configuration. The first and currently only option is experimental_http2. The available option settings are the same as the experimental_http2 traffic shaping settings:
- disable: disable HTTP/2, use HTTP/1.1 only
- enable: HTTP URLs use HTTP/1.1, HTTPS URLs use TLS with either HTTP/1.1 or HTTP/2 based on the TLS handshake
- http2only: HTTP URLs use h2c, HTTPS URLs use TLS with HTTP/2
- not set: defaults to enable
Note
Configuring experimental_http2: http2only where the network doesn't support HTTP/2 results in a failed coprocessor connection!
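The option settings above amount to a small lookup from setting and URL scheme to protocol. The Python sketch below restates that lookup so the combinations can be checked at a glance; the function name and the coprocessor URLs are hypothetical, and the returned strings are descriptions, not values the router emits.

```python
from urllib.parse import urlsplit


def coprocessor_protocol(setting, url):
    """Describe the protocol used for `url` under an experimental_http2 setting.

    `setting` is "disable", "enable", "http2only", or None when not set.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError("unsupported URL scheme: %r" % scheme)
    if setting is None:
        setting = "enable"  # not set defaults to enable
    if setting == "disable":
        return "HTTP/1.1 only"
    if setting == "enable":
        if scheme == "http":
            return "HTTP/1.1"
        return "TLS with HTTP/1.1 or HTTP/2, chosen by the TLS handshake"
    if setting == "http2only":
        return "h2c" if scheme == "http" else "TLS with HTTP/2"
    raise ValueError("unknown experimental_http2 setting: %r" % setting)


# Hypothetical coprocessor URLs.
print(coprocessor_protocol("http2only", "http://coprocessor.internal:8081"))  # h2c
print(coprocessor_protocol(None, "http://coprocessor.internal:8081"))         # HTTP/1.1
```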