-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ref(node): Improve Express URL Parameterization #5450
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good!
In your PR description, where you say "The second handler matches the route /api/users/123
," do you mean that to be the parameterized route instead?
packages/node-integration-tests/suites/express/sentry-trace/baggage-header-out/server.ts
Outdated
Show resolved
Hide resolved
Co-authored-by: Katie Byers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
make our URL parameterization for Express (introduced in #5450) compatible with RegEx-defined routes. Previously, as reported in #5481, our parameterization logic would cause a crash because instead of a string, the matched route would be of type `RegExp`. This PR adjusts our logic so that we detect if we get a matched string our regex. In the latter case, we also append a `'/'` to the reconstructed partial route name so that the regex is closed.
Fix an error thrown from our Node SDK that when our Express router parameterization logic (introduced in #5450) would try to parameterize a route consisting of an array of paths (which could be strings or RegExes). Since a crucial part of our parameterization approach is to determine if the currently resolved layer is the "last" part of the route (in which case we update the transaction name and source), this patch also makes a few modifications to determine this correctly for arrays. In order to do so, we check for the number of URL segments in the original URL vs. that of the parameterized URL. In the case of arrays, we'll likely have more segments than in the raw URL path. Therefore, we balance this out by determining the number of extra segments we got from the array. Additionally, added tests for array routes.
Hotfixes a problem in our router instrumentation introduced in #5450 which would cause Express 5 Node apps to crash on startup. Because the internals of Express 5 (which is still in beta) have changed quite a bit compared to Express 4, there is unfortunately no quick way of supporting the new version in our current router instrumentation. Therefore, this patch simply checks if the router we retrieve from Express 4 apps (which we get from `app._router`) exists. In case it does, we can patch it; in case it doesn't, we know that the integration is either used with Express 3 or 5. In both cases, we early return and do not patch the router. We can revisit adding proper support for early URL parameterization of Express 5 apps but for now, this PR will unblock Express 5 users by skipping instrumentation. This means that for Express 5, we fall back to our old way of instrumenting routes (which means we get paramterized routes for transaction names but only after the route handler was executed and the transaction is finished). fixes #5501
Hi Team I'm wondering if this PR also solves express with a nested router? |
Hi @brotherko I think transaction-name-parameterization-wise nested routers should work. However, there are still missing spans on nested routers as outlined in #5510. If you want to try it out, feel free to do so and open an issue in case something is not working as expected. Thanks! |
Hi @Lms24 Thanks i'm not sure if we are referring to the same kind of parameterization here's the example
When I browse => route1/route2/someparameter Is it the intended behavior? Or I'm just misusing express? thank you |
This PR improves the URL parameterization for transaction names in our Express integration.
Previously, we would only obtain a parameterized version of the incoming request's URL after the registered handler(s) finished their job, right before we called
transaction.finish()
. This is definitely too late for DSC propagation, if the handlers made any outgoing request. In that case, the DSC (propagated via thebaggage
header) would not contain a transaction name.As of this PR, we patch the
Express.Router
prototype, more precisely theprocess_params
function. This function contains a reference to the incomingreq
object as well as to thelayer
that matched the route. We hook into the function to extract the matched and parameterized partial route of the layer and we attach it to thereq
object. This happens for every matched layer, until the entire route is resolved, in which stichtched together the entire parameterized route.For the time being, we deliberately ignore route registrations with wildcards (e.g.
app.get('/*', ...)
) when stitching together the parameterized route. After we added a new part to the reconstructed route, we check if it has the same number of segments as the original (raw) URL. In case it has, we assume that the parameterized route is complete and we update the active transaction with the parameterized name. Additionally, we set the transaction source toroute
at this point. In case we never get to the point where we have an equal amount of URL segments, the transaction name is never updated, meaning its source staysurl
(and therefore, it's not added to the DSC). In that case, we continue to update the transaction name like before right before the end of the transaction.After reading the Express source code, we confirmed that the process for resolving parts of a route and handling it is performed sequentially for each matched route segment. This means that each matched layer's handle function is invoked before the next part of the URL is matched. We therefore still have timing issues around DSC population if any of those intermediate handler functions make outgoing requests. A simple example:
The incoming request is
/api/users/123
/api/users/123
and sourceurl
/*
./
/api/users/123
/api/users/:id
/api/users/:id
and set its source toroute
This example shows that we still have timing issues w.r.t DSC propagation. That is, assuming that the Node SDK is in this case the head of the trace and it didn't intercept incoming DSC from the incoming request.
However, with this PR we now at least have the chance to get the correct transaction name early enough.
ref: #5342