chore: cache parsed requests by request #1900
Conversation
```ts
if (parsedResult instanceof Error) {
  const requestPublicUrl = getPublicUrlFromRequest(request)
  parsedGraphQLRequestCache.set(request, undefined)
```
this is the only functional difference, since we'll throw on the first request only and `undefined` will be the result for other requests, which should avoid spamming the logs
this still leads to the log occurring, and may be better anyway, since it's 1 log per request instead of 1 log per handler
```ts
@@ -167,39 +167,44 @@ async function getGraphQLInput(request: Request): Promise<GraphQLInput | null> {
  }
}

const parsedGraphQLRequestCache = new WeakMap<Request, ParsedGraphQLRequest>()
```
A silly question here: why not lift the caching to the individual `*Handler` classes and make it apply to HTTP requests too? The issue you describe is in no way exclusive to the GraphQL handler; it simply surfaces there more visibly.

I think this is a good proposal, and we should add something like `private cache: WeakMap<Request, X>` onto the `HttpRequest`/`GraphQL` request classes. I don't think this belongs to the generic `RequestHandler`, though.
But I'm open to discussing this in the context of the `RequestHandler` class too. I suppose there aren't many cases when we wouldn't want to cache the parsing result using the intercepted `Request` as the key.

If we move this to the RH class, all userland handlers will benefit from caching automatically. If we decide to go this route, we should make the cache property `protected` instead.
The main task at hand is to figure out the cases when caching would be harmful.
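The `protected` cache proposal above could look roughly like this. This is a hedged sketch, not msw's actual implementation: the class and method names (`doParse`, `CountingHandler`) are illustrative, and the real `RequestHandler` has a much richer API:

```typescript
interface ParsedResult {
  match: boolean
}

// Base class owning a protected per-instance cache keyed by the
// intercepted request object, so subclasses inherit memoization.
abstract class RequestHandler {
  protected cache = new WeakMap<object, ParsedResult>()

  // Subclasses provide the actual (potentially expensive) parsing.
  protected abstract doParse(request: object): ParsedResult

  parse(request: object): ParsedResult {
    const cached = this.cache.get(request)
    if (cached) {
      return cached
    }
    const parsed = this.doParse(request)
    this.cache.set(request, parsed)
    return parsed
  }
}

// A subclass that counts how often real parsing happens.
class CountingHandler extends RequestHandler {
  public parseCalls = 0

  protected doParse(): ParsedResult {
    this.parseCalls++
    return { match: true }
  }
}

const handler = new CountingHandler()
const request = { url: "/resource" }
const first = handler.parse(request)
const second = handler.parse(request)
```

Note that the cache here lives on each handler instance, which is exactly the limitation raised later in the thread: ten handlers seeing the same request would still each miss once.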
Yep! I think that's a very natural follow-up to make this more generalized.
Right now there are a few issues with doing that, which I'll dig into a bit as well. I don't think they're insurmountable, just issues as it stands now:
- We can't define these on the handler classes, because we want to share the result across handlers. If there are 10 GraphQL handlers defined, we only want to parse the request once and share that result across all 10 of them. If we add the cache to the handler class, there will be a single cache per handler instance, which puts us back where we are now: parsing occurs in each handler instead of being shared across them.
- `RequestHandler.parse` doesn't implement anything yet (and should probably be abstract? Not an issue, just an observation), but because of this there's no simple place to define the caching (and it probably won't go on this class, or at least would look different if it did):
  msw/src/core/handlers/RequestHandler.ts, lines 148 to 157 in b440f12:

  ```ts
  /**
   * Parse the intercepted request to extract additional information from it.
   * Parsed result is then exposed to other methods of this request handler.
   */
  async parse(_args: {
    request: Request
    resolutionContext?: ResponseResolutionContext
  }): Promise<ParsedResult> {
    return {} as ParsedResult
  }
  ```
- HttpHandlers really only match the URL and extract cookies:

  msw/src/core/handlers/HttpHandler.ts, lines 109 to 125 in b440f12:

  ```ts
  async parse(args: {
    request: Request
    resolutionContext?: ResponseResolutionContext
  }) {
    const url = new URL(args.request.url)
    const match = matchRequestUrl(
      url,
      this.info.path,
      args.resolutionContext?.baseUrl,
    )
    const cookies = getAllRequestCookies(args.request)

    return {
      match,
      cookies,
    }
  }
  ```
We could similarly cache the cookie parsing here, but the URL match is a per-handler match, not a per-request one.
In the end we have two slight variations: things we parse per request and things we parse per handler, so mostly this would be about helping to manage that split efficiently.
GraphQL parsing is the more expensive of the two, so handling that first and then bringing the pattern elsewhere is probably the lowest cost for the highest value, but I definitely think there's a pattern we could pull on here for more request caching.
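The per-request vs. per-handler split above can be made concrete. In this hypothetical sketch (simplified signatures; `matchRequestUrl` and `getAllRequestCookies` only loosely mirror msw's helpers), cookie parsing depends on nothing but the request, so it is cached by request object, while the URL match depends on each handler's path and so stays per handler:

```typescript
let cookieParses = 0
const cookieCache = new WeakMap<object, Record<string, string>>()

// Per-request work: the result depends only on the request,
// so it can be memoized by the request object.
function getAllRequestCookies(request: {
  cookieHeader: string
}): Record<string, string> {
  const cached = cookieCache.get(request)
  if (cached) return cached
  cookieParses++
  const cookies = Object.fromEntries(
    request.cookieHeader
      .split("; ")
      .map((pair) => pair.split("=") as [string, string]),
  )
  cookieCache.set(request, cookies)
  return cookies
}

// Per-handler work: the result depends on each handler's path,
// so it cannot be keyed by the request alone.
function matchRequestUrl(url: string, path: string): { matches: boolean } {
  return { matches: url === path }
}

const request = { url: "/user", cookieHeader: "session=abc; theme=dark" }

// Three handlers with different paths share a single cookie parse,
// while each computes its own URL match.
const results = ["/user", "/cart", "/user"].map((path) => ({
  match: matchRequestUrl(request.url, path),
  cookies: getAllRequestCookies(request),
}))
```

Any generalized caching layer would have to keep these two buckets distinct, which is the "managing that split efficiently" concern above.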
Oh, a great point!

This really falls under a bit larger refactoring I had in mind for `getResponse`. Maybe we can discuss and kick that off?

Regarding no. 2, the caching would happen in the `.run()` method anyway. This way the base class would ensure consistent caching across different overrides of the parse/predicate methods.

I see that the cache should come from above. That's why I mentioned the refactoring. I think the `getResponse()` function should be made into a class. An orchestrator of sorts. It would hold the cache, perhaps.
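The orchestrator idea above might look something like this. This is a speculative sketch of the proposed refactoring, not msw's real `getResponse`; `ParsedInput`, `Handler.run`, and `parseOnce` are all invented names for illustration:

```typescript
interface ParsedInput {
  query: string
}

interface Handler {
  run(request: object, parsed: ParsedInput): boolean
}

// A class replacing the getResponse() function: it owns the cache
// "from above" and drives each handler, so caching stays consistent
// no matter how individual handlers are implemented.
class ResponseOrchestrator {
  private cache = new WeakMap<object, ParsedInput>()
  public parseCalls = 0

  private parseOnce(request: { body: string }): ParsedInput {
    const cached = this.cache.get(request)
    if (cached) return cached
    this.parseCalls++
    const parsed: ParsedInput = { query: request.body }
    this.cache.set(request, parsed)
    return parsed
  }

  getResponse(
    request: { body: string },
    handlers: Handler[],
  ): Handler | undefined {
    // Parse once and hand the result to every handler, instead of
    // each handler re-doing the work in its own parse().
    const parsed = this.parseOnce(request)
    return handlers.find((handler) => handler.run(request, parsed))
  }
}

const orchestrator = new ResponseOrchestrator()
const request = { body: "query GetUser { user }" }
const handlers: Handler[] = [
  { run: () => false },
  { run: () => false },
  { run: (_request, parsed) => parsed.query.includes("user") },
]
const matched = orchestrator.getResponse(request, handlers)
```

Because the cache lives on the orchestrator rather than on any handler, ten handlers (or a second resolution pass over the same request) share a single parse.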
One thing that's a bit more intricate here is that the cache becomes more complicated if we pass it around. Each request handler type would have a separate bit of caching. We might not need to generalize this in practice, though, and optimizing a few distinct methods might be less effort for the major payout (as I mentioned in that discussion).

Closing this one in favor of the experiment in #1905.
Mostly opening this for some discussion, but if we decide this path is reasonable, we should be able to land it.

Leaving it in draft state for now, until I get some cycles to make a repro and do a bit of profiling to validate the assumption that this is worth it.

Improves performance when many GraphQL handlers are defined and the actual matching handler is towards the bottom.

The PR should look much smaller when viewed without whitespace.

I noticed in a project with many GraphQL handlers that resolution seemed a bit slow.
It turns out that we re-parse and re-log the same request for every handler, even though the resulting values produced are identical. This takes a slightly modified approach, where, for GraphQL handlers, we only parse a single time per request object and immediately return that result if it exists.
This should save on memory allocation and parsing, because we make fewer `request.clone()` and `.text()`/`.json()` calls, as well as on time, since we avoid parsing per handler and instead parse only once per request. If the request objects aren't identical, we'll get new parses, and since we cache in a `WeakMap`, once a request is GC'd its memory in the cache will get cleared as well.
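The `WeakMap` lifetime claim above rests on two properties worth making explicit. Entries are keyed by object identity, not by URL, so two distinct request objects for the same endpoint get separate entries; and a `WeakMap` holds its keys weakly, so an unreachable request's entry becomes eligible for garbage collection (GC itself can't be asserted portably, so this small sketch only demonstrates the identity part):

```typescript
const cache = new WeakMap<object, string>()

const requestA = { url: "http://localhost/graphql" }
// Same URL, but a different object: a different cache key.
const requestB = { url: "http://localhost/graphql" }

cache.set(requestA, "parsed-A")
// Once requestA itself becomes unreachable, the "parsed-A" entry
// can be collected; the cache never pins requests in memory.
```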