-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(resolution): Fix resolution of querystringed imports in ESM files #336
fix(resolution): Fix resolution of querystringed imports in ESM files #336
Conversation
…call in `maybeEmitDep`
@styfle New PR because @lobsterkatie is out over the holidays. You mind taking a look? It's the same changes but with the feedback on Btw, feel free to push any changes you want to make directly to this PR! |
@lforst There are merge conflicts in the original PR as well as this PR |
src/node-file-trace.ts
Outdated
queryString = specifierUrl.search; | ||
|
||
if (specifierUrl.search) { | ||
path = specifier.replace(specifierUrl.search, ''); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could replace the wrong part of the URL if it had repeating content in the string.
How about using the parsed result instead?
path = specifier.replace(specifierUrl.search, ''); | |
path = specifierUrl.pathname; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't use pathname
because we want to exclusively strip the query string off the specifier
. If we use pathname
we would also strip away suff like hash, protocol, host etc. (I know it is a bit edge-casey but still, let's do as little as possible to the specifier here)
I will rename the function to something like splitQueryStringFromSpecifier
and change the return object signature to { queryString, remainingSpecifier }
to make its purpose more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the repeating content, do you have examples of where this might go wrong?
Afaik, as per URL standard, the first ?
will denote the start of the query/search
part of the URL. search
should always start with a ?
and since .replace()
will only replace the first occurrence of a match (unless something like the regex g
flag is used) this should just work IMO.
Another reason not to use pathname
is when used via new URL
it strips off the relative url segments:
new URL("../../src/mymodule", "file://").pathname === "/src/mymodule" // instead of "../../mymodule"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we would also strip away stuff like hash, protocol, host etc.
Did you add tests for those cases? How does Node.js handle those cases?
since .replace() will only replace the first occurrence of a match
Ah yes, you're right.
new URL("../../src/mymodule", "file://").pathname === "/src/mymodule")
Oh good catch! CI is failing though so something is still wrong.
src/node-file-trace.ts
Outdated
let queryString = null; | ||
|
||
if (!cjsResolve) { | ||
const specifierUrl = url.parse(specifier); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just learned url.parse()
was deprecated (again) so lets use new URL()
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestion on how to deal with URL not being able to handle relative paths? Would it be ok for us to use some random base url (eg http://example.com
, file://
, or placeholder-protocol://
)?
Implementation based on best intentions: 44f3d3f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a job for pathToFileURL()
or maybe fileURLToPath()
…string-when-resolving
We don't need this anymore |
Note: This PR is built on top of #322 which I took over because @lobsterkatie is currently out.
Currently, there are two problems with the way
@vercel/nft
handles ESM imports with querystrings (import * as something from './someFile?someQuerystring';
):?
as it would any other character. As a result, it fails to resolve such imports to the non-querystringed files to which they correspond. (In the above example, it looks for a file namedsomeFile?someQuerystring.js
and, as one would expect, doesn't find it.) While this is the correct behavior forrequire
calls in CJS files, it doesn't match the way that Node handles ESM imports (which is to strip the querystring off before attempting to resolve the file).resolve
method is provided, and that custom method does strip querystrings from ESM imports, the querystrings are then not put back on before the imported module is processed. This matters, for example, in cases wherereadFile
is also overridden, with a method whose output is affected by the querystring's value. (Webpack is such a system, so one place this shows up is in Next.js'sNextTraceEntryPointsPlugin
.)This fixes both of those problems by selectively stripping and re-adding the querystring in ESM imports at various points in the
nodeFileTrace
lifecycle. More specifically, inresolveDependecy
and inemitDependency
and its helpers, we strip the querystring (and/or don't add it back), with the exception of:readFile
, we keep the querystring, since some implementations ofreadFile
, including the one in the nft webpack plugin, read different modules out of memory with different querystrings).resolve
, we add the querystring back in, since this is also something which can be supplied by the user, and and the user-supplied version might care about query strings.emitDependency
, we add the querystring back in, since we need it there to be able to start the whole cycle over.Notes:
cjsResolve
to the recursive call ofemitDependency
. Rather than changeemitDependency
's signature, I opted to makeemitDependency
a wrapper for an inner function (analyzeAndEmitDependency
) with the updated signature. Recursive calls now go to the inner function.@sentry/nextjs
team are interested in this because we use a webpack loader to insert in the dependency tree a proxy module for each page, which allows us to automatically wrap users' exported functions (getServerSideProps
and friends, API route handlers, etc). To do this, we replace the code from/path/to/pages/somePage.js
with our proxy code, which includes an import from'./somePage?__sentry_wrapped__'
. When webpack then crawls our proxy module looking for dependencies, the querystring tricks it into thinking it hasn't yet dealt with that page, so it forwards it to our loader a second time rather than skipping it. When our loader sees the querystring, it knows the code replacement has already been done, and returns the original page code untouched, thereby preserving the continuity of the dependency tree. This fix ensures that@vercel/nft
successfully traces the modified tree.cjs-querystring
) required a little gymnastics in order to not break CI on Windows. It turns out that skipping the test on Windows wasn't enough - the mere fact that a file exists with a?
in its name is enough to cause Windows not to even be able to check out the commit. Therefore, the file is stored without any punctuation in its name, and before the tests run on non-Windows platforms, it is copied to a version which does have the punctuation, and which is git-ignored to enforce it not ending up in the repo in the future by accident. The dummy, no-punctuation version is stored in a folder in order to differentiate its path from that of the punctuated version by more than just the punctuation. That way, if the punctuated version is found, it's clear it's not just because the punctuation is somehow being ignored.Fixes getsentry/sentry-javascript#5998.
Fixes #319.