-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(resolution): Fix resolution of querystringed imports in ESM files #322
base: main
Are you sure you want to change the base?
Changes from all commits
aebf66c
fb93620
e0bfe0d
77c83bd
a8c294d
142658c
8c3ec63
54431db
cc660e2
bd5feee
a7c4822
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,3 +4,4 @@ out | |
coverage | ||
test/**/dist | ||
test/**/actual.js | ||
test/unit/cjs-querystring/who?what?idk!.js |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,30 @@ const fsReadFile = fs.promises.readFile; | |
const fsReadlink = fs.promises.readlink; | ||
const fsStat = fs.promises.stat; | ||
|
||
type ParseSpecifierResult = { | ||
path: string; | ||
queryString: string | null | ||
} | ||
|
||
// Splits an ESM specifier into path and querystring (including the leading `?`). (If the specifier is CJS, | ||
// it is passed through untouched.) | ||
export function parseSpecifier(specifier: string, cjsResolve: boolean = true): ParseSpecifierResult { | ||
let path = specifier; | ||
let queryString = null; | ||
|
||
if (!cjsResolve) { | ||
// Regex which splits a specifier into path and querystring, inspired by that in `enhanced-resolve` | ||
// https://github.com/webpack/enhanced-resolve/blob/157ed9bcc381857d979e56d2f20a5a17c6362bff/lib/util/identifier.js#L8 | ||
const match = /^(#?(?:\0.|[^?#\0])*)(\?(?:\0.|[^\0])*)?$/.exec(specifier); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This looks slightly different from the webpack regex. Can we instead use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
In my testing, node itself seemed to treat
None of them seemed ideal, and so I decided to see how webpack does it (since I was in the nextjs nft webpack plugin with the debugger already), figuring that the most likely context for someone having the querystring problem in the first place was them using a webpack loader with options. After a lot of stepping through code, eventually I landed in the (I gave the regex I'm using above a name the same length as the original regex's name in I can certainly make There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Also, I trust node's own url parsing more than a modified regex which might suffer from ReDoS. Now that there are tests, we should be able to swap this implementation between regex, url.parse() or new URL() easily. |
||
if (match) { | ||
path = match[1] | ||
queryString = match[2] | ||
} | ||
} | ||
|
||
return {path, queryString}; | ||
} | ||
|
||
function inPath (path: string, parent: string) { | ||
const pathWithSep = join(parent, sep); | ||
return path.startsWith(pathWithSep) && path !== pathWithSep; | ||
|
@@ -215,19 +239,21 @@ export class Job { | |
} | ||
|
||
private maybeEmitDep = async (dep: string, path: string, cjsResolve: boolean) => { | ||
// Only affects ESM dependencies | ||
const { path: strippedDep, queryString = '' } = parseSpecifier(dep, cjsResolve) | ||
let resolved: string | string[] = ''; | ||
let error: Error | undefined; | ||
try { | ||
resolved = await this.resolve(dep, path, this, cjsResolve); | ||
} catch (e1: any) { | ||
error = e1; | ||
try { | ||
if (this.ts && dep.endsWith('.js') && e1 instanceof NotFoundError) { | ||
if (this.ts && strippedDep.endsWith('.js') && e1 instanceof NotFoundError) { | ||
// TS with ESM relative import paths need full extensions | ||
// (we have to write import "./foo.js" instead of import "./foo") | ||
// See https://www.typescriptlang.org/docs/handbook/esm-node.html | ||
const depTS = dep.slice(0, -3) + '.ts'; | ||
resolved = await this.resolve(depTS, path, this, cjsResolve); | ||
const strippedDepTS = strippedDep.slice(0, -3) + '.ts'; | ||
resolved = await this.resolve(strippedDepTS + queryString, path, this, cjsResolve); | ||
error = undefined; | ||
} | ||
} catch (e2: any) { | ||
|
@@ -240,16 +266,19 @@ export class Job { | |
return; | ||
} | ||
|
||
if (Array.isArray(resolved)) { | ||
for (const item of resolved) { | ||
// ignore builtins | ||
if (item.startsWith('node:')) return; | ||
await this.emitDependency(item, path); | ||
// For simplicity, force `resolved` to be an array | ||
resolved = Array.isArray(resolved) ? resolved : [resolved]; | ||
for (let item of resolved) { | ||
// ignore builtins for the purposes of both tracing and querystring handling (neither Node | ||
// nor Webpack can handle querystrings on `node:xxx` imports). | ||
if (item.startsWith('node:')) return; | ||
|
||
// If querystring was stripped during resolution, restore it | ||
if (queryString && !item.endsWith(queryString)) { | ||
item += queryString; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can the query string exist on node built-ins? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, as it turns out. I did some testing: (Note that this is all for ESM imports, since we know that CJS just treats
So not something we have to worry about. Might be worth saying so in a comment, though. I'll do that. |
||
} else { | ||
// ignore builtins | ||
if (resolved.startsWith('node:')) return; | ||
await this.emitDependency(resolved, path); | ||
|
||
await this.analyzeAndEmitDependency(item, path, cjsResolve); | ||
} | ||
} | ||
|
||
|
@@ -343,13 +372,23 @@ export class Job { | |
} | ||
|
||
async emitDependency (path: string, parent?: string) { | ||
if (this.processed.has(path)) { | ||
return this.analyzeAndEmitDependency(path, parent) | ||
} | ||
|
||
private async analyzeAndEmitDependency(rawPath: string, parent?: string, cjsResolve?: boolean) { | ||
|
||
// Strip the querystring, if any. (Only affects ESM dependencies.) | ||
const { path } = parseSpecifier(rawPath, cjsResolve) | ||
|
||
// Since different querystrings may lead to different results, include the full path | ||
// when noting whether or not we've already seen this path | ||
if (this.processed.has(rawPath)) { | ||
if (parent) { | ||
await this.emitFile(path, 'dependency', parent) | ||
} | ||
return | ||
}; | ||
this.processed.add(path); | ||
this.processed.add(rawPath); | ||
|
||
const emitted = await this.emitFile(path, 'dependency', parent); | ||
if (!emitted) return; | ||
|
@@ -368,18 +407,34 @@ export class Job { | |
|
||
let analyzeResult: AnalyzeResult; | ||
|
||
const cachedAnalysis = this.analysisCache.get(path); | ||
// Since different querystrings may lead to analyses, include the full path when caching | ||
const cachedAnalysis = this.analysisCache.get(rawPath); | ||
if (cachedAnalysis) { | ||
analyzeResult = cachedAnalysis; | ||
} | ||
else { | ||
const source = await this.readFile(path); | ||
if (source === null) throw new Error('File ' + path + ' does not exist.'); | ||
// By default, `this.readFile` is the regular `fs.readFile`, but a different implementation | ||
// can be specified via a user option in the main `nodeFileTrace` entrypoint. Depending on | ||
// that implementation, having a querystring on the end of the path may either a) be necessary | ||
// in order to specify the right module from which to read, or b) lead to an `ENOENT: no such | ||
// file or directory` error because it thinks the querystring is part of the filename. We | ||
// therefore try it with the querystring first, but have the non-querystringed version as a | ||
// fallback. (If there's no `?` in the given path, `rawPath` will equal `path`, so order is a | ||
// moot point.) | ||
const source = await this.readFile(rawPath) || await this.readFile(path) ; | ||
|
||
if (source === null) { | ||
const errorMessage = path === rawPath | ||
? 'File ' + path + ' does not exist.' | ||
: 'Neither ' + path + ' nor ' + rawPath + ' exists.' | ||
throw new Error(errorMessage); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looks like there is a test missing for this error message There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
} | ||
|
||
// analyze should not have any side-effects e.g. calling `job.emitFile` | ||
// directly as this will not be included in the cachedAnalysis and won't | ||
// be emit for successive runs that leverage the cache | ||
analyzeResult = await analyze(path, source.toString(), this); | ||
this.analysisCache.set(path, analyzeResult); | ||
this.analysisCache.set(rawPath, analyzeResult); | ||
} | ||
|
||
const { deps, imports, assets, isESM } = analyzeResult; | ||
|
@@ -393,12 +448,12 @@ export class Job { | |
const ext = extname(asset); | ||
if (ext === '.js' || ext === '.mjs' || ext === '.node' || ext === '' || | ||
this.ts && (ext === '.ts' || ext === '.tsx') && asset.startsWith(this.base) && asset.slice(this.base.length).indexOf(sep + 'node_modules' + sep) === -1) | ||
await this.emitDependency(asset, path); | ||
await this.analyzeAndEmitDependency(asset, path, !isESM); | ||
else | ||
await this.emitFile(asset, 'asset', path); | ||
}), | ||
...[...deps].map(async dep => this.maybeEmitDep(dep, path, !isESM)), | ||
...[...imports].map(async dep => this.maybeEmitDep(dep, path, false)), | ||
...[...deps].map(async dep => this.maybeEmitDep(dep, rawPath, !isESM)), | ||
...[...imports].map(async dep => this.maybeEmitDep(dep, rawPath, false)), | ||
]); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
// Test that CJS files treat question marks in filenames as any other character, | ||
// matching Node behavior | ||
|
||
// https://www.youtube.com/watch?v=2ve20PVNZ18 | ||
|
||
const baseball = require('./who?what?idk!'); | ||
console.log(baseball.players); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
module.exports = { | ||
players: { | ||
first: 'Who', | ||
second: 'What', | ||
third: "I Don't Know", | ||
left: 'Why', | ||
center: 'Because', | ||
pitcher: 'Tomorrow', | ||
catcher: 'Today', | ||
shortstop: "I Don't Give a Damn!", | ||
}, | ||
}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
[ | ||
"package.json", | ||
"test/unit/cjs-querystring/input.js", | ||
"test/unit/cjs-querystring/who?what?idk!.js" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import { numSpecies } from "./bear.mjs?beaver?bison"; | ||
console.log(`There are ${numSpecies} species of bears.`); | ||
|
||
export const food = "termites"; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import * as cheetah from "./cheetah.mjs?cow=chipmunk"; | ||
console.log(`Cheetahs can run ${cheetah.topSpeed} mph.`); | ||
|
||
export const numSpecies = 8; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
export const topSpeed = 65; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
// Test that querystrings of various forms get stripped from esm imports when those | ||
// imports contain the `.mjs` file extension | ||
|
||
import * as aardvark from "./animalFacts/aardvark.mjs?anteater"; | ||
|
||
console.log(`Aardvarks eat ${aardvark.food}.`); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
[ | ||
"test/unit/esm-querystring-mjs/animalFacts/aardvark.mjs", | ||
"test/unit/esm-querystring-mjs/animalFacts/bear.mjs", | ||
"test/unit/esm-querystring-mjs/animalFacts/cheetah.mjs", | ||
"test/unit/esm-querystring-mjs/input.js", | ||
"test/unit/esm-querystring-mjs/package.json" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"private": true, | ||
"type": "module" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import { numSpecies } from './bear?beaver?bison'; | ||
console.log(`There are ${numSpecies} species of bears.`); | ||
|
||
export const food = 'termites'; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
import * as cheetah from './cheetah?cow=chipmunk'; | ||
console.log(`Cheetahs can run ${cheetah.topSpeed} mph.`); | ||
|
||
export const numSpecies = 8; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
export const topSpeed = 65; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
// Test that querystrings of various forms get stripped from esm imports | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Lets add |
||
|
||
import * as aardvark from './animalFacts/aardvark?anteater'; | ||
|
||
console.log(`Aardvarks eat ${aardvark.food}.`); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
[ | ||
"test/unit/esm-querystring/animalFacts/aardvark.js", | ||
"test/unit/esm-querystring/animalFacts/bear.js", | ||
"test/unit/esm-querystring/animalFacts/cheetah.js", | ||
"test/unit/esm-querystring/input.js", | ||
"test/unit/esm-querystring/package.json" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"private": true, | ||
"type": "module" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
import * as dep from './dep'; | ||
|
||
export const dogs = ['Charlie', 'Maisey']; | ||
export const cats = ['Piper']; | ||
export const rats = dep.rats; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
export const rats = ['Debra']; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
// Test that if a file and the same file with a querystring correspond to different | ||
// modules in memory, one can successfully import the other. The import chain | ||
// goes `input` (this file) -> `base?__withQuery` -> `base` -> `dep`, which means | ||
// that if `dep` shows up in `output`, we know that both `base?__withQuery` and | ||
// `base` have been loaded successfully. | ||
|
||
import * as baseWithQuery from './base?__withQuery'; | ||
console.log('Dogs:', baseWithQuery.dogs); | ||
console.log('Cats:', baseWithQuery.cats); | ||
console.log('Rats:', baseWithQuery.rats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets not use a default param or else the caller might forget
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we need the default to be there, because the first time this is called (at the top of
analyzeAndEmitDependency
when it's called byemitDependency
), we don't have acjsResolve
value. Having it default totrue
works in that case, though, because that only happens at the root of the tree, whennodeFileTrace
callsemitDependency
on an actual file (not an import specifier), which we know won't have a querystring.