Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compartment-mapper): Support dynamic require/importNow #2310

Merged
merged 1 commit into from
Oct 1, 2024

Conversation

boneskull
Copy link
Contributor

@boneskull boneskull commented Jun 7, 2024

Description

  • This PR adds support for dynamic requires via loadFromMap() and link() (in import-lite.js and link.js, respectively). importLocation()'s signature has also been modified for support.

    To use this feature in either function, the following must be true:

    1. The moduleTransforms option (in the appropriate options parameter) must not be present. These are asynchronous module transforms, which cannot be used by dynamic require.
    2. The ReadPowers param must be a proper ReadPowers object (not just a ReadFn) and must contain both the new maybeReadSync, new isAbsolute, and fileURLToPath functions. The new type representing this is SyncReadPowers.

    If all of the above are true, then a compartment will be allowed to dynamically require something. If that thing still cannot be found in the compartment map, the sync "fallback" exit hook (see next item) will be executed.

  • The new importHookNow property can be provided via options, which is a synchronous exit module import hook.

  • About SyncReadPowers:

    • SyncReadPowers.maybeReadSync() is necessary to read files synchronously which is necessary to load them synchronously. This fails gracefully if the file is not found; the implementation is platform-dependent.
    • SyncReadPowers.isAbsolute() is necessary to determine if the module specifier of a dynamic require is absolute. If it is, it could be just about anything, and must be loaded via the user-provided importNowHook (the sync exit module import hook).
    • SyncReadPowers.fileURLToPath() is needed for __filename and __dirname, which is often used to create absolute paths for requiring dynamically.
  • As an alternative to moduleTransforms, synchronous module transforms may be provided via the new syncModuleTransforms object. In a non-dynamic-require use-case, if present, syncModuleTransforms are combined with the moduleTransforms option; all sync module transforms are module transforms, but not all module transforms are sync module transforms.

  • All builtin parsers are now synchronous. User-defined parsers can be async, but this will disable dynamic require support.

  • @endo/evasive-transform now exports evadeCensorSync() in addition to evadeCensor(). This is possible because I've swapped the async-only source-map with source-map-js, which is a fork of the former before it went async-only. source-map-js claims comparable performance.

Security Considerations

Dynamically requiring an exit module (e.g., a Node.js builtin) requires a user-defined hook, which has the same security considerations as a user-defined exit module hook.

Swapping out a dependency (source-map-js for source-map) incurs risk.

Scaling Considerations

n/a

Documentation Considerations

Should be announced as a user-facing feature

Testing Considerations

I've added some fixtures and tested around the conditionals I've added, but am open to any suggestions for additional coverage.

Compatibility Considerations

This increases ecosystem compatibility considerably; use of dynamic require in the ecosystem is not rare.

For example, most packages which ship a native module will be using dynamic require, because the filepath of the build artifact is dependent upon the platform and architecture.

Upgrade Considerations

Everything else should be backwards-compatible, as long as source-map-js does as it says on the tin.

Users of @endo/evasive-transform may note that native modules are neither downloaded/compiled (due to the switch from source-map to source-map-js).


UPDATE: Aug 18 2024

This PR now targets the boneskull/supertramp branch, which is PR #2263

@boneskull boneskull self-assigned this Jun 18, 2024
@boneskull boneskull marked this pull request as ready for review June 18, 2024 20:57
@boneskull
Copy link
Contributor Author

boneskull commented Jun 18, 2024

If anyone can point me in the direction of why the tests are failing, that'd be helpful; otherwise I'll just plug at it. Plugged.

compartmentDescriptor.modules = moduleDescriptors;

let { policy } = compartmentDescriptor;
policy = policy || Object.create(null);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for typescript

Comment on lines 483 to 679
if ('packages' in policy && typeof policy.packages === 'object') {
for (const [pkgName, policyItem] of entries(policy.packages)) {
if (
!(pkgName in compartmentDescriptor.modules) &&
pkgName in compartmentDescriptor.scopes &&
policyItem
) {
compartmentDescriptor.modules[pkgName] =
compartmentDescriptor.scopes[pkgName];
}
}
}
Copy link
Contributor Author

@boneskull boneskull Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure if this is correct, but it creates links between compartment descriptors based only on policy--where Endo would not have detected them otherwise.

Checking if dynamic: true is in policy at this point causes many tests to fail, because this ends up being a code path which many tests take. Why? Because the behavior is opt-out.

With regards to that, in addition to the policy, perhaps we should add a dynamic option to ImportLocationOptions or whathaveyou, so you can explicitly opt-in to using importNowHook. Right now, it's an opt-out based on the lack of a moduleTransforms (async module transforms) option and the shape of readPowers. It's not until later that we take the policy into account. Another way to put it: we're creating an importNowHook because a) we have the tools to do so, and b) something might dynamically require something else.

If we did that, I'd be able to delete some LoC, and we'd reduce the risk of introducing backwards-incompatible changes. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

less code and better compatibility. Sounds good.
Also, explicit opt-in would allow an early failure if ReadPowers or transforms don't match the intent to enable dynamic require.

Comment on lines 507 to 602
// Collate candidate locations for the moduleSpecifier,
// to support Node.js conventions and similar.
const candidates = [moduleSpecifier];
for (const candidateSuffix of searchSuffixes) {
candidates.push(`${moduleSpecifier}${candidateSuffix}`);
}

for (const candidateSpecifier of candidates) {
const candidateModuleDescriptor = moduleDescriptors[candidateSpecifier];
if (candidateModuleDescriptor !== undefined) {
const { compartment: candidateCompartmentName = packageLocation } =
candidateModuleDescriptor;
const candidateCompartment = compartments[candidateCompartmentName];
if (candidateCompartment === undefined) {
throw Error(
`compartment missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
// modify compartmentMap to include this redirect
const candidateCompartmentDescriptor =
compartmentDescriptors[candidateCompartmentName];
if (candidateCompartmentDescriptor === undefined) {
throw Error(
`compartmentDescriptor missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
candidateCompartmentDescriptor.modules[moduleSpecifier] =
candidateModuleDescriptor;
// return a redirect
/** @type {RedirectStaticModuleInterface} */
const record = {
specifier: candidateSpecifier,
compartment: candidateCompartment,
};
return record;
}

// Using a specifier as a location.
// This is not always valid.
// But, for Node.js, when the specifier is relative and not a directory
// name, they are usable as URL's.
const moduleLocation = resolveLocation(
candidateSpecifier,
packageLocation,
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copypasta

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of duplication. I think this is worth trying to bounce on a trampoline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This is a good idea, but is also unfortunate because it will take me longer to land this.

Comment on lines 554 to 613
let moduleBytes;
try {
moduleBytes = readSync(moduleLocation);
} catch (err) {
if (err && err.code === 'ENOENT') {
// might be an exit module. use the fallback `exitModuleImportNowHook` to import it
// eslint-disable-next-line no-continue
continue;
}
throw err;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we know something is an exit module? maybe we should check the compartment descriptor, and only use the fallback if the thing doesn't exist there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not arrived at a hard decision for this question. Falling through the bottom of the import hook should be enough, but might not be. An option we did not have when I first wrote this: we can identify “host modules” by the presence of a prefix: like node: or endo:. These would be guaranteed to escape the Node.js style mappings.

In the interest of keeping coupling to a specific specifier resolution strategy low, I am leaning heavily toward “pass through if nothing in the compartment map matches.”

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prefix option would be really unfortunate for ecosystem compatibility. I don't think we'll ever get all packages to switch form fs to node:fs.

For the sake of completion: if importHook for exit modules is not provided or returns nothing we still need to fail with an import error.

Comment on lines 707 to 785
/**
* @typedef FsPromisesApi
* @property {(filepath: string) => Promise<string>} realpath
* @property {WriteFn} writeFile
* @property {ReadFn} readFile
*/

/**
* @typedef FsAPI
* @property {FsPromisesApi} promises
* @property {ReadSyncFn} readFileSync
*/

/**
* @typedef UrlAPI
* @property {(location: string | URL) => string} fileURLToPath
* @property {(path: string) => URL} pathToFileURL
*/

/**
* @typedef CryptoAPI
* @property {typeof import('crypto').createHash} createHash
*/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not strictly necessary, but I found it helpful. YMMV

Copy link
Member

@kriskowal kriskowal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary feedback.

packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
Comment on lines 507 to 602
// Collate candidate locations for the moduleSpecifier,
// to support Node.js conventions and similar.
const candidates = [moduleSpecifier];
for (const candidateSuffix of searchSuffixes) {
candidates.push(`${moduleSpecifier}${candidateSuffix}`);
}

for (const candidateSpecifier of candidates) {
const candidateModuleDescriptor = moduleDescriptors[candidateSpecifier];
if (candidateModuleDescriptor !== undefined) {
const { compartment: candidateCompartmentName = packageLocation } =
candidateModuleDescriptor;
const candidateCompartment = compartments[candidateCompartmentName];
if (candidateCompartment === undefined) {
throw Error(
`compartment missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
// modify compartmentMap to include this redirect
const candidateCompartmentDescriptor =
compartmentDescriptors[candidateCompartmentName];
if (candidateCompartmentDescriptor === undefined) {
throw Error(
`compartmentDescriptor missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
);
}
candidateCompartmentDescriptor.modules[moduleSpecifier] =
candidateModuleDescriptor;
// return a redirect
/** @type {RedirectStaticModuleInterface} */
const record = {
specifier: candidateSpecifier,
compartment: candidateCompartment,
};
return record;
}

// Using a specifier as a location.
// This is not always valid.
// But, for Node.js, when the specifier is relative and not a directory
// name, they are usable as URL's.
const moduleLocation = resolveLocation(
candidateSpecifier,
packageLocation,
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of duplication. I think this is worth trying to bounce on a trampoline.

packages/compartment-mapper/src/import-lite.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-lite.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-lite.js Outdated Show resolved Hide resolved
packages/evasive-transform/src/location-unmapper.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-lite.js Outdated Show resolved Hide resolved
@boneskull
Copy link
Contributor Author

Going to extract the changes to @endo/evasive-transform into a separate PR.

@boneskull
Copy link
Contributor Author

Ref: #2332

cc @kriskowal

@boneskull
Copy link
Contributor Author

Once #2332 is merged, I can rebase this onto master, which should eliminate the commit from this PR's history. So let's sit on it until then.

@boneskull boneskull force-pushed the boneskull/dynamic branch 3 times, most recently from 04f003b to b09a0cd Compare August 15, 2024 23:07
@boneskull boneskull changed the base branch from master to boneskull/supertramp August 15, 2024 23:08
@naugtur
Copy link
Member

naugtur commented Aug 19, 2024

note on source-map-js

I don't see where the source-map-js is added. The documentation says it's a fork of source-map but with some optimizations added. One of the optimizations listed in the article they cite as source of the optimizations is using a Function constructor to create variants of sorting function. That's a potential concern (less of a security concern, more of a compatibility concern if Endo code is supposed to run correctly under lockdown)

for (const [someDescriptorName, someDescriptor] of entries(
compartmentDescriptors,
)) {
if (someDescriptorName !== ATTENUATORS_COMPARTMENT) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exclusion makes a lot of sense.

Copy link
Member

@kriskowal kriskowal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been a journey and I commend you, and apologize profusely for taking too long to review. That you have a green CI speaks volumes.

This is in the right shape. I have mostly annoying nits about naming and idioms below, most of which can justifiably be ignored at your discretion.

I only request you consider seriously the whether we can bear O(compartments) lookup and think about what it would take to do a more efficient prefix match. I do not want you to build a prefix trie, but if iteratively popping components off a module specifier until you find a matching compartment location works, please consider that, or explain why that won’t work or isn’t necessary.

There is a breaking change here now that parsers are part of the public API, but I am not worried about that since all usage of them is local to Endojs/Endo and is loosely coupled to the specific interface. It might be worth noting. It might not.

edit: and please look into the weak correspondence between package and dependency name, at least explain why we can’t use the compartment keys to establish that correlation. More book-keeping is fine if it’s more precise.

/**
* A resolution of `undefined` indicates `ENOENT` or the equivalent.
*
* @callback MaybeReadSyncFn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a mild preference to grow outward from the precedent of Now to mean synchronous in names, instead of the Node.js precedent of Sync suffix. There’s a reasonable chance we will follow the lead of Moddable XS and be proposing corresponding import.now dynamic synchronous import syntax. I’ve seeded the idea it in TC39 channels in any case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a blocker? There's a lot of stuff to rename, if so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but it will be harder to change later.

packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
@@ -68,10 +91,68 @@ const nodejsConventionSearchSuffixes = [
'/index.node',
];

/**
* Given a module specifier which is an absolute path, attempt to match it with
* an existing compartment; return a {@link RedirectStaticModuleInterface} if found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some work toward converging with XS for the shape of module descriptors. RedirectStaticModuleInterface is deprecated in favor of a SourceModuleDescriptor or NamespaceModuleDescriptor, but both work still.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kriskowal

Neither SourceModuleDescriptor nor NamespaceModuleDescriptor is valid, in this case. The former because we do not have a source property, and the latter because we do not have a namespace property; all we have is {string} specifier and {Compartment} compartment.

Copy link
Member

@kriskowal kriskowal Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is only that the the Moddable folks preferred to rename source overspecifier so that it is clear that the referent (still a string) means that this compartment adopts the other compartment’s ModuleSource, vs namespace when it shares the other compartment’s module instance. The name specifier left that ambiguous.

packages/compartment-mapper/src/import-hook.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/link.js Show resolved Hide resolved
packages/compartment-mapper/src/map-parser.js Outdated Show resolved Hide resolved
packages/compartment-mapper/src/node-powers.js Outdated Show resolved Hide resolved
@kriskowal kriskowal changed the title support dynamic requires feat(compartment-mapper): Support dynamic require/importNow Sep 26, 2024
@kriskowal
Copy link
Member

Here’s an Agoric SDK integration testing PR Agoric/agoric-sdk#10159

The current CI run failed because trampoline is not in npm yet, and at least one version needs to go up for that to sync properly. I’ll take that on.

@boneskull
Copy link
Contributor Author

@kriskowal I've applied all of the naming-related changes you've requested.

}
}

// for (const [someDescriptorName, someDescriptor] of entries(
Copy link
Member

@naugtur naugtur Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover from debugging?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous implementation. Good catch; let’s delete this dead code.

Copy link
Member

@kriskowal kriskowal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in the right shape. I will not need another review. Please review prior comments and delete dead code. Naming nits are optional.

Please add an entry to NEWS.md for the new feature.

You should be able to merge when you are ready. Please rebase on master before merging, ensure that CI is green, and either squash with a merge commit or manually merge fixups and edits into the prior feature commits so that history appears linear. Also, review your merge commit description, which is generated from the PR description.

I would like to get this proven in Agoric SDK CI, so I’m going to produce a release for @endo/trampoline today and kick that CI job again to see what happens. I’ll let you know.

}
}

// for (const [someDescriptorName, someDescriptor] of entries(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous implementation. Good catch; let’s delete this dead code.

sourceMapHook,
strictlyRequiredForCompartment,
},
{ maybeRead, parse, shouldDeferError = () => false },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be addressed in a refactor, so does not need to be made consistent in this change.

if (isAbsolute(moduleSpecifier)) {
let shouldCallExitModuleImportNowHook = true;
for (const [someDescriptorName, someDescriptor] of entries(
compartmentDescriptors,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@naugtur Please take a look at Chris’s new algorithm with the precomputed compartments set. I think the order of complexity is at least right now, as well as some correctness concerns when one package is nested in the node_modules of another.

/**
* A resolution of `undefined` indicates `ENOENT` or the equivalent.
*
* @callback MaybeReadSyncFn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but it will be harder to change later.

packages/compartment-mapper/src/node-modules.js Outdated Show resolved Hide resolved
@@ -724,6 +734,7 @@ const translateGraph = (
parsers,
types,
policy: /** @type {SomePackagePolicy} */ (packagePolicy),
compartments: compartmentNames,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

found the money right here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for later: this could be narrowed down to what's allowed by policy. TODO: create an issue

@boneskull boneskull force-pushed the boneskull/dynamic branch 2 times, most recently from 40efea3 to 26c4bf4 Compare October 1, 2024 19:53
@boneskull boneskull force-pushed the boneskull/dynamic branch 2 times, most recently from a740027 to 25724bf Compare October 1, 2024 20:51
This change adds support for _dynamic requires_, where a Node.js script does something like this:

```js
const someModuleSpecifier = getSpecifier();
const idk = require(someModuleSpecifier);
```

This behavior is not enabled by default; specific options and Powers are required. See the signature of `loadFromMap()` in `import-lite.js` for more information. Dynamic loading of exit modules (e.g., if `someModuleSpecifier` was `node:fs`) is handled by a user-provided "exit module import hook". Policy applies. Dynamic requires work as you'd probably expected _except_:

- A compartment may dynamically require from its *dependee*, since this is a common pattern in the ecosystem (see [node-gyp-build](https://npm.im/node-gyp-build)).
- The special "attenuators" compartment may not be dynamically required. Horsefeathers!

Some relevant bits, if you're mining history:

- All internal parsers are now _synchronous_, which was made possible by the introduction of `evadeCensorSync` in `@endo/evasive-transform`.
- Async parsers are still supported, but they would only be user-defined. Async parsers are incompatible with dynamic requires.
- Added property `{Set<string>} compartments` to `CompartmentDescriptor`. This is essentially a view into the compartment names within `CompartmentDescriptor.scopes`.
- The `mapParsers()` function has moved into `map-parser.js`.
- `@endo/trampoline` is leveraged in a couple places (`import-hook.js`, `map-parser.js`) to avoid code duplication.
- Introduced `FsInterface`, `UrlInterface`, `CryptoInterface` and `PathInterface` (and their ilk) for use with `makeReadPowers()` and the new `makeReadNowPowers()`.
@boneskull boneskull merged commit a74fcf3 into master Oct 1, 2024
16 checks passed
@boneskull boneskull deleted the boneskull/dynamic branch October 1, 2024 21:19
@naugtur
Copy link
Member

naugtur commented Oct 2, 2024

#2310 (comment)
I'd like to elaborate on that, preferably synchronously

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants