Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memoized resolve #4323

Closed
wants to merge 2 commits into from
Closed

Conversation

tvald
Copy link
Contributor

@tvald tvald commented Aug 22, 2017

Summary

Memoize isFile() within default_resolver. Add memoized isDirectory(). Check that a directory exists before attempting to resolve files within that directory.

This provides a huge performance boost to module resolution.

Test plan

…checks before attempting to resolve within a directory (eg, node_modules).
@facebook-github-bot
Copy link
Contributor

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

const memoIsFile: {[Path]: boolean} = {};
function isFile(file: Path): boolean {
let result = memoIsFile[file];
if (result != null) return result;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result !== undefined should perform slightly better, same in isDirectory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 93a7074.

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@@ -85,16 +87,6 @@ function resolveSync(x: Path, options: ResolverOptions): Path {
/*
* helper functions
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two sections of helper functions - those within the resolveSync() closure, and the two memoized helpers isFile and isDirectory.

@cpojer
Copy link
Member

cpojer commented Aug 22, 2017

Oh this is super awesome, I'm so glad you are working on this. I'm however afraid that this will not work for things like Jest's watch mode because files may change over time of Jest's usage and the resolution algorithm should work with what is currently on the system, not what used to be there in the past.

To solve this, the Resolver class is able to be instantiated. We can merge the resolution logic into the base class and use a cache within there, and each test will receive it's own copy of the resolver, which will be recreated for individual test runs.

This will not share the resolution logic between two individual test files, so won't be as fast as it could get, but within the dependency graph of a single test, the performance optimization should still apply. We can also change this to memoize for the duration of a single test run, and that would be fine with me, but there is no concept in the worker process that manages the lifetime of a full test run, so we'd have to introduce those hooks somewhere in the workers, which is hard because they are supposed to be stateless. For now, I'd encourage moving all the memoization logic into the Resolver class. Does that make sense to you?

@tvald
Copy link
Contributor Author

tvald commented Aug 22, 2017

Moving discussion back to #2925, since this will take several stages and multiple PRs. I'm not familiar enough with Jest's codebase to see an easy path beyond the memoization within a single test which you describe. Unfortunately, I don't think that alone will provide enough of a performance boost for Jest to be usable with our codebase.

@tvald tvald closed this Aug 22, 2017
@niieani
Copy link
Contributor

niieani commented Jan 6, 2018

Hi all!

After doing some profiling, I've independently discovered that memoizing the resolver gives a huge speed-up. Simply adding lodash.memoize to the default_resolver function decreases the run time of our tests by 3x! You've read that right, that's 15 seconds instead of 45 seconds. Granted, our tests have a huge amount of dependencies, so your mileage may vary, but still -- I currently think the resolver is one of the low hanging fruit optimizations that could massively decrease jest's execution time.

@niieani
Copy link
Contributor

niieani commented Jan 6, 2018

All that's needed to make a custom, memoizing resolver based on the default one:

const {default: defaultResolver} = require('jest-resolve/build/default_resolver')
const hasher = require('node-object-hash')({sort: false, coerce: false})
const memoize = require('lodash.memoize')

function hashArgs() { return hasher.hash(arguments) }
module.exports = memoize(defaultResolver, hashArgs)

@sebinsua
Copy link

sebinsua commented Jan 6, 2018

@niieani Where should this code be placed? I would like to try this out. :)

@niieani
Copy link
Contributor

niieani commented Jan 6, 2018

Hi @sebinsua:

  1. make a simple package with the above cited file
  2. install lodash.memoize, node-object-hash and jest-resolve dependencies to it
  3. in your jest config, add "resolver": "path/to/your/resolver"

@lukeapage
Copy link
Contributor

I tried it out on our code-base (730 suites, takes 90 seconds after babel cache is warm). It didn't save any time at all (in fact slightly slower with the cache?) - the resolver was running and it was caching alot of things (I added logging to check) but it seems like a significant amount of time wasn't spent resolving.

@niieani
Copy link
Contributor

niieani commented Jan 7, 2018

@lukeapage that's interesting. It's probably going to vary depending on the codebase. We have a monorepo with lots of projects that are run through jest, all as jest projects, plus a lot of the dependencies are shared. Additionally, our dependency tree is huge (some tests can traverse hundreds of dependencies). Probably that's why the gains on our end are so visible.

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants