Make npmInstall and npmSetup tasks cacheable #81
Long story short, those tasks don't cache very well, however, if you can get a local mirror of node and npm you should be able to get the same experience for And if you're using webpack or similar you can set up a task for that with inputs/outputs and cache it, which should help speed things up |
Let's have a look at these tasks (I add
None of those tasks is directly cacheable by Gradle, but they rely on tools that maintain their own caches. I know this has some limitations. For instance, if the build runs inside a container whose file system is not persistent, the cache will not be reused by later builds by default. The same is true for Gradle, but it can be fixed by changing Gradle's cache configuration. The same would have to be done for npm as well, and it is much simpler if all caching is managed by Gradle. Once the Gradle cache is configured, everything can be cached. We can assume (it would be interesting to test it though), that
I think we should run some experiments to ensure that making them cacheable really makes sense and does not create unwanted side effects. |
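For anyone wondering what "changing its cache configuration" could look like in a container build, here is a minimal sketch. It assumes the CI system can persist one directory between runs; the directory name below is made up, and the build cache still has to be switched on with `--build-cache` or `org.gradle.caching=true` in `gradle.properties`.

```kotlin
// settings.gradle.kts
buildCache {
    local {
        // Hypothetical location: point this at a directory that your CI
        // persists or mounts between container runs.
        directory = File(rootDir, ".gradle-build-cache")
    }
}
```

This only moves Gradle's local build cache. npm's own cache is configured separately and is not covered by this sketch.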
Luckily even if you run on an entirely clean machine |
Thanks for replying. You are right,
What do you think of that solution? It seems to be working, but I'm a little afraid about the possible problems you've mentioned above. |
Hi, I've come up with the following code and it works for me:
npmInstall {
inputs.files(project.files('package.json'))
outputs.dir("${project.buildDir}/node_modules/")
}
task buildUI(type: NpmTask, dependsOn: npmInstall) {
inputs.files(project.files('src'))
outputs.dir("${project.buildDir}/dist/")
args = ['run', 'build']
}
build.dependsOn buildUI
I'm running the |
@nukesz normally you don't need to configure the I also would advise you to add the @nukesz, you are not talking about the same thing as @bercik. This issue is about caching, not just task execution avoidance. Task execution avoidance means skipping a task when its inputs have not changed since the last build and its outputs are still present and unchanged. Caching goes one step further. Because Gradle knows the inputs and outputs of the task, it can save the outputs in a cache (which can be shared between multiple computers) and, when the inputs are identical, reuse the stored outputs instead of running the task. |
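To make the "shared between multiple computers" part concrete, here is a rough sketch of a settings file that enables a local and a remote build cache. The URL is a placeholder and the `CI` environment variable is an assumption about the build environment, not something taken from this thread.

```kotlin
// settings.gradle.kts
import org.gradle.caching.http.HttpBuildCache

buildCache {
    local {
        // Outputs are stored on this machine and reused by later builds here.
        isEnabled = true
    }
    remote<HttpBuildCache> {
        // Placeholder URL: replace with your own cache node.
        url = uri("https://cache.example.com/cache/")
        // Assumption: only CI builds (which set a CI variable) push to the shared cache.
        isPush = System.getenv("CI") != null
    }
}
```

With this in place and the build cache enabled (`--build-cache`), a cacheable task whose inputs match an existing cache entry is restored from the cache instead of executed. A task that is only up to date never touches the cache at all.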
@bercik is your experiment successful? Does it work as expected? Do you use only local cache or do you share the cache between multiple hosts? |
I would recommend merging npmInstall and build tasks. For example by using in
This gives a single task in Gradle. So if the result (the compiled webapp) is already available, the build server can jump over setting up the node_modules directory. Which can save a lot of time. And |
@remmeier if the compiled web app is already available then both Though in your case there's a caveat with |
does (there is a second new npm-based project on my side, i probably have some more insights soon in that regard) |
I did some experiments in another issue to know which one of Gradle up-to-date checking or I did not do this experiment with |
And it sounds like Here is the first line that is printed by
|
I will have a look later and file a bug report if it really does not do up-to-date checking. I really like yarn & Gradle... npmInstall and Gradle cache restore will heavily depend on where the data is coming from. yarn is able to cache it locally, so it can be fast. The same goes for Gradle. A close-by Gradle remote cache is also not so bad. But I think the public NPM registry should be avoided at all costs by anybody in need of robust/fast builds. Your Gradle cache results look good for sure. |
The one thing that is really good about yarn is that |
I don't think realistically it can ever beat gradle doing |
I did this experiment again using an Angular project with multiple dependencies (902 packages at Note that I used version 3 of the plugin (which will be released soon) because I discovered an issue in version 2.2.3 regarding the
We can see that |
I was able to investigate the issue I encountered, and it appears that the up-to-date checking of the yarn task works as expected; I probably made an error in my experiment. But I discovered that the I created issue #96 and fixed it. It now also takes 1 second with version 2.2.3 and the fix. |
What about Yarn tasks ( YarnSetupTask, YarnInstallTask, YarnTask)? Can build cache be enabled for them? |
Your own ad-hoc |
The problem is that To fix caching, add this to your root
|
There are some other workarounds in gradle/gradle#3525 as well, but with symbolic links being resolved to their file instead of being stored, I'm not sure we could ever support the build cache in those tasks. So if you're using |
Bumping the thread, since the question I've run into seems to be related. On Gradle 7.4,
This is shown only on Windows - on Linux, even in WSL, there's no problem. I guess the source of the error is the symbolic links in |
@Cerber-Ursi I'm unable to reproduce that issue using the example project, with or without download enabled, with or without a Can you open a separate issue and provide a standalone project that reproduces the issue? |
In my experience since my last update, it's better to never cache |
Interesting. I've tried to do this again and wasn't able to reproduce the original problem either, including in the real project where it was initially noticed. Will try again later, not sure what exactly has changed in between. |
EDIT: I made a mistake, the information below is incorrect.
tasks.npmInstall {
outputs.cacheIf { true }
}
val npmRunBuild by tasks.registering(NpmTask::class) {
npmCommand.set(listOf("run", "build"))
inputs.files("package.json", "webpack.config.js")
.withPathSensitivity(RELATIVE)
inputs.dir("src/main")
.withPathSensitivity(RELATIVE)
inputs.dir(layout.projectDirectory.dir("node_modules"))
.withPathSensitivity(RELATIVE)
outputs.dir("dist")
}
tasks.npmInstall {
outputs.file(layout.projectDirectory.file("node_modules/.package-lock.json"))
.withPropertyName("nodeModulesPackageLock")
outputs.cacheIf { true }
}
|
I don't think this does what you think it does. Unless
|
You're right! It wasn't caching as I expected, I misunderstood what was going on. |
Hi 👋 I've added an example test of the behaviour I'd like to see in #301. Currently I see warning messages from Gradle that explain why
From a quick look removing these properties as outputs should work
Do these changes seem reasonable? I'm not familiar with NPM development so I would appreciate any insights. |
Conceptually you've got the right idea @adam-enko, but there are a few things that make Let's start with the easy one: Gradle currently breaks symlinks and this makes The next one is harder because I no longer have the data and never thought to create benchmarks, but when your These might be fine locally, but if you have a remote build cache then publishing a 1-2GB And with a remote build cache, what happens if you pull a So to list the problems I remember:
With these in mind, most people who provided feedback (and most people polled) preferred the solution where Gradle mostly leaves everything to gradle-node-plugin/src/main/kotlin/com/github/gradle/node/NodeExtension.kt Lines 127 to 134 in 16ecc0f
So to wrap this up, do you really want to cache |
This doesn't mean there's no use for this, it's just that all evidence I know of suggests that people think they want this but they really might want something else There might be a legitimate use-case for this, or you might want to look into your build and consider the setup |
Thanks for the detailed response @deepy! It's a lot more clear to me now. I'm working on Dokka - which uses node-gradle in one subproject https://github.com/Kotlin/dokka/blob/e502b2c1b1aa168bb3cd10d4d5f3ab883daffabd/dokka-subprojects/plugin-base-frontend/build.gradle.kts The main issue is that I have made some changes in Kotlin/dokka#3479 that I expect will help. Regarding the caching of
|
Has https://docs.gradle.org/8.5/userguide/custom_tasks.html#sec:storing_incremental_task_state
|
@adam-enko That's an interesting proposal, but is not the directory too big for that? IIRC the default artifact size limit for caching is 100MB. |
Haven't had the time to look into this properly yet, but on Windows I get a 218MB Good news is we can easily test that by creating a snapshot and comparing the builds |
But before that, can you add the |
That's apparently |
The npmInstall task is non-cachable node-gradle/gradle-node-plugin#81, and being always activated even for non-relevant task execution paths. Force it to stop being activated during the offline mode. KTI-2149
I've been thinking about this a lot recently, and I think there's a way to use Gradle to cache NPM's dependencies, meaning while IIUC Gradle has a really good dependency caching system (even if it only supports Maven and Ivy schemas). So what if we used Gradle to cache the NPM dependencies?
Gradle will still ignore What do you think, could it help? |
It's an interesting solution, the immediate thing that comes to mind is that modeling this might be a little trickier, thinking especially of how to propagate the output of But at a glance it looks like this is going to either need to run And the majority of the benefits would rely on the local cache being available (or pre-seeded, which seems more likely) I'm on a train right now and my stop is coming up, so I'll get back to this a bit later It doesn't make the tasks cacheable, but makes |
I've looked a lot more into this, and I think the focus on making the npm precursor tasks (
(Also, my last message about caching dependencies via Gradle would help, but it's a separate improvement that is unrelated to the current thread, so I'll try to make a new issue.)
Scenario
The scenario that I think highlights the problem is a project that's built on GitHub CI with Gradle build cache enabled. In terms of features it's the 'best case' scenario, because build cache is enabled and supported. However, because it's a fresh checkout, no local files are available. The tasks must either run, or be loaded from build cache. And because it's a fresh checkout
Let's say there's a custom
// build.gradle.kts
val npmBuild by tasks.registering(NpxTask::class) {
dependsOn(tasks.npmInstall)
command.set("someCmd")
inputs.dir("src")
outputs.file("build/result.js")
outputs.cacheIf { true }
}
Even though
Just to clarify, I'm not criticising. I think this is the best setup possible at present. However, here's my big idea: what if the precursor tasks weren't tasks? I think there's an alternative...
Setting up without tasks?
Recently an ex-Gradler came up with a tool for installing external tools https://github.com/jjohannes/gradle-demos/tree/main/toolchain-management/sample. It uses some internal Gradle tools to make downloading and caching more convenient, but I don't think that's relevant here (the publicly available APIs can work just fine, even if they're a little less convenient). The most interesting thing to me is how toolchain-management avoids installing the tool unless the task that uses the tool actually runs. It does this by using a BuildService that I found this a bit confusing at first, because wouldn't you want to make sure the tool is installed? But actually, it makes a lot more sense in terms of work avoidance.
Code example
To give a quick code-based demo, here's how it works currently:
val npmSetup by tasks.registering {
description = "Installs node exe. Cannot be cached."
}
val npmInstall by tasks.registering {
description = "Installs npm dependencies into node_modules dir. Cannot be cached."
}
val actualNpmTaskThatDoesSomething by tasks.registering(NpmTask::class) {
dependsOn(npmSetup, npmInstall)
// Potentially cachable
}
Because
Instead, what if NpmTask would set up the executables and run
abstract class NpmTask2 : NpmTask() {
@TaskAction
fun exec2() {
npmSetup() // Checks if npm exe is installed, downloading and installing it if necessary.
npmInstall() // Runs `npm install` if necessary (i.e. `npm install --offline` fails).
// now run the actual command
runNpmCommand(...)
}
}
val actualNpmTaskThatDoesSomething by tasks.registering(NpmTask::class) {
}
This skips over a lot of potential issues (e.g. what about tasks running in parallel? How to check if the executables are installed and up to date?). But hopefully it gives a good high-level idea? I think this results in a more idiomatic usage of
Technical solution
I'm running a bit short on time so I'll keep this section short, but tl;dr let's make a A ValueSource will permit running npm commands in Gradle's configuration phase, and also make it possible to run multiple npm commands during a single Task. |
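As a rough illustration of the ValueSource idea above, here is a minimal sketch that runs a single npm command through Gradle's `ValueSource` API. The class name `NpmVersionValueSource` and the `printNpmVersion` task are made up for this example and are not part of gradle-node-plugin. It also assumes an `npm` executable is already on the PATH, which is exactly the part the plugin would have to solve (on Windows the command is typically `npm.cmd`).

```kotlin
// build.gradle.kts
import java.io.ByteArrayOutputStream
import javax.inject.Inject
import org.gradle.api.provider.ValueSource
import org.gradle.api.provider.ValueSourceParameters
import org.gradle.process.ExecOperations

// Hypothetical value source, for illustration only.
abstract class NpmVersionValueSource : ValueSource<String, ValueSourceParameters.None> {

    // Gradle injects ExecOperations so the value source can start processes.
    @get:Inject
    abstract val execOperations: ExecOperations

    override fun obtain(): String {
        val output = ByteArrayOutputStream()
        execOperations.exec {
            // Assumption: `npm` is resolvable from the PATH.
            commandLine("npm", "--version")
            standardOutput = output
        }
        return output.toString().trim()
    }
}

// The provider is lazy: npm only runs when the value is actually requested.
val npmVersion = providers.of(NpmVersionValueSource::class) {}

tasks.register("printNpmVersion") {
    doLast {
        println("npm version: ${npmVersion.get()}")
    }
}
```

The same shape could in principle wrap other npm invocations, but parallel execution and checking whether the executables are up to date are left out here, as in the comment above.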
Currently the npmInstall and npmSetup tasks are not cacheable by the Gradle build cache, so when CI pipelines run on different machines those tasks need to run in full every time, which consumes a lot of time.
Please consider making those tasks cacheable. Instructions can be found here:
https://docs.gradle.org/current/userguide/build_cache.html#enable_caching_of_non_cacheable_tasks