
ScalaJSWorkerImpl.ScalaJSLinker can cause OOM #4584

Open
HollandDM opened this issue Feb 18, 2025 · 11 comments
HollandDM commented Feb 18, 2025

My company has about 10 ScalaJSModules, each of them quite big on its own. For simplicity in CI/CD, we create a single task that runs fullLinkJS on all 10 modules sequentially. Between runs the LinkInput does not change, so ScalaJSWorkerImpl keeps reusing the same cached Linker and IRFileCache.Cache.
This causes OOM on our server, because Scala.js's IRFileCache.Cache never evicts anything on its own, so the amount of cached file data held in memory grows too large.
One easy workaround is to run fullLinkJS for each module in its own Mill no-server process, but this makes our CI/CD pipeline more complicated.
The IRFileCache.Cache implementation in Scala.js does provide a free method which could mitigate this problem. Can we somehow get access to that method? If not, can you provide another solution for this problem?
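For reference, the shape of the Scala.js linker API in question (a sketch against the Scala.js 1.x linker interface; names from org.scalajs.linker):

```scala
import org.scalajs.linker.StandardImpl
import org.scalajs.linker.interface.IRFileCache

// Global factory, analogous to the one ScalaJSWorkerImpl holds onto:
val irFileCache: IRFileCache = StandardImpl.irFileCache()

// Each linker gets its own per-run cache:
val cache: IRFileCache.Cache = irFileCache.newCache

// ... cache.cached(irContainers) is used while linking ...

// free() releases this cache's pins on IR files, letting the global
// IRFileCache drop entries that no other Cache still references:
cache.free()
```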

lolgab (Member) commented Feb 18, 2025

One solution could be to stop caching the fullLinkJS linkers since usually you don't do incremental linking there.

lihaoyi (Member) commented Feb 19, 2025

I'm not sure: is it true that fullLinkJS doesn't have incremental linking? If so, why does Scala.js offer caches at all? Perhaps @sjrd could chip in on what we should do in Mill if the caches are getting too large?

sjrd (Contributor) commented Feb 19, 2025

fullLinkJS does have incremental linking, because it's the same pipeline as fastLinkJS. However, in practice we rarely invoke fullLinkJS several times in a row with small changes. So incrementality can be thrown away without much impact.

You can configure the linker to throw away its incremental state on every run with linkerConfig.withBatchMode(true). In an sbt project, configuring that for fullLinkJS would look like:

```scala
Compile/fullLinkJS/scalaJSLinkerConfig ~= { _.withBatchMode(true) }
```
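For comparison, the same knob is available when driving the linker programmatically, as a build tool like Mill does (a sketch against the Scala.js 1.x linker interface):

```scala
import org.scalajs.linker.StandardImpl
import org.scalajs.linker.interface.StandardConfig

// fullLinkJS-style configuration with incremental state disabled:
val config = StandardConfig()
  .withOptimizer(true)        // run the Scala.js optimizer
  .withClosureCompiler(true)  // fullLinkJS also runs Closure
  .withBatchMode(true)        // throw away incremental state each run
val linker = StandardImpl.clearableLinker(config)
```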

lihaoyi (Member) commented Feb 20, 2025

Thanks @sjrd. @lolgab it seems the solution here would be to let the user turn off incrementality with withBatchMode(true)? I think that would be preferable over hard-coding it off by default.

lolgab (Member) commented Feb 20, 2025

@lihaoyi The only problem is the usual lack of scopes we have in Mill. In sbt you have fastLinkJS and fullLinkJS scopes and can give each its own settings. We don't have that in Mill; we can have def scalaJSFullLinkJSBatchMode: T[Boolean].
Maybe scopes (or something with their power) are something you could look at, now that you are revamping many things in Mill. Their absence is limiting with Scala.js.

lihaoyi (Member) commented Feb 20, 2025

Yeah, without per-task scopes the state of the art would be to duplicate the task for fastLinkJS and fullLinkJS modes. So that would mean def scalaJSFullOptLinkerConfig and def scalaJSFastOptLinkerConfig.

Let's go with that for now. Some day we may have fancier stuff as discussed in #4595, but that's far enough into the future that we probably shouldn't wait for it.

lolgab (Member) commented Feb 20, 2025

We don't have a linkerConfig task, but separate settings for every flag.
So it has to be:

```scala
def scalaJSFullLinkJSBatchMode: T[Boolean] = T { true }
```

or

```scala
def scalaJSBatchModeFullLinkJS: T[Boolean] = T { true }
```

fullOpt is the legacy name; the new name is fullLinkJS.

One thing I don't understand is why our SoftReference is not working as intended.
We used SoftReference so the cache could be freed by the JVM when the GC was suffering, but the user is experiencing OOM, which doesn't make sense.

lolgab (Member) commented Feb 20, 2025

@HollandDM Can you try publishing this commit locally: 51bebc7, and see if it makes a difference in your project?

> One thing I don't understand is why our SoftReference is not working as intended.
> We used SoftReference so the cache could be freed by the JVM when the GC was suffering, but the user is experiencing OOM, which doesn't make sense.

It seems that the problem is that we keep a global IRFileCache and never call .free() on the individual IRFileCache.Cache instances. Even though those instances are garbage-collected via SoftReferences, the global cache keeps all the IRFiles in memory, resulting in OOM.
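A minimal illustration of the retention pattern described above (all names are hypothetical; this is not the actual Mill code):

```scala
import scala.ref.SoftReference
import scala.collection.mutable.ArrayBuffer

class Payload(val bytes: Array[Byte]) // stand-in for cached IR files

// Stand-in for the global IRFileCache: holds strong references.
val globalCache = ArrayBuffer.empty[Payload]

def makeCached(): SoftReference[Payload] = {
  val p = new Payload(new Array[Byte](1024 * 1024))
  globalCache += p     // strong reference in the global structure
  new SoftReference(p) // soft reference handed to the per-run cache
}

// Even after the GC clears every SoftReference, each Payload stays
// reachable through globalCache, so nothing is ever reclaimed unless
// something removes it explicitly (the moral equivalent of .free()).
```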

lihaoyi (Member) commented Feb 20, 2025

Having individual def scalaJSFullLinkJSBatchMode: T[Boolean] = T { true } or def scalaJSFastLinkJSBatchMode: T[Boolean] = T { true } flags that people can override sounds reasonable to me
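Hypothetical usage from a user's build, assuming one of the proposed task names lands (neither existed at the time of this discussion):

```scala
import mill._, scalajslib._

object frontend extends ScalaJSModule {
  def scalaVersion = "2.13.16"
  def scalaJSVersion = "1.18.2"

  // Opt out of incremental linker state for release links:
  def scalaJSFullLinkJSBatchMode = T { true }
}
```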

lefou (Member) commented Feb 20, 2025

IMHO, we should move the linking toolchain into its own trait. This follows our previous generalizations done in TestModule, RunModule or AssembleModule. We could then pre-define a fast and a full instance, but users would be free to provide more setups if they need to.

lihaoyi (Member) commented Feb 21, 2025

The issue with using a sub-module for fast and full is that making stuff overridable and then overriding it in submodules is very annoying: you need to replace the object with a trait ScalaJsLinkerModule and a lazy val full = new ScalaJsLinkerModule {}; lazy val fast = new ScalaJsLinkerModule {}. Not the end of the world, and we've done it in a few places, but it's somewhat gnarly and not something we tend to want to do.
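A sketch of that pattern (ScalaJsLinkerModule and its members are hypothetical names; real Mill module instantiation has more ceremony):

```scala
import mill._

// The former single object becomes a trait...
trait ScalaJsLinkerModule extends Module {
  def batchMode: T[Boolean] = T { false }
}

// ...so the enclosing module can pre-define two overridable instances:
lazy val fast: ScalaJsLinkerModule = new ScalaJsLinkerModule {}
lazy val full: ScalaJsLinkerModule = new ScalaJsLinkerModule {
  override def batchMode = T { true }
}
```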
