Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix occasional failed imports due to race conditions #270

Merged
merged 2 commits into from
Mar 9, 2024

Conversation

generalmimon
Copy link
Member

@generalmimon generalmimon commented Mar 6, 2024

Fixes kaitai-io/kaitai_struct#951

Until now, KSC had occasional problems with resolving imported top-level types, reporting unable to find type '<imported>' errors from time to time (see kaitai-io/kaitai_struct#951 for more details and reproduction instructions). It turned out that the importing code ran on multiple threads that were concurrently modifying/reading shared non-thread-safe HashMaps without any synchronization, which resulted in race conditions.

I thought it would be best to eliminate the race conditions by switching to some thread-safe counterpart to mutable.HashMap. After some googling, it turned out that there are two viable options, scala.collection.concurrent.TrieMap and java.util.concurrent.ConcurrentHashMap.

However, TrieMap doesn't seem to be implemented for Scala.js (scala-js/scala-js#4866) and it is final, so we cannot use it as a base class of ClassSpecs instead of mutable.HashMap that we're using now. ConcurrentHashMap (despite being in the java.util.concurrent.* package, suggesting that it would only be available on JVM) surprisingly appears to be available in Scala.js (scala-js/scala-js#1487), so in theory we could use it even in shared/src/... code, but I haven't figured out how to make ClassSpecs extend from it. It must be added that since Scala 2.13, the inheritance from HashMap was deprecated, so we'll have to find another way, but for now it works.

The unsynchronized hash map accesses in the JavaClassSpecs.cached() are in JVM-specific code (jvm/src/... folder), so I've just changed the private relFiles and absFiles fields to use ConcurrentHashMap, thus resolving the race conditions here. (TrieMap would work too - there's probably no practical difference for our use case.)

For the LoadImports.loadImport() method, I've just wrapped the accesses to shared ClassSpecs in a manual synchronized block. According to some on the internet, using synchronized is kind of discouraged due to being error-prone in favor of using existing thread-safe types or immutable types (Scala generally seems to prefer immutable types, but I can't imagine how they could be used in this case), but I believe it's the easiest way to solve the problem here.

@generalmimon generalmimon force-pushed the import-race-conditions branch 2 times, most recently from 37450f8 to 6946bb1 Compare March 7, 2024 10:21
@GreyCat GreyCat self-requested a review March 8, 2024 19:49
Copy link
Member

@GreyCat GreyCat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging into this crazy complicated parallel issue! Read all your investigation like a good detective story — including Python script to generate repro :)

Looks like a very good solution to me!

Fixes kaitai-io/kaitai_struct#951

Until now, KSC had occasional problems with resolving imported top-level
types, reporting `unable to find type '<imported>'` errors from time to
time (see kaitai-io/kaitai_struct#951 for more
details and reproduction instructions). It turned out that the importing
code ran on multiple threads that were concurrently modifying/reading
shared non-thread-safe `HashMap`s without any synchronization, which
resulted in race conditions.

I thought it would be best to eliminate the race conditions by switching
to some thread-safe counterpart to `mutable.HashMap`. After some
googling, it turned out that there are two viable options,
[`scala.collection.concurrent.TrieMap`](https://www.scala-lang.org/api/2.13.13/scala/collection/concurrent/TrieMap.html)
and
[`java.util.concurrent.ConcurrentHashMap`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html).

However, `TrieMap` doesn't seem to be implemented for Scala.js
(scala-js/scala-js#4866) and it is `final`, so
we cannot use it as a base class of `ClassSpecs` instead of
`mutable.HashMap` that we're using now. `ConcurrentHashMap` (despite
being in the `java.util.concurrent.*` package, suggesting that it would
only be available on JVM) surprisingly appears to be available in
Scala.js (scala-js/scala-js#1487), so in
theory we could use it even in `shared/src/...` code, but I haven't
figured out how to make `ClassSpecs` extend from it. It must be added
that since Scala 2.13, the inheritance from `HashMap` was deprecated, so
we'll have to find another way, but for now it works.

The unsynchronized hash map accesses in the `JavaClassSpecs.cached()`
are in JVM-specific code (`jvm/src/...` folder), so I've just changed
the private `relFiles` and `absFiles` fields to use `ConcurrentHashMap`,
thus resolving the race conditions here. (`TrieMap` would work too -
there's probably no practical difference for our use case.)

For the `LoadImports.loadImport()` method, I've just wrapped the
accesses to shared `ClassSpecs` in a manual `synchronized` block.
According to some on the internet, using `synchronized` is kind of
discouraged due to being error-prone in favor of using existing
thread-safe types or immutable types (Scala generally seems to prefer
immutable types, but I can't imagine how they could be used in this
case), but I believe it's the easiest way to solve the problem here.
@generalmimon generalmimon force-pushed the import-race-conditions branch from 6946bb1 to 1a71310 Compare March 8, 2024 21:47
@GreyCat GreyCat merged commit 0c1bcdd into master Mar 9, 2024
3 of 6 checks passed
@GreyCat
Copy link
Member

GreyCat commented Mar 9, 2024

Hey @generalmimon, I went ahead and merged this one, hope you're ok with that :)

@generalmimon generalmimon deleted the import-race-conditions branch March 9, 2024 22:54
@generalmimon
Copy link
Member Author

@GreyCat:

Hey @generalmimon, I went ahead and merged this one, hope you're ok with that :)

Yes, thanks. I just let it sit for a while to see if you would look at it again ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kaitai_struct compiler has sporadic problems importing kaitai files and compiling (python) code
2 participants