Skip to content

8358680: AOT cache creation fails: no strings should have been added #25816

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

iklam
Copy link
Member

@iklam iklam commented Jun 16, 2025

Background: when writing the string table in the AOT cache, we do this:

  1. Find out the number of strings in the interned string table
  2. Allocate Java object arrays that are large enough to store these strings. These arrays are used by StringTable::lookup_shared() in the production run.
  3. Enter safepoint
  4. Copy the strings into the arrays

This bug happened because:

  • Step 1 is not thread safe, so it may be reading a stale version of _items_count
  • JIT compiler threads may create more interned strings after step 1

This PR attempts to fix both issues.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8358680: AOT cache creation fails: no strings should have been added (Bug - P4)

Contributors

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25816/head:pull/25816
$ git checkout pull/25816

Update a local copy of the PR:
$ git checkout pull/25816
$ git pull https://git.openjdk.org/jdk.git pull/25816/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25816

View PR using the GUI difftool:
$ git pr show -t 25816

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25816.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 16, 2025

👋 Welcome back iklam! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jun 16, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 16, 2025
@openjdk
Copy link

openjdk bot commented Jun 16, 2025

@iklam The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Jun 16, 2025

Webrevs

@iklam
Copy link
Member Author

iklam commented Jun 16, 2025

@shipilev I copied the part of your JDK-8352042: [leyden] Parallel precompilation changes from the Leyden repo that implements CompileBroker::wait_for_no_active_tasks().

@iklam
Copy link
Member Author

iklam commented Jun 16, 2025

/contributor add @shipilev

@openjdk
Copy link

openjdk bot commented Jun 16, 2025

@iklam
Contributor Aleksey Shipilev <[email protected]> successfully added.

@vnkozlov
Copy link
Contributor

Can something else add string to table between you get number and safe point? How you avoid that?
I assume you can't get number at safepoint because you need to allocate object array not at safepoint. Right?

@iklam
Copy link
Member Author

iklam commented Jun 16, 2025

Can something else add string to table between you get number and safe point? How you avoid that? I assume you can't get number at safepoint because you need to allocate object array not at safepoint. Right?

Not that I am aware of. In testing, I added precond(THREAD->is_Java_thread() into StringTable::intern() and it never failed. So I assume that this function can only be called by real Java threads or compiler threads.

@iklam
Copy link
Member Author

iklam commented Jun 16, 2025

Can something else add string to table between you get number and safe point? How you avoid that? I assume you can't get number at safepoint because you need to allocate object array not at safepoint. Right?

Not that I am aware of. In testing, I added precond(THREAD->is_Java_thread() into StringTable::intern() and it never failed. So I assume that this function can only be called by real Java threads or compiler threads.

Here are all the subclasses of JavaThread:

class AttachListenerThread : public JavaThread {
class CompilerThread : public JavaThread {
class DeoptimizeObjectsALotThread : public JavaThread {
class JvmtiAgentThread : public JavaThread {
class MonitorDeflationThread : public JavaThread {
class NotificationThread : public JavaThread {
class ServiceThread : public JavaThread {
class StringDedupThread : public JavaThread {
class TrainingReplayThread : public JavaThread {

StringDedupThread is disabled when dumping heap objects. And the other threads probably will either be inactive, or won't be doing anything that would result in string interning.

I guess someone could use jcmd to connect to the JVM at the wrong time and that might upset the archive assembly process. But that would create far worse problems than the bug that we are trying to fix here.

@shipilev
Copy link
Member

shipilev commented Jun 16, 2025

@shipilev I copied the part of your JDK-8352042: [leyden] Parallel precompilation changes from the Leyden repo that implements CompileBroker::wait_for_no_active_tasks().

Well, I am rewriting that part in openjdk/leyden#79. I can give you a hunk on top of this PR that would do the same thing: use the CompileTaskWait_lock instead.

@shipilev
Copy link
Member

Actually, I need #25409 in mainline first :)

@iklam
Copy link
Member Author

iklam commented Jun 16, 2025

Actually, I need #25409 in mainline first :)

I can wait for that for JDK 26.

If I want to fix for JDK 25, will the current fix be good enough?

@shipilev
Copy link
Member

Would be nice to avoid divergence between JDK 25, JDK 26 and Leyden/premain.

Anyhow, I now think this fix in incomplete. In premain, we use this wait_for_no_active_tasks, because we know all the compiler tasks were queued, and we just need to run them down. But here, we still have a race: compiler threads may finish current batch of compilations, wait_for_no_active_tasks would return, and then we can start compiling again.

I think we need to figure our when we dump the shared table. Maybe even shutdown the compiler right before going into CDS dump? See how CompileBroker::set_should_block and VM_Exit::wait_for_threads_in_native_to_block do it. It would still be pretty awkward for a fix.

@iklam
Copy link
Member Author

iklam commented Jun 17, 2025

Would be nice to avoid divergence between JDK 25, JDK 26 and Leyden/premain.

Anyhow, I now think this fix in incomplete. In premain, we use this wait_for_no_active_tasks, because we know all the compiler tasks were queued, and we just need to run them down. But here, we still have a race: compiler threads may finish current batch of compilations, wait_for_no_active_tasks would return, and then we can start compiling again.

No Java code is executing at this point (we are in the only thread that can run Java code). Is there still a possibility for new compile tasks to be added?

I think we need to figure our when we dump the shared table. Maybe even shutdown the compiler right before going into CDS dump? See how CompileBroker::set_should_block and VM_Exit::wait_for_threads_in_native_to_block do it. It would still be pretty awkward for a fix.

In Leyden, we run the AOT compiler after the CDS dumping has finished, so if we shut down the compiler we would have to restart it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot [email protected] rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

3 participants