Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8347422: Crash during safepoint handler execution with -XX:+UseAPX #23035

Closed
wants to merge 2 commits into from

Conversation

jatin-bhateja
Copy link
Member

@jatin-bhateja jatin-bhateja commented Jan 10, 2025

This bug fix patch fixes an internal error seen during safepoint handler execution.
The problem occurs due to missing handling for REX2 prefixed polling test instruction.

Manually verified the patch with the -XX:+SafepointALot runtime flag.

Best Regards,
Jatin
PS: Patch will be opened for review after some validation.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8347422: Crash during safepoint handler execution with -XX:+UseAPX (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23035/head:pull/23035
$ git checkout pull/23035

Update a local copy of the PR:
$ git checkout pull/23035
$ git pull https://git.openjdk.org/jdk.git pull/23035/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23035

View PR using the GUI difftool:
$ git pr show -t 23035

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23035.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 10, 2025

👋 Welcome back jbhateja! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 10, 2025

@jatin-bhateja This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8347422: Crash during safepoint handler execution with -XX:+UseAPX

Reviewed-by: kvn, sviswanathan, thartmann

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 216 new commits pushed to the master branch:

  • 2ca1b4d: 8347840: Fix testlibrary compilation warnings
  • 2c41f5a: 8347959: ThreadDumper leaks memory
  • 15e06b6: 8345618: javax/swing/text/Caret/8163124/CaretFloatingPointAPITest.java leaves Caret is not complete
  • 3ab521e: 8347924: Replace SIZE_FORMAT in memory and metaspace
  • 8460072: 8342881: RISC-V: secondary_super_cache does not scale well: C1 and interpreter
  • f5573f5: 8330851: C2: More efficient TypeFunc creation
  • a3eef6c: 8346875: Test jdk/jdk/jfr/event/os/TestCPULoad.java fails on macOS
  • 398a580: 8347911: Limit the length of inflated text chunks
  • aa21de5: 8347481: C2: Remove the control input of some nodes
  • 1f365cc: 8346433: Cannot use DllMain in hotspot for static builds
  • ... and 206 more: https://git.openjdk.org/jdk/compare/84e6432bb73e35b32f12cdc0e1a172b7c973e618...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Jan 10, 2025

@jatin-bhateja The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@jatin-bhateja jatin-bhateja changed the title 8347422: Crash during safepoint handler execution 8347422: Crash during safepoint handler execution with APX Jan 10, 2025
@jatin-bhateja jatin-bhateja changed the title 8347422: Crash during safepoint handler execution with APX 8347422: Crash during safepoint handler execution with -XX:+UseAPX Jan 10, 2025
@jatin-bhateja jatin-bhateja marked this pull request as ready for review January 13, 2025 09:47
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 13, 2025
@mlbridge
Copy link

mlbridge bot commented Jan 13, 2025

Webrevs

@TobiHartmann
Copy link
Member

Hi @jatin-bhateja, please see my questions in JBS. Should we add a regression test for this?

@jatin-bhateja
Copy link
Member Author

jatin-bhateja commented Jan 13, 2025

Hi @jatin-bhateja, please see my questions in JBS. Should we add a regression test for this?

Hi Tobias, To stress the validation I generally give preference to EGPRs by changing the static allocation order defined in AD file. We already have regressions in place for this in the following directory.
https://github.com/openjdk/jdk/tree/master/test/hotspot/jtreg/compiler/c2/cr6340864
Please make sure the use debug build to verify the fix.

@TobiHartmann
Copy link
Member

Thanks Jatin. In this case I would suggest that we should add a VM stress flag to give preference to EGPRs, or maybe JDK-8343294 is good enough?

Do the existing tests trigger this even without -XX:+SafepointALot?

@jatin-bhateja
Copy link
Member Author

jatin-bhateja commented Jan 13, 2025

Thanks Jatin. In this case I would suggest that we should add a VM stress flag to give preference to EGPRs, or maybe JDK-8343294 is good enough?

Technically, randomizing allocation sequence from the same register class is not very useful from the compiler standpoint as it simply ensures that an LRG corresponding to the definition MachOper is never assigned the same register which is already assigned to its neighbors in the interference graph, so it simply picks the next available free register while choosing the color, a change in allocation order will not impact the spilling behavior either, what I am doing is modifying the static allocation order so that we deterministically give preferences to EGPRs over GPRs, this will ensure that all our APX specific assembler support and special handling like the one extended by this patch as exercised throughly.

Do the existing tests trigger this even without -XX:+SafepointALot?

Yes, but with -XX:+SafepointALot crash hit early.

@TobiHartmann
Copy link
Member

TobiHartmann commented Jan 13, 2025

Technically, randomizing allocation sequence from the same register class is not very useful

It's definitely useful to catch bugs where silent register corruption is never detected because that exact register is not used anywhere else. Or for cases where a specific register is not saved although it should be but we don't notice because that register is not used anywhere else. And we had quite a few bugs like this in the past.

Wouldn't such a random register selection also randomly prioritize EGPRs and therefore do the same thing that you did manually? And if not, doesn't that mean that we should definitely add such a stress flag?

Yes, but with -XX:+SafepointALot crash hit early.

Okay, good then.

@jatin-bhateja
Copy link
Member Author

jatin-bhateja commented Jan 13, 2025

Technically, randomizing allocation sequence from the same register class is not very useful

It's definitely useful to catch bugs where silent register corruption is never detected because that exact register is not used anywhere else. Or for cases where a specific register is not saved although it should be but we don't notice because that register is not used anywhere else. And we had quite a few bugs like this in the past.

I see, Thanks @TobiHartmann, scope of JDK- 8343294 can be extended to cover EGPRs to catch such silent bugs, to stress validating all JVM components I am taking a brute force approach to begin with. You can assign JDK-8343294 to me if Daniel is not active on that.

Wouldn't such a random register selection also randomly prioritize EGPRs and therefore do the same thing that you did manually? And if not, doesn't that mean that we should definitely add such a stress flag?

Yes, but with -XX:+SafepointALot crash hit early.

Okay, good then.

Let me know if this looks ok for now.

@dlunde
Copy link
Member

dlunde commented Jan 13, 2025

Hi @jatin-bhateja. Feel free to have a look at JDK-8343294 (and assign yourself). Please ping me for the review when the changeset is ready!

@jatin-bhateja
Copy link
Member Author

jatin-bhateja commented Jan 16, 2025

Hi @TobiHartmann , Please let me know if fix looks ok, JDK-8343294 will be a separate RFE.

Copy link
Member

@TobiHartmann TobiHartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me but I'm not an expert in this area so I leave it to someone else to approve.

/reviewers 2

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 16, 2025
@openjdk
Copy link

openjdk bot commented Jan 16, 2025

@TobiHartmann
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 1 Reviewer, 1 Author).

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jan 16, 2025
Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have few comments.

Comment on lines 3123 to 3129
// is why two byte encoding is sufficient here.
// o For safepoint polling instruction like "test %eax,(%r8)", register encoding of BASE
// register of memory operand is 1000, thus we need additional REX prefix in this case,
// there by adding additional byte to instruction encoding.
// o In case BASE register is one of the 32 extended GPR registers available only on targets
// supporting Intel APX extension, then we need to emit two byte REX2 prefix to hold
// most significant two bits of 5 bit register encoding.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change two byte -> two bytes in two places.


if (VM_Version::supports_apx_f()) {
__ cmpb(Address(rbx, 0), Assembler::REX2);
__ jcc(Assembler::notEqual, check_rex_prefix);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we not using short branch jccb here and in following code? We skip only addptr instruction.

On other hand jccb(Assembler::notEqual, no_adjust) at line 3093 above may be far from target after new code is added. May be use jcc there.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, jccb seems more appropriate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, there were two other occurrences in near vicinity in the same stubs.
Unlike backward jumps, forward jumps are not auto demoted to short variants by assembler.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Copy link

@sviswa7 sviswa7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 17, 2025
@jatin-bhateja
Copy link
Member Author

/integrate

@openjdk
Copy link

openjdk bot commented Jan 17, 2025

Going to push as commit 6cc1c0a.
Since your change was applied there have been 217 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 17, 2025
@openjdk openjdk bot closed this Jan 17, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 17, 2025
@openjdk
Copy link

openjdk bot commented Jan 17, 2025

@jatin-bhateja Pushed as commit 6cc1c0a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler [email protected] integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

5 participants