8335409: Can't allocate and retain memory from resource area in frame::oops_interpreted_do oop closure after 8329665 #20012

Closed
wants to merge 5 commits into from

Contributor

@pchilano pchilano commented Jul 3, 2024

The ResourceMark added in 8329665, which addresses the case of having to allocate extra memory for the _bit_mask, prevents code in the closure from allocating memory from the resource area and retaining it past the closure while relying on some ResourceMark in scope further up the stack from frame::oops_interpreted_do(). There is in fact one case today in JFR code where this kind of allocation happens.
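To make the failure mode concrete, here is a minimal sketch using a toy bump allocator. `ToyResourceArea`, `ToyResourceMark` and `toy_oops_do` are hypothetical stand-ins invented for illustration, not the HotSpot classes. The sketch only models the general mark-and-rollback behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a thread's resource area: a bump allocator whose
// top-of-allocation index can be saved and restored. Not HotSpot code.
class ToyResourceArea {
 public:
  explicit ToyResourceArea(std::size_t capacity) : _buf(capacity), _top(0) {}

  void* allocate(std::size_t bytes) {
    assert(_top + bytes <= _buf.size() && "toy arena exhausted");
    void* p = _buf.data() + _top;
    _top += bytes;
    return p;
  }

  std::size_t top() const { return _top; }
  void rollback_to(std::size_t saved) { _top = saved; }

 private:
  std::vector<unsigned char> _buf;
  std::size_t _top;
};

// Toy stand-in for a ResourceMark: remembers the top on construction and
// rolls back on destruction, releasing everything allocated in between.
class ToyResourceMark {
 public:
  explicit ToyResourceMark(ToyResourceArea& area)
      : _area(area), _saved(area.top()) {}
  ~ToyResourceMark() { _area.rollback_to(_saved); }

 private:
  ToyResourceArea& _area;
  std::size_t _saved;
};

// Plays the role of a function that opens its own mark and then runs a
// caller-supplied closure inside it.
template <typename Closure>
void toy_oops_do(ToyResourceArea& area, Closure closure) {
  ToyResourceMark inner(area);
  closure();
}  // the inner mark rolls back here

// Returns true if memory allocated by the closure is still reserved after
// toy_oops_do() has returned.
bool closure_allocation_survives(ToyResourceArea& area) {
  ToyResourceMark outer(area);  // the mark the closure was relying on
  const std::size_t before = area.top();
  toy_oops_do(area, [&] { area.allocate(64); });
  return area.top() > before;
}
```

In this model the closure's 64 bytes are reclaimed by the inner mark, so the outer mark never gets the chance to keep them alive.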

The number of locals and expression stack entries a method can have before extra memory must be allocated for the _bit_mask is 4*64/2 = 128. This is already big enough that we almost never have to allocate. A test run through mach5 tiers 1-6 shows only a handful of methods that fall into this case, and most are artificial ones created to trigger this condition. So moving the allocation to the C heap shouldn't carry any performance penalty, contrary to what the existing comment says. That comment dates back to 2002, when instead of 128 entries we could have only 32, since 32-bit CPUs were still in mainstream use (see the bug for more history).
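The 128 figure can be checked with a few lines of arithmetic. In the sketch below the function names are hypothetical. The reading of the three factors (4 inline words, 64 bits per word, 2 bits per entry) is an assumption based on the formula above.

```cpp
// Number of locals plus expression stack entries that fit in the inline
// mask: (inline words * bits per word) / bits used per entry.
constexpr int inline_mask_entries(int inline_words, int bits_per_word,
                                  int bits_per_entry) {
  return inline_words * bits_per_word / bits_per_entry;
}

// Values from the description: 4 words of 64 bits, 2 bits per entry.
constexpr int kInlineEntries = inline_mask_entries(4, 64, 2);

// True when a method with this many entries would need a separately
// allocated mask (assumed to be a strict "greater than" comparison).
constexpr bool needs_external_mask(int entries) {
  return entries > kInlineEntries;
}
```

Under these assumptions a method with exactly 128 entries still fits inline, and the 129th entry is what forces an allocation.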

The current code in InterpreterOopMap::resource_copy() has a comment expecting the InterpreterOopMap object to be recently created and empty, but it also has an assert in the allocation case path where it considers the entry might be in use already. This assert actually looks wrong since a used InterpreterOopMap object will not necessarily contain a pointer to resource area memory in _bit_mask[0]. I added an example case in the bug details. In any case, since we don't have any such cases in the codebase, I added an explicit assert to verify that each InterpreterOopMap is used only once.
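As a rough illustration of the "fill once, spill to the C heap when too large" pattern described here, consider the simplified model below. `ToyOopMap` is a hypothetical class written for this note, not the HotSpot implementation. It uses `malloc`/`free` where HotSpot has its own allocation macros, and the single-use check is a plain `assert`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical, simplified model of an oop map with a small inline bit mask
// that spills to the C heap when the mask does not fit inline.
class ToyOopMap {
 public:
  static constexpr int N = 4;  // inline words
  static constexpr int BitsPerWord = sizeof(uintptr_t) * 8;
  static constexpr int small_mask_limit = N * BitsPerWord;

  ToyOopMap() : _mask_size(0), _used(false) {
    std::memset(_bit_mask, 0, sizeof(_bit_mask));
  }

  ~ToyOopMap() {
    if (has_heap_mask()) {
      std::free(reinterpret_cast<void*>(_bit_mask[0]));
    }
  }

  ToyOopMap(const ToyOopMap&) = delete;
  ToyOopMap& operator=(const ToyOopMap&) = delete;

  // Copies a mask of `bits` bits from `src_words`. Single use: a second
  // call on the same object trips the assert in debug builds.
  void copy_from(const uintptr_t* src_words, int bits) {
    assert(!_used && "a ToyOopMap may be filled only once");
    _used = true;
    _mask_size = bits;
    const std::size_t bytes = mask_word_size() * sizeof(uintptr_t);
    if (bits > small_mask_limit) {
      void* heap = std::malloc(bytes);  // too large: use the C heap
      if (heap == nullptr) std::abort();
      std::memcpy(heap, src_words, bytes);
      _bit_mask[0] = reinterpret_cast<uintptr_t>(heap);
    } else {
      std::memcpy(_bit_mask, src_words, bytes);  // fits inline
    }
  }

  bool has_heap_mask() const { return _mask_size > small_mask_limit; }

  int mask_word_size() const {
    return (_mask_size + BitsPerWord - 1) / BitsPerWord;
  }

  const uintptr_t* bit_mask() const {
    return has_heap_mask() ? reinterpret_cast<const uintptr_t*>(_bit_mask[0])
                           : _bit_mask;
  }

 private:
  int _mask_size;          // mask size in bits
  bool _used;              // set on the first copy_from()
  uintptr_t _bit_mask[N];  // inline mask, or a heap pointer in slot 0
};
```

Because the heap copy is owned by the object and released in its destructor, its lifetime no longer depends on any mark that happens to be in scope around the caller.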

I tested the patch by running it through mach5 tiers 1-6.

Thanks,
Patricio


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8335409: Can't allocate and retain memory from resource area in frame::oops_interpreted_do oop closure after 8329665 (Bug - P2)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20012/head:pull/20012
$ git checkout pull/20012

Update a local copy of the PR:
$ git checkout pull/20012
$ git pull https://git.openjdk.org/jdk.git pull/20012/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20012

View PR using the GUI difftool:
$ git pr show -t 20012

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20012.diff

Webrev

Link to Webrev Comment

bridgekeeper bot commented Jul 3, 2024

👋 Welcome back pchilanomate! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Jul 3, 2024

@pchilano This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8335409: Can't allocate and retain memory from resource area in frame::oops_interpreted_do oop closure after 8329665

Reviewed-by: dholmes, stuefe, coleenp, shade

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 75 new commits pushed to the master branch:

  • dcf4e0d: 8335966: Remove incorrect problem listing of java/lang/instrument/NativeMethodPrefixAgent.java in ProblemList-Virtual.txt
  • 1472124: 8333364: Minor cleanup could be done in com.sun.crypto.provider
  • 7e11fb7: 8335688: Fix -Wzero-as-null-pointer-constant warnings from fflush calls in jvmti tests
  • 531a6d8: 8335911: Document ccls indexer in doc/ide.md
  • 0e0dfca: 8330806: test/hotspot/jtreg/compiler/c1/TestLargeMonitorOffset.java fails on ARM32
  • f3ff4f7: 8335882: platform/cgroup/TestSystemSettings.java fails on Alpine Linux
  • 8f62f31: 8335906: [s390x] Test Failure: GTestWrapper.java
  • 2a29647: 8334777: Test javax/management/remote/mandatory/notif/NotifReconnectDeadlockTest.java failed with NullPointerException
  • 564a72e: 8335955: JDK-8335742 wrongly used a "JDK-" prefix in the problemlist bug number
  • 9c7a6ea: 8312125: Refactor CDS enum class handling
  • ... and 65 more: https://git.openjdk.org/jdk/compare/fac74b118f5fda4ec297e46238d34ce5b9be1e21...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk bot commented Jul 3, 2024

@pchilano The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@pchilano pchilano marked this pull request as ready for review July 3, 2024 16:39
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 3, 2024

mlbridge bot commented Jul 3, 2024

Webrevs

Member

@dholmes-ora dholmes-ora left a comment

Thanks for the detailed explanations. A couple of minor (pre-existing) nits but changes are good.

Thanks

"Should not resource allocate the _bit_mask");
assert(from->has_valid_mask(),
"Cannot copy entry with an invalid mask");
// The expectation is that this InterpreterOopMap is a recently created
Member

s/is a recently/is recently/

Contributor Author

Fixed.

Comment on lines 177 to 179
#ifdef ASSERT
_resource_allocate_bit_mask = true;
_used = false;
#endif
Member

Nit pre-existing: use of DEBUG_ONLY would be more consistent with later setting of _used.

Contributor Author

Fixed.

@@ -128,11 +129,12 @@ class InterpreterOopMap: ResourceObj {

public:
InterpreterOopMap();
~InterpreterOopMap();

// Copy the OopMapCacheEntry in parameter "from" into this
// InterpreterOopMap. If the _bit_mask[0] in "from" points to
// allocated space (i.e., the bit mask was to large to hold
Member

Nit pre-existing: s/to/too/

Contributor Author

Fixed.

Comment on lines 135 to 138
// in-line), allocate the space from a Resource area.
// in-line), allocate the space from the C heap.
void resource_copy(OopMapCacheEntry* from);
Member

The name resource_copy seems somewhat of a misnomer given it may be C heap. Is it worth changing?

Contributor

I agree, this should probably be copy_from, and rename the parameter src. Or something like that.

Contributor Author

I also thought about renaming it but ended up leaving it as is in v1. I changed it to Coleen's suggestion.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 4, 2024
Contributor

@coleenp coleenp left a comment

Also a couple of nits, but this looks good. Thanks for tracking down the history and verifying that it's an unusual situation that we were optimizing for.

// a resource area for better performance. InterpreterOopMap
// For InterpreterOopMap the bit_mask is allocated in the C heap
// to avoid issues with allocations from the resource area that have
// to live accross the oop closure (see 8335409). InterpreterOopMap
Contributor

We don't usually put bug numbers in the code and after this change nobody will want to move this back to resource area, so putting the bug number as a caution shouldn't be needed. If one wants to know the details, they can git blame this file.

Contributor Author

Removed.

// a resource area for better performance. InterpreterOopMap
// For InterpreterOopMap the bit_mask is allocated in the C heap
// to avoid issues with allocations from the resource area that have
// to live accross the oop closure (see 8335409). InterpreterOopMap
// should only be created and deleted during same garbage collection.
Contributor

Can you add 'the' to "during the same garbage collection."

Contributor Author

Fixed.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jul 5, 2024
@@ -89,7 +90,7 @@ class InterpreterOopMap: ResourceObj {

protected:
#ifdef ASSERT
bool _resource_allocate_bit_mask;
bool _used;
Contributor

Can you make this a DEBUG_ONLY() too?

Contributor Author

Fixed.

Contributor

@coleenp coleenp left a comment

One tiny nit.

Contributor

@coleenp coleenp left a comment

Perfect, thanks!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 5, 2024
Member

@dholmes-ora dholmes-ora left a comment

Still good.

Member

@shipilev shipilev left a comment

Looks fine. I am tracking this for backport to 21.0.5, which already got the ResourceMark in frame::oops_interpreted_do due to JDK-8329665 backport.

Member

@tstuefe tstuefe left a comment

Small nits, otherwise good. Thanks a lot for fixing.

InterpreterOopMap::~InterpreterOopMap() {
if (mask_size() > small_mask_limit) {
assert(!Thread::current()->resource_area()->contains((void*)_bit_mask[0]),
"The bit mask should be allocated from the C heap");
Member

@tstuefe tstuefe Jul 8, 2024

Arguably, this assert is not needed. In debug builds, we have NMT enabled, and that does a check on os::free.

However, an assert that _bit_mask[0] != 0 would make sense, since free quietly swallows null pointers.

Contributor Author

Fixed. We could have such a case if the mask was never filled due to an invalid bci, so I also improved the conditional.
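The point about free silently accepting null can be sketched as follows. `ToyMaskSlot` and its members are hypothetical names used only for this illustration. The exact conditional adopted in the patch is not shown in this thread, so the guard below is an assumption about its general shape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical holder whose slot stores a heap pointer as an integer,
// loosely modelled on _bit_mask[0]. Not the HotSpot code.
struct ToyMaskSlot {
  uintptr_t slot = 0;         // 0 means "never allocated"
  bool expects_heap = false;  // set once a heap mask has been allocated

  void allocate(std::size_t words) {
    void* p = std::calloc(words, sizeof(uintptr_t));
    if (p == nullptr) std::abort();
    slot = reinterpret_cast<uintptr_t>(p);
    expects_heap = true;
  }

  // Returns true if memory was actually released.
  bool release() {
    if (!expects_heap) {
      return false;  // e.g. the mask was never filled
    }
    // std::free(nullptr) is a no-op, so a missing allocation would go
    // unnoticed without this explicit check.
    assert(slot != 0 && "mask should have been allocated");
    std::free(reinterpret_cast<void*>(slot));
    slot = 0;
    expects_heap = false;
    return true;
  }
};
```

The guard handles the legitimate "never filled" case, and the assert catches the case where an allocation was expected but is missing.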

"Should not resource allocate the _bit_mask");
assert(from->has_valid_mask(),
"Cannot copy entry with an invalid mask");
void InterpreterOopMap::copy_from(OopMapCacheEntry* src) {
Member

Possibly for another RFE: src pointer should be const

Contributor Author

Fixed, should be fine to do it in this PR.

#endif
int _num_oops;
intptr_t _bit_mask[N]; // the bit mask if
DEBUG_ONLY(bool _used;)
Member

Minor nit. This changes memory layout between debug and release builds, and this is used as part of OopMapCache. Not a big concern, but I usually prefer having the same layout between debug and release to test what we ship.

Can't we just assert that mask size == USHRT_MAX?

Contributor Author

Yes, fixed.
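A small sketch of the layout concern and of the sentinel alternative follows. All struct names are hypothetical. Treating `USHRT_MAX` as the "not filled yet" value of the mask size is an assumption taken from the suggestion above, not a statement about the final patch.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Release-style layout: no extra member.
struct LayoutWithoutFlag {
  int      num_oops;
  intptr_t bit_mask[4];
};

// Debug-style layout: one extra bool after the array. On common ABIs the
// trailing bool plus padding makes this struct larger than the one above.
struct LayoutWithFlag {
  int      num_oops;
  intptr_t bit_mask[4];
  bool     used;
};

// Alternative: reuse a field that exists in both builds and reserve one
// value as a sentinel, so debug and release share the same layout.
struct ToyEntry {
  unsigned short mask_size = USHRT_MAX;  // assumed sentinel: not filled yet
  int            num_oops = 0;
  intptr_t       bit_mask[4] = {};

  bool is_unfilled() const { return mask_size == USHRT_MAX; }

  void fill(unsigned short bits, int oops) {
    assert(is_unfilled() && "entry may be filled only once");
    mask_size = bits;
    num_oops = oops;
  }
};
```

The sentinel keeps the single-use check while costing no extra storage in either build.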

assert(_bit_mask[0] != 0, "bit mask was not allocated");
memcpy((void*) _bit_mask[0], (void*) from->_bit_mask[0],
mask_word_size() * BytesPerWord);
memcpy((void*) _bit_mask[0], (void*) src->_bit_mask[0], mask_word_size() * BytesPerWord);
Member

Are the (void*) casts really needed?

Contributor Author

We need them here otherwise we get a compilation error on the conversion from intptr_t to void*. But we don't need them above so I removed those.
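The difference between the two call sites can be reproduced in isolation. The helper names below are hypothetical; the sketch only shows the general C++ rule that an integer holding an address needs an explicit cast, while a real pointer converts to `void*` implicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `words` words between two buffers whose addresses are stored as
// integers, the way a pointer might be kept in an intptr_t array slot.
inline void copy_via_integer_slots(intptr_t dst_slot, intptr_t src_slot,
                                   std::size_t words) {
  // std::memcpy(dst_slot, src_slot, ...) would not compile: an intptr_t
  // does not convert implicitly to void*.
  std::memcpy(reinterpret_cast<void*>(dst_slot),
              reinterpret_cast<const void*>(src_slot),
              words * sizeof(intptr_t));
}

// Copies between real arrays: object pointers convert to void* implicitly,
// so no cast is needed.
inline void copy_arrays(intptr_t* dst, const intptr_t* src,
                        std::size_t words) {
  std::memcpy(dst, src, words * sizeof(intptr_t));
}
```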

// from the C heap as is done for OopMapCache has a significant
// performance impact.
_bit_mask[0] = (uintptr_t) NEW_RESOURCE_ARRAY(uintptr_t, mask_word_size());
_bit_mask[0] = (uintptr_t) NEW_C_HEAP_ARRAY(uintptr_t, mask_word_size(), mtClass);
assert(_bit_mask[0] != 0, "bit mask was not allocated");
Member

The assert can be removed, no? NEW_C_HEAP_ARRAY does a null check by default.

Contributor Author

Right, removed.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jul 8, 2024
Member

@tstuefe tstuefe left a comment

good. thanks!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 8, 2024
Member

@dholmes-ora dholmes-ora left a comment

Still good.

@pchilano
Contributor Author

Thanks for the reviews @dholmes-ora, @coleenp, @shipilev and @tstuefe!

@pchilano
Contributor Author

/integrate

openjdk bot commented Jul 10, 2024

Going to push as commit 7ab96c7.
Since your change was applied there have been 82 commits pushed to the master branch:

  • fb66716: 8331725: ubsan: pc may not always be the entry point for a VtableStub
  • fb9a227: 8313909: [JVMCI] assert(cp->tag_at(index).is_unresolved_klass()) in lookupKlassInPool
  • e6c5aa7: 8336012: Fix usages of jtreg-reserved properties
  • e0fb949: 8335779: JFR: Hide sleep events
  • 537d20a: 8335766: Switch case with pattern matching and guard clause compiles inconsistently
  • a44b60c: 8335778: runtime/ClassInitErrors/TestStackOverflowDuringInit.java fails on ppc64 platforms after JDK-8334545
  • b5909ca: 8323242: Remove vestigial DONT_USE_REGISTER_DEFINES
  • dcf4e0d: 8335966: Remove incorrect problem listing of java/lang/instrument/NativeMethodPrefixAgent.java in ProblemList-Virtual.txt
  • 1472124: 8333364: Minor cleanup could be done in com.sun.crypto.provider
  • 7e11fb7: 8335688: Fix -Wzero-as-null-pointer-constant warnings from fflush calls in jvmti tests
  • ... and 72 more: https://git.openjdk.org/jdk/compare/fac74b118f5fda4ec297e46238d34ce5b9be1e21...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 10, 2024
@openjdk openjdk bot closed this Jul 10, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 10, 2024

openjdk bot commented Jul 10, 2024

@pchilano Pushed as commit 7ab96c7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@pchilano
Contributor Author

/backport :jdk23

openjdk bot commented Jul 15, 2024

@pchilano Could not automatically backport 7ab96c74 to openjdk/jdk due to conflicts in the following files:

  • src/hotspot/share/interpreter/oopMapCache.hpp

Please fetch the appropriate branch/commit and manually resolve these conflicts by using the following commands in your personal fork of openjdk/jdk. Note: these commands are just some suggestions and you can use other equivalent commands you know.

# Fetch the up-to-date version of the target branch
$ git fetch --no-tags https://git.openjdk.org/jdk.git jdk23:jdk23

# Check out the target branch and create your own branch to backport
$ git checkout jdk23
$ git checkout -b backport-pchilano-7ab96c74-jdk23

# Fetch the commit you want to backport
$ git fetch --no-tags https://git.openjdk.org/jdk.git 7ab96c74e2c39f430a5c2f65a981da7314a2385b

# Backport the commit
$ git cherry-pick --no-commit 7ab96c74e2c39f430a5c2f65a981da7314a2385b
# Resolve conflicts now

# Commit the files you have modified
$ git add files/with/resolved/conflicts
$ git commit -m 'Backport 7ab96c74e2c39f430a5c2f65a981da7314a2385b'

Once you have resolved the conflicts as explained above continue with creating a pull request towards the openjdk/jdk with the title Backport 7ab96c74e2c39f430a5c2f65a981da7314a2385b.

Below you can find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit 7ab96c74 from the openjdk/jdk repository.

The commit being backported was authored by Patricio Chilano Mateo on 10 Jul 2024 and was reviewed by David Holmes, Thomas Stuefe, Coleen Phillimore and Aleksey Shipilev.

Thanks!
