Fix memoryleak in FluxOnAssembly #3941
The leak only happens when exceptions are thrown. The reasons why one of our applications is so heavily affected are that:
The exceptions are |
This also fixes an issue with the "counting mechanism", that the "parallel" unit tests were exercising. For example,
While in the new.txt stacktrace, you see
|
Hey, thanks for the description of your issue. I did look into the implementation to map out the things you raise and I don't see an indication why the mentioned implementation details would lead to a leak in any way. I am not saying there are no leaks, just the guess that the [...] The intention of using the [...]
Unless you are able to provide a reproducer for the leak caused by the above I don't think we will be able to apply the proposed change, as it goes against the original design of this functionality as described above.
However, I think the source of the problem is described in your comment about reusing static instances of exceptions and is triggered by:
    final Throwable fail(Throwable t) {
        boolean lightCheckpoint = snapshotStack.isLight();
        OnAssemblyException onAssemblyException = null;
        for (Throwable e : t.getSuppressed()) {
            // ...
        }
        // ...
        t = Exceptions.addSuppressed(t, onAssemblyException);
        // ...
    }
This causes your static exception to accumulate all the checkpoint data. This is a similar issue to #3371. I do not have any solution in mind at this time other than the suggestion to avoid the shared pre-instantiated and static exceptions or avoid using the checkpoint operator in case you're not willing to give that up.
Regarding the |
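The retention mechanism described in this comment can be sketched with plain JDK classes. This is not Reactor's code path: `SharedExceptionGrowth`, `SHARED`, and `failRepeatedly` are hypothetical names, and an `IllegalStateException` stands in for the checkpoint data. The sketch only illustrates that anything attached to a statically reachable `Throwable` stays reachable for as long as that instance does.

```java
public class SharedExceptionGrowth {

    // Hypothetical pre-instantiated exception, reused by every failing call.
    public static final RuntimeException SHARED = new RuntimeException("shared failure");

    // Simulates `times` failures that each attach diagnostic data to the shared instance.
    // Returns how many suppressed entries the shared instance holds afterwards.
    public static int failRepeatedly(int times) {
        for (int i = 0; i < times; i++) {
            // Stand-in for per-failure traceback data; this is not Reactor's OnAssemblyException.
            SHARED.addSuppressed(new IllegalStateException("traceback for failure " + i));
        }
        return SHARED.getSuppressed().length;
    }

    public static void main(String[] args) {
        int before = SHARED.getSuppressed().length;
        int after = failRepeatedly(1_000);
        // SHARED is reachable from a static field, so nothing attached to it can be collected.
        System.out.println("suppressed entries added: " + (after - before));
    }
}
```

Creating a fresh exception per failure avoids this, because the attached data becomes unreachable together with the exception.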
@chemicL, thank you for your response. I found out that I can use AspectJ on the [...] What would you say to a System Property that would enable this behavior via configuration? Completely opt-out. Besides losing some debugging information, what other downside is there?
Some possible options on uncontrolled growth,
|
Setting
|
Unless this becomes more of a burden, I'd prefer not to add a global property to disable the checkpoint and debug agent functionality. As you noted, disabling suppression on your static exception types addresses the concern. We also investigated reactor-netty's static exceptions and they are also disabling suppression. In the Spring Framework codebase there are no static exceptions in use for these reasons and stack capturing is disabled in some of them on the hot path reducing the impact significantly. |
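A minimal sketch of disabling suppression on a static exception type, using the protected four-argument `RuntimeException` constructor from the JDK. `NoSuppressionDemo`, `StaticFailure`, and `attach` are hypothetical names chosen for illustration; how Reactor's own `Exceptions.addSuppressed` treats such an instance is not shown here.

```java
public class NoSuppressionDemo {

    // Hypothetical static exception type; the name is illustrative only.
    public static final class StaticFailure extends RuntimeException {
        public static final StaticFailure INSTANCE = new StaticFailure();

        private StaticFailure() {
            // enableSuppression = false: addSuppressed() becomes a no-op.
            // writableStackTrace = false: no stack trace is captured.
            super("static failure", null, false, false);
        }
    }

    // Tries to attach `times` suppressed exceptions and returns how many were kept.
    public static int attach(int times) {
        for (int i = 0; i < times; i++) {
            StaticFailure.INSTANCE.addSuppressed(new IllegalStateException("extra data " + i));
        }
        return StaticFailure.INSTANCE.getSuppressed().length;
    }

    public static void main(String[] args) {
        System.out.println(attach(1_000)); // prints 0
    }
}
```

The trade-off is the one already noted in this thread: the shared instance no longer carries any per-failure debugging information.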
@chemicL what about limiting the number of objects that can be added to the Map/Set? Is there any real use case where you would expect many items in them? 25? 50? 100? The problem with the current code is that it's extremely difficult for someone impacted by this to figure out what is wrong. My project has been impacted for 3 years and we finally figured it out, only because someone took Java Flight Recorder snapshots in production over the course of a week and luckily stumbled upon the right clues. |
It's really difficult to say what number is adequate as a default limitation. There are users who report expectations of thousands of operators in a reactive chain - should we limit their ability to investigate their chains because we decide on some number? My experience is that such arbitrary limitations are also difficult to uncover without knowing what you're looking for. Perhaps a better approach is to explicitly mention static exceptions in our reference documentation? WDYT? |
Yep, a mention about setting |
In case of enabled debugging or checkpoints, usage of static exceptions in user code can cause memory leaks. A warning note has been added to the debugging section. Related to #3941
@travispeloton I opened a PR for the documentation note. |
Please also take note of reactor/reactor-netty#3529 |
This PR contains one possible fix for a memory leak discovered in FluxOnAssembly.
My colleague, @foo4u, discovered a memory leak in one of our applications, and I believe that it is caused by some code in FluxOnAssembly. The nodesPerId HashMap expects the key to be derived from calling hashCode() on an object, but inside FluxOnAssembly.OnAssemblyException#add(Publisher<?>, Publisher<?>, String, String) the debugger is showing that an AssemblyOp implementation is the object type. None of the AssemblyOp implementations implement hashCode(), so we're getting the integer returned by Object.hashCode() instead.
What I'm observing in a Kotlin Spring Webflux project is that every time we make an HTTP request to a Controller, this code path is being hit three times, which results in three objects being added to the HashMap.
FluxOnAssembly hasn't been changed in three years, and to be honest, nodesPerId and how it is used in this class is odd. It might only be used to increment a counter, but it is also protecting a synchronized block when the root object is accessed. root has a HashSet that also contains the same objects as nodesPerId, and they are traversed in the getMessage() function.
All the AssemblyOp implementations override toString() with something that looks unique, which was probably what the original author intended to base the nodesPerId key on.
Here is an example of three keys, from this PR, attributed to a single HTTP request:
Willing to discuss!
Also! There are some other uses of System.identityHashCode in the codebase, but I don't have insights into them at the moment.
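The difference between an identity-based key and a toString()-based key, as described in this PR, can be sketched in isolation. `IdentityKeyGrowth` and `Op` are hypothetical stand-ins and do not reproduce FluxOnAssembly or any AssemblyOp implementation; the sketch only assumes a type that overrides toString() but not hashCode().

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityKeyGrowth {

    // Hypothetical stand-in for an operator wrapper: stable toString(), no hashCode() override.
    public static final class Op {
        private final String description;

        public Op(String description) {
            this.description = description;
        }

        @Override
        public String toString() {
            return description;
        }
    }

    // Keys each new instance by its identity hash: almost every instance gets its own entry.
    public static int sizeKeyedByIdentity(int requests) {
        Map<Integer, String> nodes = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            Op op = new Op("Mono.map at ExampleController.handle");
            nodes.put(System.identityHashCode(op), op.toString());
        }
        return nodes.size();
    }

    // Keys each new instance by its description: equal descriptions share one entry.
    public static int sizeKeyedByDescription(int requests) {
        Map<String, String> nodes = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            Op op = new Op("Mono.map at ExampleController.handle");
            nodes.put(op.toString(), op.toString());
        }
        return nodes.size();
    }

    public static void main(String[] args) {
        // Identity hashes can occasionally collide, so this is close to, not always exactly, 1000.
        System.out.println("identity keys: " + sizeKeyedByIdentity(1_000));
        System.out.println("description keys: " + sizeKeyedByDescription(1_000));
    }
}
```

Whether the map grows without bound in practice also depends on how long the map itself lives, which this sketch does not model.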