Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multithreading issues in JRuby? #174

Closed
mlartz opened this issue Jul 11, 2017 · 8 comments
Closed

Multithreading issues in JRuby? #174

mlartz opened this issue Jul 11, 2017 · 8 comments

Comments

@mlartz
Copy link

mlartz commented Jul 11, 2017

I've been having some issues with porting my STOMP consumer to JRuby and am currently getting a deadlock (ref stompgem/stomp#142) when I attemp to require 'logging'.

The STOMP gem uses a thread to receive STOMP messages via the network over TLS, and I use a thread pool to process/ack that data to the STOMP client. I've pared the offending code down to a very small consumer, which doesn't replicate the issue (although it might have other issues).

That said, when I require 'logging' in the stripped down consumer (I use logging in my full application), the demo script deadlocks after it receives/acks a few messages. Unfortunately, I've tried to replicate the issue without the STOMP gem (with just threads/thread pool) but have been unable to. So this issue might pertain to any one of JRuby, JRuby-SSL, STOMP (gem), or logging (gem). Mostly, I'm just wondering if 'logging' does anything tricksy on load that might have an effect on threading/locking/IO/SSL under JRuby?

I'm using jruby 9.0.3.0 (2.2.2) 2015-10-21 633c9aa Java HotSpot(TM) 64-Bit Server VM 25.45-b02 on 1.8.0_45-b14 +jit [linux-amd64]

@TwP
Copy link
Owner

TwP commented Jul 13, 2017

Looking at your example in stompgem/stomp#142 (comment), if you remove the require 'logging' line then you no longer see deadlocks? Is that correct?

@mlartz
Copy link
Author

mlartz commented Jul 13, 2017

Exactly. Removing the require 'logging' line causes the script to not deadlock, however there are some other issues, as noted by the author of the STOMP gem. This obviously calls logging into question, but the jstack trace seems to indicate that its some interaction with Jruby, SSLSocket, and IO when both reading and writing to the TLS "socket". Although it gets a little confusing, because somehow the threads/processes get labeled with "diagnotic_context.rb".

@TwP
Copy link
Owner

TwP commented Jul 13, 2017

The only interesting thing the logging framework is doing with threads is in relation to the diagnostic contexts. The goal of that chunk of code is to capture the diagnostic context from the parent thread when a new thread is spawned. The only way to figure out the parent thread is to capture the information when a new thread is created, so that is why new, start, and fork are being monkey patched.

This behavior can be overridden via an environment variable. Would you be willing to run an experiment with your deadlock script and run it with the require 'logging' line present and with the LOGGING_INHERIT_CONTEXT=false environment variable set? I'd like to see if that might be contributing to the deadlock issues you are seeing.

@mlartz
Copy link
Author

mlartz commented Jul 13, 2017

(Un?)Fortunately, I found no difference when running with or without the environment variable.

LOGGING_INHERIT_CONTEXT=false jruby -J-Xmx2048m -S bundle exec ruby bin/deadlock_test

Other than threading, does logging do any monkeypatching with IO objects?

@TwP
Copy link
Owner

TwP commented Jul 13, 2017

(Un?)Fortunately

Ha Ha Ha Ha! 😂

That made me laugh - thank you 🙇‍♂️

The sad news here is that the logging framework is not doing any crazy initialization or anything related to sockets or IO. You have to instantiate appenders and loggers before any actions are taken.

Here is the list of files required by the logging framework:

  • fileutils
  • little-plugger
  • multi_json
  • rbconfig
  • set
  • singleton
  • socket
  • stringio
  • syslog
  • thread
  • yaml

One debugging option here is to replace the require 'logging' with a bunch of require lines for each of the libraries listed above. Remove the require statements one by one to see if one of them might contributing to the issue. Other than that I'm pretty stumped.

Is there any output about which thread is deadlocking and where?

@gmallard
Copy link

Notes from the stomp gem lead:

I cannot recreate a deadlock (or block/hang) using this test code:

https://gist.github.com/gmallard/f7c709bc26d5a16b6db6ebeeea20f18a

Here is a short test log:

https://gist.github.com/gmallard/1641573d0e040ba2dda4328e08e3071a

Investigation of the issue reported to me uncovered another bug - but this one will not recreate.

For reference:

stompgem/stomp#142

@mlartz
Copy link
Author

mlartz commented Jul 17, 2017

Thanks for the debug tips. I was able to replicate a deadlock without any of gem imports, and was later able to replicate the deadlock without logging required, however it took much longer for the deadlock to occur (20 seconds vs 20 minutes). So at this point it seems like logging is not involved, sorry for the non-bug report. Everything is OK!

@TwP
Copy link
Owner

TwP commented Jul 17, 2017

@mlartz thank you so much for the follow up on this issue. Good luck with the deadlock debugging! 👍

@TwP TwP closed this as completed Jul 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants