-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixes a race triggered by a removed wait condition in Xlib 1.7.3.1 (_XReply) #248
base: master
Are you sure you want to change the base?
Conversation
@avallach2000 Now the startup code behaves (again) exactly like before. Only it is a bit easier to follow the code! 😸 Besides the fact that the execution order of start/run of both threads is not guaranteed this is not our issue here! The only way I currently see on LXQt's side is to:
@tsujan Can you please review this also? |
@antis81 Now I don't see libX11 error message, but still see a warning from QSocketNotifier. Then daemon still exists with return value of 1.
Just an assumption: maybe we can disable simultaneous Here's how the main loop looks like after putting some
|
@avallach2000 Thanks for providing the log! 👍 With that information we can now talk to the Xlib dev's. I am sure they can fix it without having to revert the patch (which I assume is correct, but incomplete). @tsujan As far as I understand we cannot actually do anything here as the socket connection is shut down by libx11/libxcb internally in such case. The case where XEvent pointer can actually become NULL is in libxcb - called via |
I didn't want to sound pessimistic, especially when others tried to do something from a different perspective, but that was my opinion from start. It's based on the available evidence but not something I'm 100% sure about. |
@tsujan Actually we are both wrong! 🎉 😆 As @avallach2000 stated earlier
Hence tables just turned and we can actually fix it. |
I won't interfere with this. My opinion is as before. |
@antis81
Somehow it still falls to built-in libX11's |
What about |
And yet this is incorrect: |
@avallach2000 This is what I mean: void Core::wakeX11Thread() {
#if 0
…
#endif
} |
@antis81
The only way to terminate daemon after that is to send SIGKILL. |
Which means the daemon is running just fine. :slight_smile: EDIT: Those "[Critical]" messages you see origin from |
But they are caused by SIGTERM (Ctrl-C). Didn't I confuse you even more? |
Logical: lxqt-globalkeysd is designed to run "forever" in the background (living in lxqt-session). It does not quit until Replacing the void Core::wakeX11Thread()
{
if (currentThread() != qApp->thread()) {
log(LOG_DEBUG, "Core::wakeX11Thread can only be called from main thread");
return;
}
if (mInterClientCommunicationWindow)
… Note that our "listener thread" is always awake when calling this function (Why send/receive events during event handling?). Could you pull and check if it still races? (In best case you should see the log message in debug output and it no longer races.) |
No luck. Never saw the mentioned debug message. Here's the shortest of my tries:
EDIT: I've used fresh config for this, when using the old one it fails after registerClientAction or addClientAction. |
Thanks!
Yeah, because everything happens in Core::start() here and here. The |
That's not very useful, it's too asynchronous:
|
Now we know that Let's put more logging around
|
- XInitThreads (and X object creation) is correctly done in main thread. - No longer segfaults when another daemon is already running.
@avallach2000 After disabling the @tsujan I cannot see any side effects by not calling XFlush explicitly after XSendEvent (plus checkX11Error -> reads back the error pipe). Note that in libX11 XFlush does not lock their internal mutex while XPeekEvent does lock. Since we call XFlush from another thread than XPeekEvent (and the latter also flushes), removing it may actually have fixed our race… |
@antis81
|
Unsure if it helps, but I attached a couple of trace logs created with Second column is relative time, so you can use something like |
@avallach2000 Could you re-check this please? According to what @tsujan found in this comment putting a mutex locker inside "waitForX11Error" now changes the locking behaviour "on our side" and (there's slight chance this has a positive effect for our issue). |
I don't see any changes. Helgrind's output looks practically the same, except for two new |
Thank you again! FYI: That "broadcast" warning is likely from this one. In case you don't know already:
Also looking at the thread setup: Why do we have 4 threads (I'd expect 3)? |
Seems to work. Please test it well so we don't introduce anything harmful!
@avallach2000 Have another one ready for testing. I flipped the "data" mutex logic. Does this reveal anything to the helgrind log? This change is a little "brave"! If it doesn't work I am ready to revert a19fd41 completely. @tsujan Would be also happy to get your feedback on this. |
@antis81 It seems now we have more attempts to unlock non-locked locks, but less recursive write lock warnings. Am I right that you can't reproduce such behavior on your system? |
@avallach2000 Reproducing the behaviour is difficult as it occurs somewhat rare on my system (similar to @tsujan setup). Found a very old (like Qt3 era) article about GLX programming stating this:
Thought about that for some time now since QApplication also utilizes XCB/Xlib. At least it does no harm. |
@avallach2000 Can you "bulk run" this again please? Seems to work on my system. |
It seems to work fine. |
fixes #247
While the original race condition was caused by a call to XSynchronize (in an async context!) the PR carries out quite some additional goodies.
Credits to @stefonarch @avallach200 and @tsujan for great support in sorting this one out!