Next hangs after a while (Guix build) #680
Comments
@jellelicht reported the following strace:
And it keeps looping on SIGUSR1 forever. |
Possibly a starvation issue? Two processes trying to access the same resource? That's the only reason I can think of why it cannot acquire a lock. |
Do you experience hangs yourself?
Possibly after many hours? I do sometimes, but it's hard to reproduce
on my end.
|
I never experience hangs, no. Even if a particular view hangs (which I can reliably do by trying to play any audio...) I can always kill the buffer and load another. |
I can reproduce the hang after 10 minutes using the Guix recipe. |
What is the Guix recipe? |
build-scripts/guix.scm
|
Are you saying that if you use the Guix recipe to build Next it hangs after 10 minutes, but otherwise it hangs after an indeterminate amount of time? |
With Guix, I could reproduce the hang after 10 minutes.
With Quicklisp, I usually don't get hangs. I did in the past but they
may be fixed now.
|
I've updated most Common Lisp Guix packages: now Next should be using approximately the same Common Lisp library versions when built with Guix or Quicklisp. That said, I am still experiencing the issue with Guix.
|
Tried something else: In Guix, sbcl-cffi uses the gnu-build-system |
I have never had Next hang at all. I rarely use it for long under Guix (I start the Guix virtual machine only for precise needs, as it occupies a lot of memory), but under macOS I only quit Next to update to a newer commit. Still, it never hangs. Could the communication between the two processes via dbus be a potential cause? My impression is that all UI events pass through it, so if it blocks, Next would appear to hang. |
I'm talking about Next on master which does not use DBus.
Have you tried it? Does it ever hang for you? I know it hangs for at
least 3 Guix users: @arunisaac, @jellelicht and me :)
|
Well, I am using Next on master. I wasn't aware that it doesn't use DBus any more! |
Thanks for reporting. The issue could then be a race condition that heavily depends on the hardware.
|
What if you run a Guix build of Next in a VM with ssh X forwarding? Does it crash then? |
The VM will help you determine whether it is hardware-dependent (unless you are using passthrough to access the hardware directly). |
Not sure it will help: if it's a matter of CPU speed, this issue can
happen at random. It could happen in a VM too.
|
I don't know enough about virtualization to comment with any certainty. In any case, I would try it |
- Or...? Any idea?
I'm sorry I don't have any useful insight to offer. But, if I remember
correctly, this problem didn't occur in older versions of Next. Would it
be worth bisecting the git commit log?
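
A minimal sketch of what such a bisection could look like, assuming a known-good ref, a known-bad ref, and a check command that exits non-zero when the hang occurs. The function name `bisect_hang` and the check command are hypothetical; since the hang only shows up after several minutes of use, in practice the check would likely be a manual `git bisect good` / `git bisect bad` at each step rather than an automated script.

```shell
# Hypothetical helper, not part of Next: bisect between two refs.
# Usage: bisect_hang GOOD_REF BAD_REF CHECK_CMD
# CHECK_CMD is an assumption: any shell command that exits 0 when the
# build behaves correctly and non-zero when the hang is observed.
bisect_hang() {
    good_ref="$1"
    bad_ref="$2"
    check_cmd="$3"
    # "git bisect start BAD GOOD" marks both endpoints in one step;
    # "git bisect run" then tests each candidate commit with CHECK_CMD.
    git bisect start "$bad_ref" "$good_ref" &&
        git bisect run sh -c "$check_cmd"
    status=$?
    # Always return to the commit that was checked out before bisecting.
    git bisect reset
    return "$status"
}
```

For example, `bisect_hang 1.5.0 master ./check-hang.sh` would search between those two refs, where both the tag name and the script are placeholders to adapt.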
|
By older versions, you mean 1.5?
If it's before the switch to the cl-webkit FFI, yes, this is expected.
What I'm going to try next:
- Make a Guix package including the Quicklisp distribution: this should
work.
- Then "diffoscope" the above package with a Guix package that only
depends on cl-* inputs (the source libraries). This should highlight
the differences.
|
By older versions, you mean 1.5?
Yes.
If it's before the switch to the cl-webkit FFI, yes, this is expected.
Ok.
What I'm going to try next:
- Make a Guix package including the Quicklisp distribution: this should
work.
- Then "diffoscope" the above package with a Guix package that only
depends on cl-* inputs (the source libraries). This should highlight
the differences.
Sure, thank you for your work!
|
And... I just got it to work with Guix!
I've switched to using the cl-* source package directly and that did it!
Why are the SBCL packages causing this issue? SBCL packages in Guix use
ASDF's `compile-bundle-op` operation. I've seen issues with it in the
past (ruricolist/serapeum#42 and
ruricolist/serapeum#55 for instance). These
issues are usually caught at compilation time, but I don't know if we
can be sure that no issue sneaked through unnoticed.
My guess is that compile-bundle-op causes issues with CFFI. But this
could also be a red herring. It would take time to narrow down this bug
more specifically.
In the meantime, Next can go forward using cl-* inputs! Before closing
this bug, we need to merge https://issues.guix.info/issue/41135 in Guix.
|
Congratulations on the fix! |
Is there anything I can do to easily test this already? I'm also not quite sure why a patch to Guix's asdf-build-system has anything to do with the now-used gnu-build-system 🤔 EDIT: or is it because all of the cl-* source packages were kind of 'broken' before? |
This is because I've changed the guix.scm recipe to rely on the
Makefile for building. I'm using the gnu-build-system only for `make`.
The Makefile then takes care of building Next upon its inputs which
I've changed from sbcl-* to cl-*.
I need to fix asdf-build-system to build cl-* packages properly (this
was a long-standing bug in the asdf-build-system).
To test, you need to:
- Apply the Guix patch (v2 or above).
- Apply the pull request patch.
- Build the patched Next as usual from your patched Guix:
```
~/guix/pre-inst-env guix build -f build-scripts/guix.scm
```
|
After more testing, I've narrowed down the issue to sbcl-cl-cffi-gtk: switching to cl-cffi-gtk fixes the issue. I'll report upstream. |
Well, it seems to work (20 minutes in and no hangs). Congrats on the fix! |
Pfeeeeeeew! Thanks for testing, it really made my day (/week/month) :D
|
I tested and it works for me too! Now, I can finally start using Next
full time. Great work! :-)
|
Thanks a lot for reporting!
|
I've built current master, than Here is my system info:
Cheers! |
This is because the Guix patch hasn't been pushed to master yet.
I'll do it tomorrow, stay tuned!
|
Fixed in b24e9a3. Feel free to reopen if the issue persists. |
It seems that Next hangs after a while.
It happened to me after many hours, and it seems when I tried to open a URL externally.
(Could be a red herring.)
Possible causes:
@jellelicht If you have a recipe please share :)