Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable SIGINT handler in pyjlwrap_call #574

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

tkf
Copy link
Member

@tkf tkf commented Sep 19, 2018

This PR makes the following example work:

julia> py"""
       try:
           print("Hit Ctrl-C a few seconds later!")
           $sleep(10)
       except KeyboardInterrupt as err:
           print("Got:", err)
       else:
           print("Got nothing")
       """
Hit Ctrl-C a few seconds later!
^CGot: Julia exception: InterruptException()

Without this PR, the exception is raised in Julia side after the sleep is completed (i.e., no early termination):

julia> py"""
       try:
           print("Hit Ctrl-C a few seconds later!")
           $sleep(10)
       except KeyboardInterrupt as err:
           print("Got:", err)
       else:
           print("Got nothing")
       """
Hit Ctrl-C a few seconds later!
^CGot nothing
ERROR: InterruptException:

@tkf
Copy link
Member Author

tkf commented Sep 19, 2018

So I had to wrap some more ccalls with disable_sigint to make the test work. Let me know if it has to be in sigatomic_begin-sigatomic_end in try-finally (as it looks like that's how it was done in PyCall).

BTW, is it safe to call sigatomic_end in finally? I'm puzzled since the comment in disable_sigint says "Exception unwind sigatomic automatically" and sigatomic_end is not in finally. I looked at JL_SIGATOMIC_END but it doesn't look like idempotent.

@tkf
Copy link
Member Author

tkf commented Sep 19, 2018

Hmm.... I guess the ccalls I wrapped are not enough since almost everything in Python can call anything, even isinstance:

julia> py"""
       from abc import ABC

       jl_println = $println

       class Any(ABC):
           @classmethod
           def __subclasshook__(cls, subclass):
               jl_println("It's like Julia's Any")
               return True
       """

julia> pyisinstance(py"1"o, py"Any")
ERROR: sigatomic_end called in non-sigatomic region

It's cumbersome to wrap almost everything with disable_sigint. Can we define our own ccall that wraps it with disable_sigint (possible by using macro)?

@tkf
Copy link
Member Author

tkf commented Sep 21, 2018

I just go ahead and add @ccall macro to wrap disable_sigint.

It breaks indent everywhere so let me know if it's a way to go and you want the idents to be fixed in this PR. (I really wnat juliafmt for this...)

@@ -110,7 +124,7 @@ function pydecref(o::PyObject)
end

function pyincref_(o::Union{PyPtr,PyObject})
ccall((@pysym :Py_IncRef), Cvoid, (PyPtr,), o)
@pyccall(:Py_IncRef, Cvoid, (PyPtr,), o)
Copy link
Member Author

@tkf tkf Sep 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess Py_DecRef can call __del__ so it should probably be protected but I wonder if Py_IncRef can ever call any Python functions.

https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_dealloc

@tkf
Copy link
Member Author

tkf commented Sep 21, 2018

Hmm... With this change, python -c 'from julia import Julia; Julia(); from julia import Main' still works but python -c 'from julia import Main' gets the error sigatomic_end called in non-sigatomic region.

(Using: JuliaPy/pyjulia@2dc8148 and 0b810b5)

@tkf
Copy link
Member Author

tkf commented Sep 21, 2018

Actually, python -c 'from julia import Julia; Julia().eval("identity")(0)' is enough to get sigatomic_end called in non-sigatomic region, because PyJulia is not in the sigatomic_begin--sigatomic_end context. Started to feel like python-jl CLI JuliaPy/pyjulia#200 is the best approach to run PyJulia...

@tkf
Copy link
Member Author

tkf commented Sep 21, 2018

Calling jl_sigatomic_begin at the end of libjulia initialization works:

$ python -c 'from julia import Main; Main.println("Sleeping..."); Main.sleep(10)'
Sleeping...
^CTraceback (most recent call last):
  File "<string>", line 1, in <module>
KeyboardInterrupt: Julia exception: InterruptException()

# context.
macro ccall(args...)
quote
disable_sigint() do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a higher-order function call to every ccall; have you checked whether it makes a measurable difference in performance? (Hopefully it all gets inlined, but…)

I really wish this would get fixed in Julia itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, JuliaLang/julia#2622 is closed, does that mean we don't need this anymore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, now I vaguely remember JuliaLang/julia#16174 — we still need this for any ccall in that might call back to Julia code in which we want to catch interrupt exceptions, I think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should clarify this, in any case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean a comment like this?:

Although #2622 is closed, ccall still does not automatically disable SIGINT.

have you checked whether it makes a measurable difference in performance?

No, but the code without it was unsafe. It manifested easily because I added reenable_sigint so that julia runtime complains. Wasn't it needed to avoid setfault like #15?

Having said that, there are places where we can merge disable_sigint and do it in a batch while using ccalls. If we detect performance regression I guess we can use such tricks. I need to look at benchmarks/ code to do the check.

@yuyichao
Copy link
Collaborator

I'm confused by what this fixes. AFAICT, there shouldn't be any need for any disable/enable_sigint calls. They should only be thrown in julia code and whenever julia code runs.

@yuyichao
Copy link
Collaborator

Also, just tested locally, aborting in the code you have in the top post works just fine from julia REPL.

@yuyichao
Copy link
Collaborator

yuyichao commented Sep 23, 2018

julia> py"""
       try:
           print("Hit Ctrl-C a few seconds later!")
           $sleep(10)
       except KeyboardInterrupt as err:
           print("Got:", err)
       else:
           print("Got nothing")
       """
^CERROR: LoadError: InterruptException:
Stacktrace:
 [1] @py_str(::LineNumberNode, ::Module, ::Any, ::Vararg{Any,N} where N) at /home/yuyichao/.julia/packages/PyCall/rUul9/src/pyeval.jl:200
in expression starting at REPL[4]:1

@tkf
Copy link
Member Author

tkf commented Sep 24, 2018

@yuyichao You need to be a bit patient and wait until you see the message Hit Ctrl-C a few seconds later! from Python and then maybe a few subseconds more until the Julia-Python callback is JIT'ed and finally Julia's sleep is called. Also, please see that in my example Got: Julia exception: InterruptException() is from Python's except KeyboardInterrupt as err block; it's not Julia's stacktrace.

My example uses PyCall.jl's mechanism to translate Julia's InterruptException to Python's KeyboardInterrupt:

pyexc[InterruptException] = @pyglobalobjptr :PyExc_KeyboardInterrupt

However, IIUC, in order for this to work, SIGINT handler has to be enabled during Julia's sleep.

@yuyichao
Copy link
Collaborator

ccall still does not automatically disable SIGINT.

It should not.

OK, so PyCall is still using private API to disable sigint, it should just use disable_sigint instead. No try-catch is needed. The ccall to handle this is manually inlined and performance hit of this is definitely much smaller than the try catch by at least an order of magnitude.

Note that reenable_sigint does not guarantee that the sigint delivery is re-enabled, it merely undo one level of disable_sigint. It should be possible to add another api to ensure sigint is enabled in a block of code.

@tkf
Copy link
Member Author

tkf commented Sep 24, 2018

No try-catch is needed.

Thanks for the clarification.

It should be possible to add another api to ensure sigint is enabled in a block of code.

That would be great. Is there a Julia issue tracking it? Shall I open one?

I have one more question. When using libjulia, do I need to call jl_sigatomic_begin after calling jl_init_with_image? Without this, it was impossible to call Julia function wrapped by PyCall using PyJulia (as commented above #574 (comment)). The real code is here: tkf/pyjulia@0bb3f5e

My another related worry is that if it is possible to get back to the "0-th level", even all Julia packages I use calls disable_sigint and never raw sigatomic_*. If it happens, is it possible to detect it (so that I can go back to the correct level by calling jl_sigatomic_begin once more)?

@yuyichao
Copy link
Collaborator

Shall I open one?

Sure.

When using libjulia, do I need to call jl_sigatomic_begin after calling jl_init_with_image?

If you don't want julia callbacks to throw across python frame than yes.

If it happens, is it possible to detect it (so that I can go back to the correct level by calling jl_sigatomic_begin once more)?

Not sure if I understand this...

@tkf
Copy link
Member Author

tkf commented Sep 24, 2018

When I was asking that question, I was thinking that something like the following would fail in the way Python cannot control:

from julia import Julia
jl = Julia()
evil_func = jl.eval("""
function evil_func()
    Base.sigatomic_end()
end
""")
evil_func()

However, I just tried it and found that Julia notices the mistake in evil_func and throws Julia exception which then converted to Python one by PyCall.

@tkf
Copy link
Member Author

tkf commented Sep 26, 2018

Hmm... A puzzling thing is that this PR only works in "interactive session". Is exception handler available only in REPL? Note that calling ccall(:jl_exit_on_sigint, Nothing, (Cint,), 0) in hitme.jl didn't fix it (jl_exit_on_sigint(0) does nothing in julia -e 'include("hitme.jl")' anyway, IIUC).

$ cat hitme.jl
@show Base.JLOptions().handle_signals
using PyCall
py"""
$sleep(0)
try:
    print("Hit Ctrl-C!")
    $sleep(10)
except KeyboardInterrupt as err:
    print("Got:", err)
else:
    print("Got nothing")
"""

$ julia < hitme.jl  # "interactive session"
(Base.JLOptions()).handle_signals = 1
1
Hit Ctrl-C!
^CGot: Julia exception: InterruptException()

$ julia -e 'include("hitme.jl")'
(Base.JLOptions()).handle_signals = 1
Hit Ctrl-C!
^Cfatal: error thrown and no exception handler available.
InterruptException()
jl_run_once at /buildworker/worker/package_linux64/build/src/jl_uv.c:185
process_events at ./libuv.jl:98 [inlined]
wait at ./event.jl:246
task_done_hook at ./task.jl:309
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2182
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1536 [inlined]
finish_task at /buildworker/worker/package_linux64/build/src/task.c:233
start_task at /buildworker/worker/package_linux64/build/src/task.c:276
unknown function (ip: 0xffffffffffffffff)
julia -e 'include("hitme.jl")'  6.76s user 0.75s system 70% cpu 10.716 total

$ julia hitme.jl
(Base.JLOptions()).handle_signals = 1
Hit Ctrl-C!
^C
signal (2): Interrupt
in expression starting at /home/takafumi/.julia/dev/PyCall/hitme.jl:3
syscall at /usr/lib/libc.so.6 (unknown line)
uv__epoll_wait at /buildworker/worker/package_linux64/build/deps/srccache/libuv-ed3700c849289ed01fe04273a7bf865340b2bd7e/src/unix/linux-syscalls.c:321
uv__io_poll at /buildworker/worker/package_linux64/build/deps/srccache/libuv-ed3700c849289ed01fe04273a7bf865340b2bd7e/src/unix/linux-core.c:275
uv_run at /buildworker/worker/package_linux64/build/deps/srccache/libuv-ed3700c849289ed01fe04273a7bf865340b2bd7e/src/unix/core.c:360
process_events at ./libuv.jl:98 [inlined]
wait at ./event.jl:246
task_done_hook at ./task.jl:309
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2182
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1536 [inlined]
finish_task at /buildworker/worker/package_linux64/build/src/task.c:233
start_task at /buildworker/worker/package_linux64/build/src/task.c:276
unknown function (ip: 0xffffffffffffffff)
unknown function (ip: 0xffffffffffffffff)
Allocations: 16558214 (Pool: 16554931; Big: 3283); GC: 36
julia hitme.jl  6.90s user 0.55s system 71% cpu 10.416 total

@tkf
Copy link
Member Author

tkf commented Sep 26, 2018

It was because my startup.jl starts some @async tasks. Passing --startup-file=no to the above commands makes them work.

see: JuliaLang/julia#29369

@tkf
Copy link
Member Author

tkf commented Sep 28, 2018

Here is another problem (I'm wondering if it could be a bug in julia). With this branch, evaluating

import DiffEqBase  # it was necessary somehow...
show(IOBuffer(), "text/plain", @doc Base.Enum)

via PyJulia segfaults at jl_compile_linfo (but @doc Array works).

Here is the reproducible environment which can be run by the following commands (note: it does not work in Debian JuliaPy/pyjulia#185):

git clone https://gist.github.com/tkf/5f14bcc6957641be44f3bdba24e83a8c segfault
cd segfault
make

Here is the stacktrace: https://gist.github.com/tkf/048bc3a96576da7957880094cfeeeec8

I split the change in smaller patches and ran git bisect. It finds that this patch defining ^(a::PyObject, b::PyObject) is the first change that causes the segfault. I have no idea why this is related to the above code causing the segfault. Just as a sanity check, I expanded the @pyccall macro manually but it didn't make any difference. I also checked with julia master (47668c82c4efaec9fbbf4e790ff4d36f1c003c06) but the result was the same (segfault at the same commit). At this point, I'm a very puzzled that simply wrapping ccall with disable_sigint causes a segfault, especially because it is supposed to be a safer way to call C functions.

@yuyichao @stevengj Does it make sense to report it to JuliaLang/julia? Is it possible that this is a bug in (lib)julia?

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

Note 1: There is a patch JuliaLang/julia#31191 mentioning JuliaLang/julia#29498 so maybe the situation is better now?

Note 2: I don't remember if I tried it, but maybe changing

pyjlwrap_call(self_::PyPtr, args_::PyPtr, kw_::PyPtr) =
_pyjlwrap_call(unsafe_pyjlwrap_to_objref(self_), args_, kw_)

to

pyjlwrap_call(self_::PyPtr, args_::PyPtr, kw_::PyPtr) =
    Base.invokelatest(_pyjlwrap_call, unsafe_pyjlwrap_to_objref(self_), args_, kw_)

makes sense? Note that we are doing Base.invokelatest inside _pyjlwrap_call ATM:

function _pyjlwrap_call(f, args_::PyPtr, kw_::PyPtr)
args = PyObject(args_) # don't need pyincref because of finally clause below
try
jlargs = julia_args(f, args)
# we need to use invokelatest to get execution in newest world
if kw_ == C_NULL
ret = Base.invokelatest(f, jlargs...)

@tkf
Copy link
Member Author

tkf commented May 1, 2019

TODO: #683 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants