Call Py_FinalizeEx() when process exits #187

Draft: wants to merge 9 commits into base: master

@snickell snickell commented Aug 8, 2024

Fixes #186

Currently, if you use PyCall from a "side thread" (i.e. not the main thread), the process does not exit when you try to exit it. See #186 for more information.

With this PR, the "side thread" can manually call PyCall.finalize before it exits, and the main thread will then exit properly. Unfortunately, it is not possible to call PyCall.finalize automatically in a side thread, because at_exit only runs on the main thread and Ruby has no thread.on_exit handler.
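Because there is no per-thread exit hook, one way a side thread could make sure the call always happens is to wrap its body in `ensure`. The following is a minimal sketch, not part of this PR: `thread_with_cleanup` is a hypothetical helper, and the cleanup lambda only records an event so the snippet runs without Python. In real use, with this PR applied, the cleanup would call `PyCall.finalize`.

```ruby
# Hypothetical helper, not part of PyCall: runs `cleanup` on the side thread
# itself when the thread body finishes, whether it returns or raises.
def thread_with_cleanup(cleanup, &body)
  Thread.new do
    begin
      body.call
    ensure
      cleanup.call
    end
  end
end

events = []
# With this PR applied, the cleanup would be `-> { PyCall.finalize }`.
cleanup = -> { events << [:cleanup, Thread.current == Thread.main] }

side_thread = thread_with_cleanup(cleanup) do
  events << [:work, Thread.current == Thread.main]
end
side_thread.join
```

The second element of each event records whether the code ran on the main thread. Both are `false` here, so the cleanup runs on the same side thread that did the work.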

This PR:

  1. adds PyCall.finalize(), which calls Py_FinalizeEx()
  2. automatically calls PyCall.finalize() at_exit, if initialized on the main thread
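One reading of these two items, sketched in pure Ruby so it runs without Python: `ModelPython` is a hypothetical stand-in, not PyCall's implementation, and its `start` and `finalize` methods only track state. The sketch assumes the exit hook is registered at initialization time and only when initialization happens on the main thread.

```ruby
# Stand-in for the interpreter bindings; this is NOT PyCall's implementation.
module ModelPython
  @state = :uninitialized
  @exit_hook_registered = false

  class << self
    attr_reader :state, :exit_hook_registered

    # Models initialization. The exit hook is only registered when this runs
    # on the main thread (assumption based on item 2 above).
    def start
      return unless @state == :uninitialized
      @state = :initialized
      return unless Thread.current == Thread.main
      at_exit { finalize }
      @exit_hook_registered = true
    end

    # Models PyCall.finalize; in the PR this is where Py_FinalizeEx() is called.
    # Returns true only for the call that actually finalizes.
    def finalize
      return false unless @state == :initialized
      @state = :finalized
      true
    end
  end
end

# Initializing from a side thread: no exit hook, so finalize must be explicit.
Thread.new { ModelPython.start }.join
hook_registered = ModelPython.exit_hook_registered
first_call = ModelPython.finalize
second_call = ModelPython.finalize
```

Making `finalize` idempotent in the model means an explicit call from a side thread and a later automatic call would not conflict. Whether the real `PyCall.finalize` behaves this way is not stated in this PR.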

A secondary advantage of this PR: it frees Python's memory before exit, which may make valgrind and other memory-debugging tools easier to use.

Example using this PR:

#!/usr/bin/env ruby

side_thread = Thread.new do
  require 'pycall'
  PyCall.import_module('sys')

  PyCall.finalize # if this line is commented out, the process will hang on exit
end
side_thread.join

#=> process exits!

Without this PR (equivalent to commenting out PyCall.finalize above), the process never exits, even after both threads have finished.

@snickell snickell changed the title at_exit: call Py_FinalizeEx() Call Py_FinalizeEx() when process exits Aug 8, 2024
@snickell snickell marked this pull request as ready for review August 8, 2024 10:10

snickell commented Aug 8, 2024

This may still have issues with pandas 🐼: after finalize has run, the process segfaults on exit, after the at_exit handlers have finished. I need to figure out how to debug this in lldb.

@snickell snickell marked this pull request as draft August 8, 2024 11:48

snickell commented Aug 11, 2024

It might be necessary to unregister gc objects before calling Py_FinalizeEx. These objects are destroyed automatically at process exit, but by then their (PyObject *) pointers have already been invalidated by Py_FinalizeEx, so the process segfaults at exit.
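The ordering problem can be modelled in pure Ruby. This is a sketch under stated assumptions, not PyCall's C code: `ModelInterpreter` and all of its methods are hypothetical, integer handles stand in for (PyObject *) pointers, and a raised exception stands in for the segfault.

```ruby
# Pure-Ruby model with hypothetical names; it does not mirror PyCall's C code.
class ModelInterpreter
  attr_reader :log

  def initialize
    @alive = true
    @next_id = 0
    @handles = {} # handle id => label; stands in for live (PyObject *) pointers
    @log = []
  end

  def wrap(label)
    @next_id += 1
    @handles[@next_id] = label
    @next_id
  end

  # What a wrapper's destructor does. Raising here stands in for the segfault
  # caused by touching a pointer after the interpreter is gone.
  def release(id)
    raise "use after finalize: handle #{id}" unless @alive
    label = @handles.delete(id)
    @log << "released #{label}" if label
  end

  # Release every outstanding handle first, then shut down.
  def finalize
    @handles.keys.each { |id| release(id) }
    @alive = false
  end

  # Destructor variant that becomes a no-op once the interpreter is finalized.
  def safe_release(id)
    release(id) if @alive
  end
end

interp = ModelInterpreter.new
handle = interp.wrap("pandas.DataFrame")
interp.finalize
late_result = interp.safe_release(handle) # exit-time destructor, now harmless
unsafe_error = begin
  interp.release(handle)
  nil
rescue RuntimeError => e
  e.message
end
```

In the model, the unguarded `release` fails after `finalize`, while the guarded variant does nothing. This corresponds to either unregistering objects before Py_FinalizeEx or making the exit-time destructors skip objects that have already been invalidated.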

Investigating Destructors

When a Ruby-refs-Python object is destroyed by Ruby:

PyCall.gcguard_table (class is gcguard_data_type in C)

Initialized when pycall.so starts:

  • pycall_init_gcguard()
    • PyCall.gcguard_table = gcguard_new()
      • gcguard_new()
        • TypedData_Make_Struct(0, struct gcguard, &gcguard_data_type, gg)
        • gg->guarded_objects = st_init_numtable() # a "Hash"

When PyCall.gcguard_table is destroyed by Ruby:

  • gcguard_data_type.function.dfree()
    • gcguard_free(gcguard *gg)
      • st_free_table(gg->guarded_objects)
      • PyCall.gcguard_table = nil

When a Python-refs-Ruby object is destroyed by Python:

  • PyRuby_Type.tp_dealloc()
    • PyRuby_dealloc_with_gvl()
      • PyRuby_dealloc()
        • pycall_gcguard_unregister_pyrubyobj()
          • pycall_gcguard_delete(PyObject *pyobj)
            • gcguard = rb_ivar_get(mPyCall, id_gcguard_table)
            • gcguard_delete(gcguard, pyobj)
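The table lifecycle traced above can be approximated in Ruby. This sketch assumes the table exists to keep Ruby objects reachable while Python holds a reference to them, which is inferred from the name rather than stated in these notes. `ModelGuardTable` and its methods are hypothetical, and a plain integer stands in for the PyObject address used as the key.

```ruby
# Pure-Ruby model of a guard table; names are hypothetical, and the real table
# is a C st_table created with st_init_numtable().
class ModelGuardTable
  def initialize
    @guarded_objects = {} # integer key (stands in for a PyObject address) => Ruby object
  end

  # Keep `rb_obj` reachable for as long as the entry exists.
  def register(py_addr, rb_obj)
    @guarded_objects[py_addr] = rb_obj
  end

  # Models the delete step reached from the Python-side dealloc path.
  def delete(py_addr)
    @guarded_objects.delete(py_addr)
  end

  def guarded?(py_addr)
    @guarded_objects.key?(py_addr)
  end

  def size
    @guarded_objects.size
  end
end

table = ModelGuardTable.new
callback = proc { :called_from_python }
table.register(0x1000, callback)
guarded_before = table.guarded?(0x1000)
removed = table.delete(0x1000)
guarded_after = table.guarded?(0x1000)
```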

pycall_gcguard_register() does not appear to be used (?)

pycall_gcguard_register() registers weak-refs to Python objects, which call pycall_gcguard_delete() when they are destroyed. It does not appear to be used, but I have documented it anyway to be careful.

  • pycall_gcguard_register(PyObject *pyobj)
    • wref = Py_API(PyWeakref_NewRef)(pyobj, weakref_callback_pyobj);
    • pycall_gcguard_aset(wref, obj)
    • Later, when pyobj is garbage collected by python it will run:
      • weakref_callback_pyobj
        • gcguard_weakref_destroyed()
        • pycall_gcguard_delete(PyObject *weakref)
          • gcguard = rb_ivar_get(mPyCall, id_gcguard_table)
          • gcguard_delete(gcguard, pyobj)

Initializing pycall.so registers the weakref_callback_pyobj() callback:

  • pycall_init_gcguard()
    • weakref_callback_pyobj = Py_API(PyCFunction_NewEx)(&gcguard_weakref_callback_def, NULL, NULL);
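The weak-reference path can be sketched the same way. Ruby has no weak reference with a callback that fires deterministically, so `ModelWeakref` and its `collect!` method are hypothetical stand-ins for a Python weak reference and for Python's garbage collector. The sketch follows the trace above in creating one shared callback up front and keying the table by the weak reference.

```ruby
# Pure-Ruby model; ModelWeakref and collect! are hypothetical stand-ins for a
# Python weak reference and for Python's garbage collector.
class ModelWeakref
  def initialize(callback)
    @callback = callback
    @dead = false
  end

  def dead?
    @dead
  end

  # Simulates Python collecting the referent: the callback fires once.
  def collect!
    return if @dead
    @dead = true
    @callback.call(self)
  end
end

guard_table = {} # weakref => guarded Ruby object

# One shared callback, created once at init time, as in the trace above.
weakref_callback = ->(wref) { guard_table.delete(wref) }

# Models the register step: create the weakref, then store the entry under it.
register = lambda do |rb_obj|
  wref = ModelWeakref.new(weakref_callback)
  guard_table[wref] = rb_obj
  wref
end

wref = register.call("ruby-side wrapper")
size_before = guard_table.size
wref.collect!
size_after = guard_table.size
```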

Successfully merging this pull request may close these issues.

PyCall on thread != main: process will not exit