arbiter: Handle SIGCHLD like all other signals + misc signal handling improvements #3148

Open
wants to merge 6 commits into
base: master

Commits on Aug 17, 2024

  1. arbiter: Handle SIGCHLD in normal/main process context

    ... as opposed to in signal context. This is beneficial because it
    means we can safely log messages about, for example, why a worker
    stopped executing.
    
    And since handle_sigchld() logs what it does anyway, don't bother
    printing out that we're handling SIGCHLD. If workers are killed at
    a rapid pace, we won't get as many SIGCHLD signals as there are
    workers killed anyway, since pending signals of the same type are
    coalesced.
    sylt committed Aug 17, 2024
    5b33c01
  2. arbiter: Remove PIPE and only use SIG_QUEUE instead

    Since SIG_QUEUE can be one of the queue types from queue.*, reads
    from it can block as well, removing the need for two different
    data structures.
    sylt committed Aug 17, 2024
    d653ebc
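The first two commits describe a pattern where the signal handler only records which signal arrived and the main loop does the real work. A minimal sketch of that pattern follows. It is not the code from this PR: the choice of `queue.SimpleQueue` and the names `enqueue_signal`, `next_signal`, and `demo` are assumptions made for illustration, and the `handle_sigchld()` body is a placeholder.

```python
import queue
import signal

# Hypothetical stand-in for the arbiter's SIG_QUEUE. SimpleQueue.put() is
# documented as reentrant, which is why it is used in this sketch.
SIG_QUEUE = queue.SimpleQueue()


def enqueue_signal(signum, frame):
    """Signal handler: record which signal arrived and return immediately."""
    SIG_QUEUE.put(signum)


def next_signal(timeout=1.0):
    """Block until a signal is queued; return None if the timeout expires."""
    try:
        return SIG_QUEUE.get(timeout=timeout)
    except queue.Empty:
        return None


def handle_sigchld():
    """Placeholder for the real work, which runs in normal process context."""
    return "reaping workers"


def demo():
    """Deliver SIGCHLD to ourselves and handle it from the main loop side."""
    previous = signal.getsignal(signal.SIGCHLD)
    signal.signal(signal.SIGCHLD, enqueue_signal)
    try:
        signal.raise_signal(signal.SIGCHLD)  # simulate a worker exiting
        signum = next_signal()
        if signum == signal.SIGCHLD:
            return handle_sigchld()
        return None
    finally:
        # getsignal() may return None for handlers not installed from Python.
        signal.signal(
            signal.SIGCHLD,
            previous if previous is not None else signal.SIG_DFL,
        )
```

Because the blocking `get()` doubles as the wait in the main loop, a single queue covers both "which signal arrived" and "wake up now", which is the simplification the second commit message describes.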
  3. arbiter: Use waitpid() facilities to handle worker exit status

    This change is meant to handle the return value of waitpid() in a way
    that is more in line with the man page of said syscall. The changes
    can be summarized as follows:
    
    * Use os.WIFEXITED and os.WIFSIGNALED to determine what caused
      waitpid() to return, and exactly how a worker may have exited.
    
    * In case of normal termination, use os.WEXITSTATUS() to read the exit
      status (instead of using a hand-rolled bit shift). A redundant log
      message was removed in this code path.
    
    * In case of termination by a signal, use os.WTERMSIG() to determine
      the signal which caused the worker to terminate. This was buggy
      before, since the WCOREFLAG (0x80) could cause e.g. a SIGSEGV (code
      11) to be reported as "code 139", meaning "code (0x80 | 11)".
    
    * Since waitpid() is called with neither WSTOPPED nor WCONTINUED,
      there's no need for any os.WIFSTOPPED or os.WIFCONTINUED handling.
    sylt committed Aug 17, 2024
    052448a
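The status decoding described in the third commit can be sketched as follows. This is an illustration rather than the arbiter code itself: `describe_wait_status` and `wait_for_child` are hypothetical helper names, and the message strings are made up.

```python
import os
import signal


def describe_wait_status(status):
    """Turn a raw waitpid() status into a short human-readable string."""
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # WTERMSIG() yields only the signal number, without the core-dump flag.
        signum = os.WTERMSIG(status)
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "killed by %s" % name
    return "unexpected status %d" % status


def wait_for_child(child_action):
    """Fork a child that runs child_action(), then return its raw wait status."""
    pid = os.fork()
    if pid == 0:
        try:
            child_action()
        finally:
            os._exit(0)  # never let the child fall back into the caller's code
    _, status = os.waitpid(pid, 0)
    return status
```

For example, `describe_wait_status(wait_for_child(lambda: os._exit(3)))` gives "exited with code 3". On Linux, where a core-dumping SIGSEGV produces the raw status 139 (0x80 | 11), `describe_wait_status(139)` gives "killed by SIGSEGV" instead of an opaque "code 139".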
  4. arbiter: Reinstall SIGCHLD as required by some UNIXes

    According to the Python signal documentation[1], SIGCHLD is handled
    differently from other signals. Specifically, if the underlying
    implementation resets the SIGCHLD signal handler, then Python won't
    reinstall it (as it does for other signals).
    
    This behavior doesn't seem to exist on either Linux or macOS, but
    perhaps one could argue that it's good practice anyway.
    
    [1] https://docs.python.org/3/library/signal.html
    sylt committed Aug 17, 2024
    b3db5b9
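Reinstalling the handler from inside the handler could look roughly like this. The names `on_sigchld`, `install`, and `calls` are assumptions for illustration and are not taken from the PR.

```python
import signal

calls = []  # records each SIGCHLD seen, for demonstration only


def on_sigchld(signum, frame):
    """Handle SIGCHLD, then re-register in case the OS reset the handler."""
    # Harmless where the handler persists across deliveries; useful on
    # systems whose underlying implementation resets it after delivery.
    signal.signal(signal.SIGCHLD, on_sigchld)
    calls.append(signum)


def install():
    """Install on_sigchld as the process-wide SIGCHLD handler."""
    signal.signal(signal.SIGCHLD, on_sigchld)
```

After `install()`, `signal.getsignal(signal.SIGCHLD)` keeps returning `on_sigchld` across deliveries, regardless of whether the platform resets the handler.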
  5. arbiter: clean up main loop

    * Look up handlers in __init__() to trigger a run-time error early
      on if a handler is missing.
    
    * Since we now know that all handlers exist, we can simplify the main
      loop in arbiter, in such a way that we don't need to call wakeup().
    
    So after this commit, the pipe in arbiter is only used to deliver
    which signal was sent.
    sylt committed Aug 17, 2024
    64387d1
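The idea of resolving handlers in `__init__()` so that a missing one fails at startup can be sketched as below. `MiniArbiter`, `BrokenArbiter`, `SIGNAL_NAMES`, and `dispatch` are hypothetical names for illustration; the real arbiter's attributes and signal list may differ.

```python
import signal


class MiniArbiter:
    """Hypothetical, stripped-down arbiter used only for illustration."""

    SIGNAL_NAMES = ("HUP", "TERM", "CHLD")

    def __init__(self):
        # Resolve every handler up front: a missing handle_<name>() method
        # raises AttributeError here rather than when the signal arrives.
        self.handlers = {
            getattr(signal, "SIG" + name): getattr(self, "handle_" + name.lower())
            for name in self.SIGNAL_NAMES
        }

    def dispatch(self, signum):
        # Every registered signal is known to have a handler at this point.
        return self.handlers[signum]()

    def handle_hup(self):
        return "reload"

    def handle_term(self):
        return "stop"

    def handle_chld(self):
        return "reap"


class BrokenArbiter(MiniArbiter):
    # Declares USR1 without defining handle_usr1(), so construction fails.
    SIGNAL_NAMES = MiniArbiter.SIGNAL_NAMES + ("USR1",)
```

With this layout, `MiniArbiter().dispatch(signal.SIGCHLD)` returns "reap", while `BrokenArbiter()` raises `AttributeError` during construction, long before any signal is delivered.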

Commits on Aug 18, 2024

  1. arbiter: Add Arbiter:wakeup() method

    It accepts an optional "due_to_signal" argument which can be used to
    tell if the wakeup was made because a signal handler needs to be
    executed or not.
    sylt committed Aug 18, 2024
    7ecea2d