Preview: point release v23.1.0 #3
base: master
Conversation
This resolves a `ResourceWarning` caused by the test suite.
This ensures that unclosed resources will be caught in CI.
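A minimal sketch of one way to enforce this — the commit does not say which mechanism it uses, so `conftest.py` and the warning filter below are assumptions:

```python
# conftest.py — hypothetical sketch, not the PR's actual change:
# promote every ResourceWarning to an error so an unclosed socket or file
# fails the test run in CI instead of being silently ignored.
import warnings

warnings.simplefilter("error", ResourceWarning)
```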
not even in error responses!
If all the workers are busy or max connections is reached, new connections will queue in the socket backlog, which defaults to 2048. The `gunicorn.backlog` metric provides visibility into this queue and gives an idea of concurrency and worker saturation. This also adds a distinction between the `timer` and `histogram` statsd metric types, which, although treated the same, can be different; for example, in this case the histogram is not a timer: https://github.com/b/statsd_spec#timers
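For illustration, the wire format from the linked spec separates the two types: timers are tagged `|ms`, histograms `|h`. The datagrams below are made up (host, port, and values are assumptions), but show why `gunicorn.backlog` fits a histogram rather than a timer:

```python
import socket

# Hypothetical datagrams following the statsd wire format from the spec above.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
statsd_addr = ("127.0.0.1", 8125)  # assumed statsd host/port

# A timer measures a duration in milliseconds ...
sock.sendto(b"gunicorn.request.duration:42|ms", statsd_addr)
# ... while a histogram samples an arbitrary distribution, such as the
# number of connections waiting in the backlog (not a duration at all).
sock.sendto(b"gunicorn.backlog:17|h", statsd_addr)
```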
Fix failing lint tests
This reverts commit 4023228. We use sys.exit. On purpose. We should therefore not be catching SystemExit.
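A tiny illustration (not code from the PR) of what catching SystemExit does to an intentional sys.exit():

```python
import sys

try:
    sys.exit(3)         # deliberate exit, e.g. a worker asked to stop
except SystemExit:
    pass                # the exit is silently swallowed
print("still running")  # reached, even though we asked to exit
```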
This is probably wrong. But less wrong than handling *all* BaseException. Reports of this happening may be the result of some *other* async Timeout in the wsgi app bubbling through to us. Gevent docs promise: "[..] if *exception* is the literal ``False``, the timeout is still raised, but the context manager suppresses it, so the code outside the with-block won't see it."
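A small sketch of the quoted behaviour (durations are made up): with the literal `False` as the exception argument, the timeout still interrupts the block, but nothing escapes the `with` statement:

```python
import gevent

with gevent.Timeout(0.1, False):  # exception=False: raise internally, suppress outside
    gevent.sleep(1)               # interrupted after ~0.1 seconds
    print("not reached")
print("execution continues after the with-block")
```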
... as opposed to in signal context. This is beneficial, since it means that we can, in a signal-safe way, print messages about why e.g. a worker stopped its execution. And since handle_sigchld() logs what it does anyway, don't bother printing out that we're handling SIGCHLD. If workers are killed at a rapid pace, we won't get as many SIGCHLD signals as there are workers killed anyway.
Since we can use something from queue.*, we can make it blocking as well, removing the need for two different data structures.
This change is meant to handle the return value of waitpid() in a way that is more in line with the man page of said syscall. The changes can be summarized as follows:
* Use os.WIFEXITED and os.WIFSIGNALED to determine what caused waitpid() to return, and exactly how a worker may have exited.
* In case of normal termination, use os.WEXITSTATUS() to read the exit status (instead of using a hand-rolled bit shift). A redundant log was removed in this code path.
* In case of termination by a signal, use os.WTERMSIG() to determine the signal which caused the worker to terminate. This was buggy before, since the WCOREFLAG (0x80) could cause e.g. a SIGSEGV (code 11) to be reported as "code 139", meaning "code (0x80 | 11)".
* Since waitpid() isn't called with WSTOPPED nor WCONTINUED, there's no need for any os.WIFSTOPPED or os.WIFCONTINUED handling.
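A sketch of that interpretation (an illustrative helper, not the arbiter's actual code):

```python
import os

def describe_waitpid_status(status):
    # Normal termination: read the exit status with the dedicated macro
    # instead of a hand-rolled bit shift.
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    # Termination by a signal: WTERMSIG() does not include WCOREFLAG (0x80),
    # so a SIGSEGV is reported as 11 rather than 139.
    if os.WIFSIGNALED(status):
        return "terminated by signal %d" % os.WTERMSIG(status)
    # waitpid() is not called with WSTOPPED or WCONTINUED here, so stopped
    # or continued children never show up and need no handling.
    return "unexpected status %r" % status
```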
According to the Python signal documentation[1], SIGCHLD is handled differently from other signals. Specifically, if the underlying implementation resets the SIGCHLD signal handler, then Python won't reinstall it (as it does for other signals). This behavior doesn't seem to occur on either Linux or macOS, but perhaps one could argue that reinstalling it is good practice anyway. [1] https://docs.python.org/3/library/signal.html
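A defensive sketch of that practice (not necessarily what the commit does verbatim): the handler simply re-registers itself:

```python
import signal

def handle_sigchld(signum, frame):
    # ... reap children with os.waitpid() ...
    # Per the Python docs cited above: if the underlying implementation
    # resets the SIGCHLD disposition, Python does not reinstall the handler
    # as it does for other signals, so re-register it here just in case.
    signal.signal(signal.SIGCHLD, handle_sigchld)

signal.signal(signal.SIGCHLD, handle_sigchld)
```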
* Look up handlers in __init__() to induce a run-time error early on if something is wrong.
* Since we now know that all handlers exist, we can simplify the main loop in the arbiter, in such a way that we don't need to call wakeup(). So after this commit, the pipe in the arbiter is only used to deliver which signal was sent.
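A rough sketch of the first point (class and signal names are illustrative): resolving handlers once in `__init__()` turns a missing handler into an immediate AttributeError at start-up instead of a failure deep in the main loop:

```python
import signal

class Arbiter:
    SIGNALS = [signal.SIGHUP, signal.SIGCHLD, signal.SIGTERM]

    def __init__(self):
        # Resolve every handler up front: a missing handle_sighup() raises
        # AttributeError here, at construction time, rather than later when
        # the signal actually arrives.
        self.sig_handlers = {
            sig: getattr(self, "handle_" + sig.name.lower())
            for sig in self.SIGNALS
        }

    def handle_sighup(self): ...
    def handle_sigchld(self): ...
    def handle_sigterm(self): ...
```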
It accepts an optional "due_to_signal" argument which can be used to tell if the wakeup was made because a signal handler needs to be executed or not.
@pajod I've updated/synced my two PRs now, so hopefully they should (if they are accepted) merge cleanly!
I tested your integration branch a little bit, and I found a bug in my code when I tested the scenario of sending SIGHUP to the arbiter while running a gthread-type worker. The following hunk needs to be applied:

```diff
diff --git a/gunicorn/workers/gthread.py b/gunicorn/workers/gthread.py
index 196759b8..b47ddaef 100644
--- a/gunicorn/workers/gthread.py
+++ b/gunicorn/workers/gthread.py
@@ -118,8 +118,9 @@ class ThreadWorker(base.Worker):
         return futures.ThreadPoolExecutor(max_workers=self.cfg.threads)
 
     def handle_exit(self, sig, frame):
-        self.alive = False
-        self.method_queue.defer(lambda: None)  # To wake up poller.select()
+        if self.alive:
+            self.alive = False
+            self.method_queue.defer(lambda: None)  # To wake up poller.select()
 
     def handle_quit(self, sig, frame):
         self.thread_pool.shutdown(False)
```

With this in place, I can spam SIGHUP to the arbiter while running a gthread worker. I will update my pull request later today!
The main purpose is to remove complexity from gthread by:
* Removing the lock for handling self._keep and self.poller. This is possible since we now do all such manipulation on the main thread instead. When a connection is done, it posts a callback through the PollableMethodCaller which gets executed on the main thread.
* Having a single event queue (self.poller), as opposed to also managing a set of futures.
This fixes benoitc#3146 (although there are more minimal ways of doing it). There are other more minor things as well:
* Renaming some variables, e.g. self._keep to self.keepalived_conns.
* Removing self-explanatory comments (what the code does, not why).
* Just deciding that the socket is blocking.
* Using time.monotonic() for timeouts in gthread.
Some complexity has been added to the shutdown sequence, but hopefully for good reason: it's to make sure that all already accepted connections are served within the grace period.
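A rough sketch of that pattern with illustrative names (not the PR's actual code): callbacks from worker threads go onto a queue, and a byte written to a pipe wakes the main thread's selector so the callbacks run there, which is why no lock is needed:

```python
import os
import queue
import selectors

class PollableMethodQueue:
    """Defer callables to the main thread; a pipe write wakes its selector."""

    def __init__(self, selector):
        self.r_fd, self.w_fd = os.pipe()
        self.methods = queue.SimpleQueue()
        # The main thread's selector watches the read end of the pipe and
        # dispatches to run_callbacks() when it becomes readable.
        selector.register(self.r_fd, selectors.EVENT_READ, self.run_callbacks)

    def defer(self, callback):
        # Called from worker threads (or signal handlers): queue the callback
        # and poke the pipe so the blocking select() returns.
        self.methods.put(callback)
        os.write(self.w_fd, b"\x00")

    def run_callbacks(self):
        # Runs on the main thread only, so no locking is required.
        os.read(self.r_fd, 1)
        while True:
            try:
                self.methods.get_nowait()()
            except queue.Empty:
                break
```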
Now it's fixed! Sorry for any trouble.
There is no need to produce high-severity logs excessively, while there are ways to reach that point from entirely benign network errors. This partially reverts commit 0b10cba.
Work around greenlet installation on OmniOS v11: libev assumes that a system with inotify.h must be Linux; make autoconf stop offering it.
…'benoitc/pr/3273', 'benoitc/pr/3271', 'benoitc/pr/3275', 'benoitc/pr/3148', 'benoitc/pr/3250', 'benoitc/pr/3264', 'benoitc/pr/3210', 'benoitc/pr/3272' and 'pajod/docs-misc' into integration-v23.1.0
Any ETA on the 23.1.0 bugfix release?
This is where I put together various PRs.
The following PRs merge OK:
* `--reload-extra-files` without `--reload` (benoitc/gunicorn#3271)
These merge conflicts are straightforward to resolve:
I recommend most of the above go into a 23.1.0 bugfix release.
These patches are included to help testing here, albeit not quite ready:
Please do not actually octopus merge though; sequential merges produce clearer output in common git tools. Created using this script: