
We must replace uwsgi by something else #937

Open
regisb opened this issue Nov 14, 2023 · 14 comments
Labels
enhancement Enhancements will be processed by decreasing priority

Comments

@regisb
Contributor

regisb commented Nov 14, 2023

uwsgi is now in maintenance mode: https://uwsgi-docs.readthedocs.io/en/latest/

The project is in maintenance mode (only bugfixes and updates for new languages apis). Do not expect quick answers on github issues and/or pull requests (sorry for that) A big thanks to all of the users and contributors since 2009.

So we should take the opportunity that we are releasing Quince to remove uwsgi from the code base.

@regisb regisb added the enhancement Enhancements will be processed by decreasing priority label Nov 14, 2023
@DawoudSheraz DawoudSheraz self-assigned this Jan 2, 2024
@DawoudSheraz
Contributor

Before jumping into the details, note that the following three terms are very different despite their similar names:

  • WSGI (Web Server Gateway Interface) - a specification that defines the interface between a web server and an application. Open edX backend apps use it via their wsgi.py files.
  • uWSGI - a utility/package for building hosting services compliant with the WSGI standard (this is the uwsgi this issue is about). It offers an HTTP server, proxies, process managers, monitoring, and more.
  • uwsgi - a binary protocol that uWSGI uses to talk to other web servers (like nginx)
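To make the WSGI term concrete, here is a minimal sketch of the callable interface that a wsgi.py file exposes and that servers like uWSGI or gunicorn host. This is a generic illustration, not Open edX's actual wsgi.py; it uses only the standard library so it can be exercised without a real server.

```python
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    """The WSGI callable: receives the request environ dict, calls
    start_response with the status and headers, and returns the body."""
    status = "200 OK"
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    start_response(status, headers)
    return [b"Hello from a WSGI app\n"]

# Exercise the callable directly, without a real server:
environ = {}
setup_testing_defaults(environ)  # fill in a plausible request environ
collected = {}
def start_response(status, headers):
    collected["status"] = status
    collected["headers"] = headers
body = b"".join(application(environ, start_response))
```

Any WSGI-compliant server (uWSGI, gunicorn, mod_wsgi, Waitress, ...) hosts exactly this kind of callable, which is why swapping servers is possible in the first place.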

Without going deeper into every option, the top candidates as alternatives to uWSGI are gunicorn, mod_wsgi, CherryPy, and Waitress.

Looking into Tutor, I found that gunicorn was used before uWSGI; the migration to uWSGI happened during the Koa upgrade (728ef96). I'm not sure whether we can still revert to gunicorn in the current state. Compared with uWSGI, gunicorn is reported to be slower (there is a comparison at https://emptyhammock.com/projects/info/pyweb/gunicorn.html, though it may not be entirely accurate).
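For reference, reverting would roughly mean replacing the uWSGI invocation with a gunicorn one. A minimal sketch (the module path and worker count are illustrative, not Tutor's actual entry point):

```
gunicorn --workers 4 --bind 0.0.0.0:8000 lms.wsgi:application
```

Note that gunicorn only speaks WSGI/HTTP; it does not serve static files efficiently on its own, which is part of the trade-off discussed below.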

mod_wsgi is mainly an Apache module. There were some references to an nginx mod_wsgi, but I was not able to find any recent information on it. CherryPy is both a web server and a minimalist web framework. Waitress, on the other hand, is a WSGI-only server that runs on CPython and PyPy on Unix (Python 3.7+).

I will be looking into each item and further explore any other alternatives as well.

References

  1. https://stackoverflow.com/questions/38601440/what-is-the-point-of-uwsgi
  2. https://www.ultravioletsoftware.com/single-post/2017/03/23/An-introduction-into-the-WSGI-ecosystem
  3. https://medium.com/django-deployment/which-wsgi-server-should-i-use-a70548da6a83
  4. Looks like this is now dead? unbit/uwsgi#2425
  5. 728ef96
  6. https://emptyhammock.com/projects/info/pyweb/not-uwsgi.html

@regisb
Contributor Author

regisb commented Feb 13, 2024

I'm thinking that a possible alternative would be to revert to gunicorn, but launch a separate container with a web server (such as caddy or nginx) to serve static assets. I hate the fact that we would have to launch a new container just for that (and for every other web app...), but it's the only solution that I'm seeing right now.
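A hypothetical sketch of that layout, as a docker-compose fragment: the app container runs gunicorn, and a Caddy container serves a shared static-assets volume. All service, image, and path names here are illustrative assumptions, not Tutor's actual configuration.

```
services:
  lms:
    image: openedx-lms            # assumption: app image running gunicorn
    volumes:
      - static_assets:/openedx/staticfiles
  static-server:
    image: caddy:2
    command: caddy file-server --root /srv/static --listen :8080
    volumes:
      - static_assets:/srv/static:ro
volumes:
  static_assets:
```

A single Caddy container could also serve several apps' assets by mounting one volume (or sub-directory) per service.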

@blarghmatey

An alternative that I'm looking at for our WSGI server uses is nginx-unit, but I haven't done any testing of that yet.

@ormsbee
Contributor

ormsbee commented Apr 28, 2024

Hi folks. I was pointed to this ticket after I posted a forums question about the performance implications of serving static assets with uWSGI.

@DawoudSheraz:

Compared with uWSGI, gunicorn is slow (some comparison is on https://emptyhammock.com/projects/info/pyweb/gunicorn.html, however it might not be exactly true).

I feel like throughput benchmarks aren't really that useful for us. Even the slowest thing there shows gunicorn with 4 worker processes giving a throughput of roughly 3200 requests per second, so ~800 req/s per worker, which means the overhead it's imposing for spawning a request and reading/writing 70K is something like 1.25 ms. I think that even our fastest web transactions run in the 30-50 ms range, with more common ones running 150+ ms, and a number of painful courseware calls running multiple seconds. Whether the WSGI server is imposing 1.25 ms of overhead or 0.4 ms of overhead won't really be noticeable in that context.
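The back-of-envelope arithmetic above, written out as a quick check (numbers taken from the benchmark figures quoted in this comment):

```python
# Per-worker throughput and implied per-request overhead from the quoted
# benchmark: ~3200 req/s across 4 gunicorn workers, ~70K payload each.
throughput_total = 3200            # requests/second, all workers combined
workers = 4
per_worker = throughput_total / workers   # requests/second per worker
overhead_ms = 1000 / per_worker           # milliseconds spent per request
print(per_worker, overhead_ms)            # 800.0 req/s, 1.25 ms
```

Against transactions that take 30 ms at best and often 150+ ms, roughly a millisecond of server overhead is in the noise.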

@regisb:

I hate the fact that we would have to launch a new container just for that (and for every other web app...), but it's the only solution that I'm seeing right now.

I'm still really ignorant about Docker things. Would the idea be that we'd have one static asset server container running Caddy, and that there's some shared volume where each Django service writes its static files to a different sub-directory? Or a volume per service, with the Caddy static assets container having an entry for each?

@blarghmatey:

An alternative that I'm looking at for our WSGI server uses is nginx-unit, but I haven't done any testing of that yet.

My main hesitation with this is that gunicorn is so ubiquitous for serving Python apps that a lot of tooling will accommodate it out of the box. For instance, New Relic would give you things like worker availability, utilization, restart events, etc. I imagine many other APMs do the same. Gunicorn is also going to be more familiar for Python developers, and it's easier to find solutions for common problems via StackOverflow and the like.

@DawoudSheraz DawoudSheraz removed their assignment May 3, 2024
@DawoudSheraz
Contributor

@ormsbee Hi, thanks for the context. I have not had time to continue this issue. There is some missing context for me that some experimentation (with gunicorn and uWSGI) should help fill in.

@ormsbee
Contributor

ormsbee commented Jun 5, 2024

I'll add another long-term alternative to uwsgi: granian. It's implemented in Rust using the hyper library. Its selling points are highly consistent response times (much smaller 99th-percentile deviations, because the networking I/O stack is on the Rust side) and the ability to have a single server do ASGI and WSGI (along with its own RSGI standard that it's trying to push).

I don't think it's appropriate for us at this time. It's a single developer effort, and it has no support for "restart the worker after X requests", which we unfortunately need because of memory leaks. I merely mention it as something to keep an eye on in the longer term.

@regisb
Contributor Author

regisb commented Jun 7, 2024

it has no support for "restart the worker after X requests", which we unfortunately need because of memory leaks.

Do you have an idea of what is causing these memory leaks?

Note: here's the upstream granian issue: emmett-framework/granian#34

@ormsbee
Contributor

ormsbee commented Jun 7, 2024

The last time someone investigated this, a lot of it owed to circular references in the XBlock runtimes and modulestore. I haven't looked into it since the major runtime simplification or old mongo removal PRs landed.

@regisb
Contributor Author

regisb commented Jun 13, 2024

For the record, the uWSGI configuration that currently ships with Tutor for the LMS/CMS containers does not make use of the "restart the worker after X requests" flag (max-requests). Tutor users are not complaining, so... I guess we don't need it?
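For anyone wanting to experiment, enabling worker recycling in a uWSGI ini file would look roughly like this (module path and values are illustrative, not Tutor's shipped configuration):

```
[uwsgi]
module = lms.wsgi
processes = 4
# restart each worker after this many requests, to bound leaked memory
max-requests = 1000
```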

@ormsbee
Contributor

ormsbee commented Jun 14, 2024

For the record, the uWSGI configuration that currently ships with Tutor for the LMS/CMS containers does not make use of the "restart the worker after X requests" flag (max-requests). Tutor users are not complaining, so... I guess we don't need it?

I hope that means the problem has gotten a lot better. I don't know what value it's set to for 2U these days, but we definitely had issues with it in the past and that's why there's a slot for it in the configuration repo. I suppose it's also possible that it's an issue with gunicorn specifically that's causing this...

In any case, FYI to @timmc-edx, @robrap, @dianakhuang in case this is of interest for how 2U deploys.

@robrap

robrap commented Jun 18, 2024

Thanks @ormsbee. I think we still have this set for edxapp in Stage, Prod and Edge. I don't think we use it for any of our other workers. However, it's likely to remain this way, given the old "if it ain't broke, don't fix it", along with all of our other priorities. We've had too many recent incidents from changes (mostly DD related) that seem like they shouldn't cause any issues, but surprise surprise.

@regisb
Contributor Author

regisb commented Sep 18, 2024

FYI this issue was featured on episode 401 of the Python Bytes podcast: https://www.youtube.com/watch?v=XKI5gtnKMus

@Faraz32123 Faraz32123 removed their assignment Sep 18, 2024
@CodeWithEmad
Member

CodeWithEmad commented Sep 23, 2024

What about uvicorn? I've used it in some FastAPI projects. It supports async and, apparently, it's fast!

@ormsbee
Contributor

ormsbee commented Sep 23, 2024

My main concern with uvicorn is that it doesn't support WSGI. Its native WSGI implementation is deprecated, and they point you to a2wsgi instead. Maybe that will work smoothly, but running WSGI apps doesn't seem to be an area of focus for them, and we're probably going to be running in that mode for a while to come.
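Concretely, the two routes for running a WSGI app under uvicorn would look something like this (module paths are illustrative assumptions, not Open edX's actual entry points):

```
# Deprecated built-in route, via uvicorn's own WSGI shim:
uvicorn --interface wsgi lms.wsgi:application

# Recommended route: wrap the WSGI callable with a2wsgi first,
# e.g. an asgi.py containing  application = WSGIMiddleware(wsgi_application)
uvicorn lms.asgi:application
```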
