async S3 HTTP with botornado #14
Conversation
Thanks for the patch! My only concern with merging this in is backwards compatibility of botornado with boto. Aside from installing botornado, is this a drop-in replacement for the existing plugin, or are there additional steps needed for installation? A good solution may be to have a config option allowing people to switch between the two.
I don't write much python, so I've barely used boto, and I'd never heard of botornado until I found it trying to fix this. I couldn't comment on backwards compatibility except that "it seems to work" :) I'm not sure about the config option: I think using a blocking client like boto in an async environment like tornado is a serious bug. I would suggest tagging the current version as v1.0.0 with a big warning in the README about it blocking, and then integrating botornado and releasing it as v2.0.0. That way people can deliberately upgrade, or stay on the blocking version if it's working for them and they're concerned about incompatibility.
I think this needs to be implemented. Could you merge this in?
I discovered that this wasn't loading AWS credentials from the environment in the expected way. So I'm not sure about this PR; it might need more work, but I'm not using it anymore.
@kartikluke If the original images are in a publicly readable S3 bucket, then the HTTP loader is perfect. But if the original images aren't publicly readable, then S3 credentials are required… as far as I know the HTTP loader doesn't know how to do that…?
Ah that's correct. Sorry, got confused.
* Hashed paths in S3 prevent searching keys by prefix, without any real advantage (S3 is not a real FS)
* Add vows-based storage tests
* Add .gitignore for *.pyc
Closing this; as I mentioned, I wrote https://github.com/99designs/thumbor_botornado, which we're using. Feel free to use the code from this PR or that one.
Summary: thumbor runs on tornado, which is async, but thumbor_aws uses boto for HTTP requests to S3, which is blocking, causing awful performance. This patch uses botornado, a partial async port of boto to tornado, to make the S3 requests non-blocking.
Test material at https://gist.github.com/pda/5dcf54410b696072fcbd
In production we have an ELB load balancer in front of two EC2 instances, each running thumbor + thumbor_aws inside a docker container. We had reports of blank images, and then hints that they were HTTP 503 errors, but it wasn't reproducible. Our logs show that the load balancer frequently marked the instances as unhealthy, meaning they had failed to respond to health checks. The thumbor processes had been up and running for many days, and had no suspicious log entries.
I suspected that something was blocking inside the thumbor event loop, and narrowed it down to boto in thumbor_aws.
Using a controllably slow backend behind thumbor_aws, I was able to demonstrate the single-concurrency of boto by showing slow requests blocking fast requests:
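The gist above contains the actual test material; for illustration, here is a minimal sketch (not the original harness) of what such a controllably slow backend could look like, using only the Python stdlib. The path encodes the delay in milliseconds, and the server sleeps for both HEAD and GET, matching the behaviour described below.

```python
# Hypothetical sketch of a "controllably slow backend": an HTTP server
# that sleeps a client-specified number of milliseconds before answering,
# for both HEAD and GET requests.
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class SlowHandler(BaseHTTPRequestHandler):
    def _respond(self):
        delay_ms = int(self.path.lstrip("/") or "0")
        time.sleep(delay_ms / 1000.0)  # simulate a slow origin
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        return b"ok"

    def do_HEAD(self):
        self._respond()  # headers only, no body

    def do_GET(self):
        self.wfile.write(self._respond())

    def log_message(self, *args):
        pass  # keep the example's output quiet

server = HTTPServer(("127.0.0.1", 0), SlowHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

start = time.monotonic()
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/100").read()
elapsed = time.monotonic() - start
server.shutdown()
print(f"GET /100 took about {elapsed * 1000:.0f}ms")
```

Pointing a loader at a backend like this makes the delay per request tunable, so slow and fast requests can be mixed deliberately.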
(The doubled times in the benchmarks are because the test server sleeps `n` milliseconds for both the HEAD and the GET request. Perhaps the HEAD request should be eliminated from thumbor_aws, but that's a separate issue.) Note that the responses are serialized into the order they arrived, and that the time is roughly cumulative: the final request, which should have taken 1,000ms, took 23,000ms.
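The cumulative times above follow from a blocking call stalling a single-threaded event loop. A small sketch of the same failure mode, using the stdlib asyncio loop (which shares tornado's single-threaded model; delays and names here are illustrative, not from the benchmark):

```python
# Demonstrates why a blocking client serializes an async event loop:
# two "requests" run back-to-back when each one blocks the loop, but
# overlap when each one yields while waiting.
import asyncio
import time

def blocking_fetch(delay):
    time.sleep(delay)  # stands in for boto's synchronous HTTP call

async def blocked(delay):
    blocking_fetch(delay)  # the whole event loop stalls in here

async def non_blocking(delay):
    await asyncio.sleep(delay)  # stands in for an async HTTP call

async def timed(coros):
    start = time.monotonic()
    await asyncio.gather(*coros)
    return time.monotonic() - start

# One slow (300ms) and one fast (150ms) request, issued concurrently.
serial = asyncio.run(timed([blocked(0.3), blocked(0.15)]))
concurrent = asyncio.run(timed([non_blocking(0.3), non_blocking(0.15)]))
print(f"blocking: {serial:.2f}s, async: {concurrent:.2f}s")
```

The blocking pair takes roughly the sum of the delays (~0.45s), while the async pair takes roughly the longest delay (~0.3s), which is the behaviour the benchmark times above show at larger scale.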
Switching `s3_loader.py` from `boto` to `botornado` fixes that: this change uses botornado to make async HTTP requests to S3 in a way that fits the tornado event loop model, without blocking other requests.
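The shape of that change can be sketched as follows. This is not the actual patch: the two client classes below are stand-ins I've invented for illustration, mimicking boto's synchronous `get_contents_as_string` and botornado's assumed callback-taking counterpart.

```python
# Illustrative contrast between a blocking and a callback-style loader.
# FakeBoto and FakeBotornado are hypothetical stand-ins, not real APIs.
import time

class FakeBoto:
    def get_contents_as_string(self):
        time.sleep(0.05)  # blocking network I/O, like boto
        return b"image-bytes"

class FakeBotornado:
    def get_contents_as_string(self, callback):
        # A real async client would register with the IOLoop and return
        # immediately; here we just invoke the callback directly.
        callback(b"image-bytes")

def load_blocking(key, callback):
    # The event loop stalls inside this call until S3 responds.
    callback(key.get_contents_as_string())

def load_async(key, callback):
    # Returns right away; the callback fires when the response arrives.
    key.get_contents_as_string(callback=callback)

results = []
load_blocking(FakeBoto(), results.append)
load_async(FakeBotornado(), results.append)
print(results)  # [b'image-bytes', b'image-bytes']
```

Both styles deliver the same bytes to the callback; the difference is only whether the event loop is free to serve other requests while waiting.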
I've only applied it to the `loaders` module, as that's the only component we're using. It should probably also be applied to the `storages` and `result_storages` modules. Perhaps we can ship this fix first, though?