Should object store retry on connection reset by peer? #5378
Comments
We probably could retry, but a connection reset normally means you are hitting rate limits and should reduce the amount of concurrent IO you are performing. There is a LimitStore that might achieve this if polars can't do it itself.
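To make the "reduce concurrent IO" suggestion concrete, here is a minimal stdlib-only sketch of capping in-flight requests with a counting semaphore. It is not the code of `LimitStore`; `RequestLimiter`, `Permit`, and `acquire` are hypothetical names used only for illustration.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical counting semaphore that caps the number of in-flight requests.
pub struct RequestLimiter {
    available: Mutex<usize>,
    freed: Condvar,
}

/// Permit handed out by `acquire`; dropping it frees the slot again.
pub struct Permit {
    limiter: Arc<RequestLimiter>,
}

impl RequestLimiter {
    pub fn new(max_requests: usize) -> Arc<Self> {
        Arc::new(Self {
            available: Mutex::new(max_requests),
            freed: Condvar::new(),
        })
    }

    /// Block the calling thread until a slot is free, then take it.
    pub fn acquire(self: &Arc<Self>) -> Permit {
        let mut available = self.available.lock().unwrap();
        while *available == 0 {
            available = self.freed.wait(available).unwrap();
        }
        *available -= 1;
        Permit {
            limiter: Arc::clone(self),
        }
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        // Return the slot and wake one waiting thread, if any.
        *self.limiter.available.lock().unwrap() += 1;
        self.limiter.freed.notify_one();
    }
}
```

A caller would hold a `Permit` for the duration of each request, so at most `max_requests` requests run at once regardless of how many threads are issuing them.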
Hmm, I've tried reducing concurrency in polars (to 32) and it doesn't seem to fix this, though I doubt I'm anywhere near S3's stated rate limit of 5,500 GET requests per second per partitioned prefix. If it's not incorrect behavior, do you think it makes sense to also build this condition into the retries done by object store, @tustvold?
Do you think it'd be alright to modify: to: Which would be much more aggressive wrt what states to retry on? Ideally the error Whilst this might be more aggressive than desired, I think users could accordingly adjust their retries/backoff config to compensate, what do you think?
Opened a PR #5383, feel free to close it if you think it's unreasonable.
The conclusion of pola-rs/polars#14598 appears to be that this was an upstream issue, so closing this. Feel free to reopen if I am mistaken.
I've been running into something like this - apparently Azure has a 30-second request duration limit, and closes the connection afterwards. I opened #6287 for this.
Which part is this question about
Object store
Describe your question
Should object store's retry logic also retry when the connection is reset by a peer?
Additional context
I'm querying an S3 bucket via polars (which uses object store under the hood) and I'm encountering this issue. It only happens when I'm querying many files (~10k).
This seems to be because this error isn't covered under object store's retry policy.
It'd be very handy if it were, though I'm not certain it'd be strictly correct behavior (i.e., should it be handled by the caller of object store instead, which in this case is polars?).
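For illustration, here is a minimal sketch of what caller-side handling could look like, assuming the failure surfaces as a `std::io::Error` whose kind is `ConnectionReset`. `RetryPolicy`, `is_retryable`, and `retry_io` are hypothetical names, not part of object store's API, and the real error type seen by a caller may differ.

```rust
use std::io::{self, ErrorKind};
use std::thread;
use std::time::Duration;

/// Hypothetical retry settings (illustrative, not object store's own config type).
pub struct RetryPolicy {
    pub max_retries: u32,
    pub initial_backoff: Duration,
}

/// Treat reset, aborted, or timed-out connections as transient.
/// Which kinds belong in this list is a judgment call for the caller.
pub fn is_retryable(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        ErrorKind::ConnectionReset | ErrorKind::ConnectionAborted | ErrorKind::TimedOut
    )
}

/// Run `op`, retrying transient failures with exponential backoff.
/// Non-retryable errors, or errors after `max_retries` retries, are returned as-is.
pub fn retry_io<T>(
    policy: &RetryPolicy,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut backoff = policy.initial_backoff;
    let mut retries = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if is_retryable(&err) && retries < policy.max_retries => {
                retries += 1;
                thread::sleep(backoff);
                // Double the wait before the next attempt.
                backoff = backoff.saturating_mul(2);
            }
            Err(err) => return Err(err),
        }
    }
}
```

Retrying like this is only safe for idempotent requests such as GETs. If the resets are caused by rate limiting, backoff alone may not be enough without also lowering concurrency.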