Concurrency Handling #11
Comments
Hi @StephanGeorg, thanks for opening this thread; I'm sure it will interest others too. TL;DR: let's find out. The extension simply wraps the httplib-cpp implementation provided by DuckDB, so everything about that library applies, whether or not the extension currently makes use of it. Right now the client does not attempt anything fancy (yet): it fires on invocation, and the library's default throttle logic applies. We can build a test that attempts this with incremental load to find out whether there are settings we need to involve to tweak limits and/or behavior.
From my own quick exploration: there is no concurrent execution in this extension as it is currently implemented. DuckDB itself may split a query across multiple threads, but this only applies to queries that run over many thousands of rows, which is probably not relevant for most use cases of this particular extension. I wrote a version of HTTPPostRequestFunction that splits requests across threads and confirmed in testing that this speeds up queries roughly in proportion to the thread count (~10x in my case). It is quick and dirty, so there may be a better, more performant way to do this.
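To make the idea concrete for anyone following along, here is a minimal sketch of splitting per-row requests across a bounded pool of threads. It is written in Python rather than the extension's C++, and it is not the extension's code: `fake_post`, `post_sequential`, `post_threaded`, `timed`, and the example URL are all hypothetical names, and the "request" is just a sleep standing in for network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_post(url, body, latency=0.05):
    """Hypothetical stand-in for one blocking HTTP POST (sleeps, no network)."""
    time.sleep(latency)
    return {"url": url, "status": 200, "body": body}


def post_sequential(url, bodies):
    """One request per row, one after another (no concurrency)."""
    return [fake_post(url, body) for body in bodies]


def post_threaded(url, bodies, max_workers=10):
    """Spread the rows over at most max_workers threads.

    Executor.map returns results in input order, so each result still
    lines up with the row that produced it.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda body: fake_post(url, body), bodies))


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    rows = [f"row-{i}" for i in range(20)]
    _, t_seq = timed(post_sequential, "http://example.invalid/api", rows)
    _, t_par = timed(post_threaded, "http://example.invalid/api", rows)
    print(f"sequential: {t_seq:.2f}s, threaded: {t_par:.2f}s")
```

Note that `max_workers` caps how many requests are in flight at once, so even with millions of rows this pattern would not fire everything at the same time. Whether a real server tolerates that many parallel connections is a separate question that only a load test can answer.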
I am trying to understand how simultaneous HTTP requests are handled within this project and was hoping for some guidance. Specifically, I am interested in knowing:
Additional Context:
I'm using a MACRO to do the requests and then apply it to all rows of a table.
If /path/to/input.csv had millions of rows, would all requests be fired at the same time?