perf: spawn sync parquet write on blocking runtime #2806
Closed
Description
In our service we have tried to work around the fact that the underlying Delta/PartitionWriter runs the synchronous ArrowWriter write inside an async method, which blocks a runtime thread. This PR lets you opt in to supplying a runtime to the DeltaWriter on which to spawn blocking tasks.
I intend to clean up some of the code by implementing some of the methods on the WriterState enum, and to write some tests, but I want to get initial feedback first.
Using this in our service with its own runtime has simplified our code and kept the main runtime free for I/O and incoming requests.
Related Issue(s)
Documentation