-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Library changes for Apache Arrow integration #16691
Library changes for Apache Arrow integration #16691
Conversation
Hello! |
❌ Gradle check result for c4d0735: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
59c0acd
to
fe262f2
Compare
❌ Gradle check result for fe262f2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Rishabh Maurya <[email protected]>
fe262f2
to
3295caf
Compare
❌ Gradle check result for 3295caf: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Rishabh Maurya <[email protected]>
Signed-off-by: Rishabh Maurya <[email protected]>
❕ Gradle check result for dca6d1f: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
libs/arrow-spi/src/main/java/org/opensearch/arrow/spi/StreamTicket.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Rishabh Maurya <[email protected]>
Signed-off-by: Rishabh Maurya <[email protected]>
cccfa05
to
49b701e
Compare
libs/arrow-spi/src/main/java/org/opensearch/arrow/spi/StreamTicket.java
Outdated
Show resolved
Hide resolved
libs/arrow-spi/src/main/java/org/opensearch/arrow/spi/StreamTicket.java
Outdated
Show resolved
Hide resolved
libs/arrow-spi/src/main/java/org/opensearch/arrow/spi/StreamTicketFactory.java
Outdated
Show resolved
Hide resolved
A few nits , but LGTM otherwise @rishabhmaurya ! I think the SPI is looking good to start with (certainly will be evolving but this is fine). |
Signed-off-by: Rishabh Maurya <[email protected]>
@reta I agree! StreamReader is the one which will evolve most, its very minimal right now. I appreciate your feedbacks in improving the APIs, looks a lot better now. I'm looking forward to your reviews for the next PR related to server bootstrap logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrross do you want to take a look? thanks!
@rishabhmaurya Just curious, how are you going to handle the Guava dependency that Arrow has? The code currently forbids the server module from taking a Guava dependency (presumably to prevent dependency conflicts with plugins that may require a different version?). Will you be able to isolate that dependency to this future |
@rishabhmaurya Should we backport to 2.x? |
* Library changes for arrow integration Signed-off-by: Rishabh Maurya <[email protected]> * Bump guava 32->33 Signed-off-by: Rishabh Maurya <[email protected]> * add support for onCancel and Cancellable for BatchedJob in lib:arrow module Signed-off-by: Rishabh Maurya <[email protected]> * address PR comments Signed-off-by: Rishabh Maurya <[email protected]> * Move StreamTicket to an interface Signed-off-by: Rishabh Maurya <[email protected]> * remove jackson dependencies Signed-off-by: Rishabh Maurya <[email protected]> * make sl4j runtime only Signed-off-by: Rishabh Maurya <[email protected]> * introduce factory for stream ticket Signed-off-by: Rishabh Maurya <[email protected]> * Address PR comments Signed-off-by: Rishabh Maurya <[email protected]> --------- Signed-off-by: Rishabh Maurya <[email protected]> (cherry picked from commit 6d3fd37) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@andrross yes, i might be wrong but its mostly arrow flight modules and not arrow vector and memory which needs guava dependencies and thus it will just be isolated to the
yes, just added the label. |
@rishabhmaurya I'm guessing Guava comes transitively via the grpc dependencies, which are probably isolated to the flight modules. |
* Library changes for arrow integration * Bump guava 32->33 * add support for onCancel and Cancellable for BatchedJob in lib:arrow module * address PR comments * Move StreamTicket to an interface * remove jackson dependencies * make sl4j runtime only * introduce factory for stream ticket * Address PR comments --------- (cherry picked from commit 6d3fd37) Signed-off-by: Rishabh Maurya <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
Library changes for Apache Arrow integration. Lib changes just exposes POJOs for creation of
StreamProducer
and registering them withStreamManager
.StreamProducer
in turn exposesBatchedJob
which are based on creation and filling Arrow Vectors in a batched manner handling client backpressure. So the arrow APIs exposed are kept minimal, limited to -server module will depend on
libs:arrow
to create and registerStreamProducer
for search results by populating vectors with a well defined schema for search results.Future PR will contain a module
modules:arrow-flight-rpc
, which will actually expose the Arrow Flight server, client and actual logic to create FlightStream. It will be a bulky modules in terms of all its direct and transitive dependencies.Related Issues
Resolves #16679
Check List
- [ ] API changes companion pull request created, if applicable.- [ ] Public documentation issue/PR created, if applicable.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.