Allow individual shards to be targeted during query execution [FEATURE] #1478
Comments
@akuzin1 Thanks for the feature request! This recalls an earlier request: opendistro-for-elasticsearch/sql#1151. @acarbonetto is adding OpenSearch meta field support to our SQL engine, and I'm wondering whether we can support this as part of the meta field work. |
Sounds good, that would be great to see. Is there an estimated timeline for when this feature would be released? |
This will be a dependency for #1441 |
related: #339 |
Hi! I saw that the label was changed to "Priority-High", which is great to see, so I wanted to check in and ask whether there is an estimated delivery date for this feature. Thank you. |
Target release is 2.10 at the moment. Hoping to get this in a little sooner. |
Proposal for setting
Goal
The objective of the partition/routing shard work is to include the routing ID in the SearchRequest builder. Example:
Getting the routing ID(s) from the initial query into pushdown can take one of two obvious routes.
Proposal 1: Request Parameter ("routing")
A syntax addition that includes a new JSON parameter, available only in the V2 engine.
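As a rough sketch of what Proposal 1 could look like from the client side, the snippet below builds a SQL request body carrying a `routing` field. The `routing` parameter is only the one proposed in this comment and is not an existing option; the helper names `parse_routing` and `build_sql_request` are hypothetical, and the comma-separated format is assumed from the description in this proposal.

```python
import json
from typing import List, Optional


def parse_routing(routing: str) -> List[str]:
    """Split a comma-separated routing string into trimmed, non-empty targets."""
    return [part.strip() for part in routing.split(",") if part.strip()]


def build_sql_request(query: str, routing: Optional[str] = None) -> str:
    """Build a JSON request body for a SQL query.

    "query" is the usual SQL request field; "routing" is the PROPOSED
    parameter from this thread (an assumption, not an existing option).
    """
    body = {"query": query}
    if routing is not None:
        targets = parse_routing(routing)
        if not targets:
            raise ValueError("routing must name at least one target")
        # Normalize to a compact comma-separated list.
        body["routing"] = ",".join(targets)
    return json.dumps(body)


if __name__ == "__main__":
    print(build_sql_request("SELECT * FROM my_index", routing="0, 2"))
```

Keeping the value as a single string mirrors the comma-separated form described here; a JSON array would be an equally plausible design.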
This includes an OpenSearch-like API for targeting individual shards or lists of shards. The string will accept a comma-separated list of shard ID targets.
PoC Architecture Change
Getting the request into OpenSearchRequest as part of pushdown requires that we create AbstractPlan, LogicalPlan, and PhysicalPlan operators, like the Paginate and LogicalPaginate operators. This would allow us to push a routingId string down into the OpenSearchIndexScanQueryBuilder during execution. We would create Partition operators in the same manner as the Pagination operators, without much business logic.
Reference for how Paginate works: https://github.com/opensearch-project/sql/blob/main/docs/dev/Pagination-v2.md#unresolved-query-plan
Considerations: we may consider combining Paginate and Partition into a single set of operators, and call them something like
Pros
Cons
Proposal 2: SQL
|
Will there be a new projected release target for this feature, @acarbonetto? I see it was targeted for 2.11 and 2.12 is being released soon. Can this make 2.13, perhaps? |
Is your feature request related to a problem?
In MySQL one can retrieve partition information about a table, which can later be used to target specific partitions during query execution.
The following is an example of a query that can be used to retrieve partition information of a specific table.
Next is an example of a query targeting a specific partition in a table.
SELECT * FROM table PARTITION (partitionName);
When applying this to OpenSearch, partitions could be treated as the equivalent of shards for our use case.
What solution would you like?
It would be great to be able to treat shards in OpenSearch as the equivalent of MySQL partitions and to query individual shards.
What alternatives have you considered?
We've considered generating splits by hashing a key or tuple of keys and then taking that hash modulo some fixed number of splits that we want to generate.
However, it doesn't seem like there is a hashing function like that available.
Therefore, the above-mentioned solution would be a great addition for all SQL users, as it would more closely mimic the behavior and syntax of MySQL.
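For illustration only, the hash-and-modulo split scheme described above can be sketched outside of SQL as follows. This is a client-side assumption, not something the SQL plugin provides: the function name `split_for_key`, the choice of SHA-256, and the field separator are all hypothetical, and the sketch makes no attempt to reproduce how OpenSearch itself assigns documents to shards.

```python
import hashlib
from typing import Sequence


def split_for_key(key_parts: Sequence[object], num_splits: int) -> int:
    """Map a key (or tuple of keys) to a split index in [0, num_splits).

    Uses a stable hash (SHA-256) so the same key always lands in the same
    split across processes, unlike Python's built-in hash() for strings.
    """
    if num_splits <= 0:
        raise ValueError("num_splits must be positive")
    # Join the parts with a separator unlikely to appear in the data.
    joined = "\x1f".join(str(part) for part in key_parts)
    digest = hashlib.sha256(joined.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then take the modulo.
    return int.from_bytes(digest[:8], "big") % num_splits


if __name__ == "__main__":
    for key in [("user-1", 2023), ("user-2", 2023), ("user-3", 2024)]:
        print(key, "->", split_for_key(key, 4))
```

Doing the same thing inside a query would need an equivalent hash function in the SQL engine, which is the gap the issue describes.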