Support Filtering on Large List encoded by Bitmap #14774

Merged
merged 35 commits on Aug 20, 2024
Changes from 34 commits
Commits
29b85ea
experiment
bowenlan-amzn Jul 2, 2024
e4e42b3
Merge branch 'main' into 12341-bitmap-filtering
bowenlan-amzn Jul 10, 2024
839d7c9
Merge branch 'main' into 12341-bitmap-filtering
bowenlan-amzn Jul 15, 2024
63f1cd4
sketchy implementation
bowenlan-amzn Jul 16, 2024
7b360b6
support value_type param in terms query
bowenlan-amzn Jul 19, 2024
03ff662
support new parameter to fetch stored field in terms lookup
bowenlan-amzn Jul 20, 2024
deeb3ee
example yaml test
bowenlan-amzn Jul 20, 2024
b388d59
updateSHAs and License
bowenlan-amzn Jul 20, 2024
577e6d0
bwc in transport stream
bowenlan-amzn Jul 29, 2024
2e67647
more rest tests
bowenlan-amzn Jul 30, 2024
04789e5
Merge branch 'main' into 12341-bitmap-filtering
bowenlan-amzn Jul 30, 2024
3cd735e
more rest tests
bowenlan-amzn Jul 30, 2024
b9bf2d4
small fix
bowenlan-amzn Jul 30, 2024
1de60d0
support index type bitmap filter
bowenlan-amzn Jul 31, 2024
ff524fb
fill in BitMapFilterQuery
bowenlan-amzn Jul 31, 2024
cebba71
skip version before 2.17 in yaml test
bowenlan-amzn Jul 31, 2024
b9280be
self review
bowenlan-amzn Aug 1, 2024
2dc4509
add default value type
bowenlan-amzn Aug 1, 2024
0eef2fb
Merge branch 'main' into 12341-bitmap-filtering
bowenlan-amzn Aug 1, 2024
a82a896
I really like unit test
bowenlan-amzn Aug 1, 2024
b18b44e
remove null from mock get result of field
bowenlan-amzn Aug 2, 2024
51e8b08
wrap up for 2.17
bowenlan-amzn Aug 2, 2024
eaab235
changelog
bowenlan-amzn Aug 2, 2024
c039da4
bwc
bowenlan-amzn Aug 2, 2024
dd30785
increase coverage
bowenlan-amzn Aug 2, 2024
7028835
increase coverage
bowenlan-amzn Aug 3, 2024
fd4b1b6
Merge branch 'main' into 12341-bitmap-filtering
bowenlan-amzn Aug 3, 2024
6e56297
possible performance improvements
bowenlan-amzn Aug 5, 2024
28654ee
improve
bowenlan-amzn Aug 6, 2024
7700b75
Merge branch 'main' into 12341-bitmap-filtering-push
bowenlan-amzn Aug 6, 2024
7704005
Merge branch 'main' into 12341-bitmap-filtering
bowenlan-amzn Aug 6, 2024
3bce711
Merge branch '12341-bitmap-filtering-push' into 12341-bitmap-filtering
bowenlan-amzn Aug 6, 2024
8e1feb0
Merge branch 'main' into 12341-bitmap-filtering
bowenlan-amzn Aug 8, 2024
c51dfc8
handle empty bitmap rewrite scenarios
bowenlan-amzn Aug 8, 2024
9c1c039
Merge branch 'main' into 12341-bitmap-filtering
msfroh Aug 16, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -16,6 +16,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Add `rangeQuery` and `regexpQuery` for `constant_keyword` field type ([#14711](https://github.com/opensearch-project/OpenSearch/pull/14711))
- Add took time to request nodes stats ([#15054](https://github.com/opensearch-project/OpenSearch/pull/15054))
- [Workload Management] QueryGroup resource tracking framework changes ([#13897](https://github.com/opensearch-project/OpenSearch/pull/13897))
- Support filtering on a large list encoded by bitmap ([#14774](https://github.com/opensearch-project/OpenSearch/pull/14774))

### Dependencies
- Bump `netty` from 4.1.111.Final to 4.1.112.Final ([#15081](https://github.com/opensearch-project/OpenSearch/pull/15081))
@@ -0,0 +1,184 @@
---
setup:
  - skip:
      version: " - 2.99.99"
      reason: The bitmap filtering feature is available in 2.17 and later.
  - do:
      indices.create:
        index: students
        body:
          settings:
            number_of_shards: 1
            number_of_replicas: 0
          mappings:
            properties:
              student_id:
                type: integer
  - do:
      bulk:
        refresh: true
        body:
          - { "index": { "_index": "students", "_id": "1" } }
          - { "name": "Jane Doe", "student_id": 111 }
          - { "index": { "_index": "students", "_id": "2" } }
          - { "name": "Mary Major", "student_id": 222 }
          - { "index": { "_index": "students", "_id": "3" } }
          - { "name": "John Doe", "student_id": 333 }
  - do:
      indices.create:
        index: classes
        body:
          settings:
            number_of_shards: 1
            number_of_replicas: 0
          mappings:
            properties:
              enrolled:
                type: binary
                store: true
  - do:
      bulk:
        refresh: true
        body:
          - { "index": { "_index": "classes", "_id": "101" } }
          - { "enrolled": "OjAAAAEAAAAAAAEAEAAAAG8A3gA=" } # 111,222
          - { "index": { "_index": "classes", "_id": "102" } }
          - { "enrolled": "OjAAAAEAAAAAAAAAEAAAAG8A" } # 111
          - { "index": { "_index": "classes", "_id": "103" } }
          - { "enrolled": "OjAAAAEAAAAAAAAAEAAAAE0B" } # 333
          - { "index": { "_index": "classes", "_id": "104" } }
          - { "enrolled": "OjAAAAEAAAAAAAEAEAAAAN4ATQE=" } # 222,333
  - do:
      cluster.health:
        wait_for_status: green

---
"Terms lookup on a binary field with bitmap":
  - do:
      search:
        rest_total_hits_as_int: true
        index: students
        body: {
          "query": {
            "terms": {
              "student_id": {
                "index": "classes",
                "id": "101",
                "path": "enrolled",
                "store": true
              },
              "value_type": "bitmap"
            }
          }
        }
  - match: { hits.total: 2 }
  - match: { hits.hits.0._source.name: Jane Doe }
  - match: { hits.hits.0._source.student_id: 111 }
  - match: { hits.hits.1._source.name: Mary Major }
  - match: { hits.hits.1._source.student_id: 222 }

---
"Terms query accepting bitmap as value":
  - do:
      search:
        rest_total_hits_as_int: true
        index: students
        body: {
          "query": {
            "terms": {
              "student_id": ["OjAAAAEAAAAAAAEAEAAAAG8A3gA="],
              "value_type": "bitmap"
            }
          }
        }
  - match: { hits.total: 2 }
  - match: { hits.hits.0._source.name: Jane Doe }
  - match: { hits.hits.0._source.student_id: 111 }
  - match: { hits.hits.1._source.name: Mary Major }
  - match: { hits.hits.1._source.student_id: 222 }

---
"Boolean must bitmap filtering":
  - do:
      search:
        rest_total_hits_as_int: true
        index: students
        body: {
          "query": {
            "bool": {
              "must": [
                {
                  "terms": {
                    "student_id": {
                      "index": "classes",
                      "id": "101",
                      "path": "enrolled",
                      "store": true
                    },
                    "value_type": "bitmap"
                  }
                }
              ],
              "must_not": [
                {
                  "terms": {
                    "student_id": {
                      "index": "classes",
                      "id": "104",
                      "path": "enrolled",
                      "store": true
                    },
                    "value_type": "bitmap"
                  }
                }
              ]
            }
          }
        }
  - match: { hits.total: 1 }
  - match: { hits.hits.0._source.name: Jane Doe }
  - match: { hits.hits.0._source.student_id: 111 }

---
"Boolean should bitmap filtering":
  - do:
      search:
        rest_total_hits_as_int: true
        index: students
        body: {
          "query": {
            "bool": {
              "should": [
                {
                  "terms": {
                    "student_id": {
                      "index": "classes",
                      "id": "101",
                      "path": "enrolled",
                      "store": true
                    },
                    "value_type": "bitmap"
                  }
                },
                {
                  "terms": {
                    "student_id": {
                      "index": "classes",
                      "id": "104",
                      "path": "enrolled",
                      "store": true
                    },
                    "value_type": "bitmap"
                  }
                }
              ]
            }
          }
        }
  - match: { hits.total: 3 }
  - match: { hits.hits.0._source.name: Mary Major }
  - match: { hits.hits.0._source.student_id: 222 }
  - match: { hits.hits.1._source.name: Jane Doe }
  - match: { hits.hits.1._source.student_id: 111 }
  - match: { hits.hits.2._source.name: John Doe }
  - match: { hits.hits.2._source.student_id: 333 }
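
The `# 111,222` style comments on the `enrolled` values in the YAML test above can be checked by hand. What follows is a minimal sketch, not code from this PR: it assumes the payloads use the portable Roaring serialization without run containers (cookie 12346) and only sorted-array containers, which holds for the four small test values. The class and method names are hypothetical and only the Java standard library is used. The `main` method also replays the two boolean tests as plain set operations: `must` 101 with `must_not` 104 is a set difference, and two `should` clauses form a union.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;
import java.util.TreeSet;

// Hypothetical helper, not part of the PR: reads only the simplest portable Roaring layout.
final class RoaringPayloadDecoder {
    // Cookie of the portable format when the bitmap has no run containers.
    static final int SERIAL_COOKIE_NO_RUNCONTAINER = 12346;
    // Containers holding more values than this are stored as bitsets, not sorted arrays.
    static final int MAX_ARRAY_CARDINALITY = 4096;

    static TreeSet<Integer> decode(String base64) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getDecoder().decode(base64))
            .order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt() != SERIAL_COOKIE_NO_RUNCONTAINER) {
            throw new IllegalArgumentException("only the no-run-container layout is handled");
        }
        int containers = buf.getInt();
        int[] keys = new int[containers];
        int[] cardinalities = new int[containers];
        for (int i = 0; i < containers; i++) {
            keys[i] = buf.getShort() & 0xFFFF;                // high 16 bits of every value
            cardinalities[i] = (buf.getShort() & 0xFFFF) + 1; // stored as cardinality - 1
        }
        buf.position(buf.position() + 4 * containers);        // skip the per-container offsets
        TreeSet<Integer> values = new TreeSet<>();
        for (int i = 0; i < containers; i++) {
            if (cardinalities[i] > MAX_ARRAY_CARDINALITY) {
                throw new IllegalArgumentException("bitset containers are not handled");
            }
            for (int j = 0; j < cardinalities[i]; j++) {
                // Each array entry is the low 16 bits of one value, little-endian.
                values.add((keys[i] << 16) | (buf.getShort() & 0xFFFF));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        TreeSet<Integer> class101 = decode("OjAAAAEAAAAAAAEAEAAAAG8A3gA=");
        TreeSet<Integer> class104 = decode("OjAAAAEAAAAAAAEAEAAAAN4ATQE=");
        System.out.println(class101); // [111, 222]
        System.out.println(class104); // [222, 333]

        // "Boolean must bitmap filtering": in 101 but not in 104.
        TreeSet<Integer> mustAndMustNot = new TreeSet<>(class101);
        mustAndMustNot.removeAll(class104);
        System.out.println(mustAndMustNot); // [111]

        // "Boolean should bitmap filtering": in 101 or in 104.
        TreeSet<Integer> should = new TreeSet<>(class101);
        should.addAll(class104);
        System.out.println(should); // [111, 222, 333]
    }
}
```

The set sizes 1 and 3 agree with the `hits.total` assertions in the two boolean tests. Larger bitmaps, or ones containing run or bitset containers, would need a full Roaring implementation rather than this sketch.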
3 changes: 3 additions & 0 deletions server/build.gradle
@@ -125,6 +125,9 @@ dependencies {
    api "com.google.protobuf:protobuf-java:${versions.protobuf}"
    api "jakarta.annotation:jakarta.annotation-api:${versions.jakarta_annotation}"

    // https://mvnrepository.com/artifact/org.roaringbitmap/RoaringBitmap
    implementation 'org.roaringbitmap:RoaringBitmap:1.1.0'

    testImplementation(project(":test:framework")) {
        // tests use the locally compiled version of server
        exclude group: 'org.opensearch', module: 'server'
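
A client sending a `value_type: bitmap` terms query has to produce the base64 payload itself, and in practice it would presumably let a Roaring library serialize the bitmap. To show what those bytes contain, here is a minimal standard-library-only sketch, not the RoaringBitmap API and not code from this PR. It assumes the portable Roaring layout without run containers and handles a single array container only (1 to 4096 strictly increasing IDs below 65536). The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper, not part of the PR and not the RoaringBitmap API.
final class SmallRoaringEncoder {
    static String encode(int... sortedIds) {
        int n = sortedIds.length;
        if (n == 0 || n > 4096) {
            throw new IllegalArgumentException("sketch handles 1..4096 values only");
        }
        for (int i = 0; i < n; i++) {
            boolean inRange = sortedIds[i] >= 0 && sortedIds[i] <= 0xFFFF;
            boolean increasing = i == 0 || sortedIds[i] > sortedIds[i - 1];
            if (!inRange || !increasing) {
                throw new IllegalArgumentException("ids must be strictly increasing and below 65536");
            }
        }
        // 16-byte header followed by one little-endian 16-bit entry per value.
        ByteBuffer buf = ByteBuffer.allocate(16 + 2 * n).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(12346);              // cookie: portable format, no run containers
        buf.putInt(1);                  // number of containers
        buf.putShort((short) 0);        // container key: high 16 bits shared by all values
        buf.putShort((short) (n - 1));  // cardinality minus one
        buf.putInt(16);                 // byte offset where the container data starts
        for (int id : sortedIds) {
            buf.putShort((short) id);   // low 16 bits of each value
        }
        return Base64.getEncoder().encodeToString(buf.array());
    }

    public static void main(String[] args) {
        System.out.println(encode(111, 222)); // OjAAAAEAAAAAAAEAEAAAAG8A3gA=
        System.out.println(encode(333));      // OjAAAAEAAAAAAAAAEAAAAE0B
    }
}
```

The two printed strings are the `enrolled` values of classes 101 and 103 in the YAML test, which suggests the test fixtures were produced with this serialization. Anything beyond these small cases should go through a real Roaring implementation.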
1 change: 1 addition & 0 deletions server/licenses/RoaringBitmap-1.1.0.jar.sha1
@@ -0,0 +1 @@
9607213861158ae7060234d93ee9c9cb19f494d1