-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable merge on refresh and merge on commit on Opensearch #2535
Conversation
❌ Gradle Check failure a8c3de1084d485807d9a91cecbac567c3261006f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo in the new settings constant ...FLASH_MERGE
.
So I'm sure this MergePolicy will be promoted to lucene core but for now it's still sandboxed. Which means there are no bwc guarantees so we might consider feature flagging until it's a first class citizen?
/cc @mikemccand @msokolov any input on how much this policy might change that would give caution promoting it in OpenSearch as a first class citizen for 2.0?
@nknize that is correct, it is sandboxed, I don't think we have a dedicated |
✅ Gradle Check success 37fb5b1ddd1b6d16c09aa2680d01ceee3fe65937 |
✅ Gradle Check success 7b0725cd18ffdb72129755fe935a9ce0d7c17439 |
The only planned change I'm aware of is to replace the tunable ( |
Thanks @msokolov ! We consider that to be an expert setting, exposing boolean instead would also be an issue - how to pick the "right" value for all possible index configurations? At least, with time value, there is no magic, the users could experiment. What do you think? Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least, with time value, there is no magic, the users could experiment.
I think we should split the difference. Not many users will experiment with a setting like this; and it's only enabled if a time value is set. Which means most users won't use it (and reap the performance gains) unless we keep it simple. So (as much as I don't like increasing API surface area) let's do the following?:
- change
index.max_full_flush_merge
toindex.merge_on_flush.max_full_flush_merge_wait_time
(or something shorter?) - add a new boolean setting
index.merge_on_flush.enabled
Users can enable and reap the performance gains simply by setting index.merge_on_flush.enabled : true
and the index.merge_on_flush.max_full_flush_merge_wait_time
setting defaults to? 10s
? 5s
? (benchmark or test value?)
Users that want to experiment with index.merge_on_flush.max_full_flush_merge_wait_time
can do so but we should probably document the tradeoffs.
Suggestions added inline
*/ | ||
public static final Setting<TimeValue> INDEX_MAX_FULL_FLUSH_MERGE = Setting.timeSetting( | ||
"index.max_full_flush_merge", | ||
new TimeValue(0, TimeUnit.MILLISECONDS), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You benchmarked with 10s
; should this be a reasonable default? Or 5s
like the tests? /cc @msokolov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nknize the only heuristic which comes to mind is that it should not be larger than index.refresh_interval
: may be we could cap it to be max(half of the refresh interval, 10s) by default (if not specified), wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a reasonable default. Let's add that to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 will do that shortly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original idea was to avoid creating lots of tiny segments while not interfering with the normal merge policy merges or other operations. The idea of the timeout was that we didn't want to block other operations for too long, since this merge-on-commit or merge-on-refresh is blocking the corresponding operation from completing. I don't know what refresh interval typically is set to - 1 minute maybe? That would lead to default of 10s. We've tested this with 30, and it works ok for our indexing workload, but YMMV. 10s seems safe to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ Gradle Check success 56350d992dadbaee3b49864baa8c6fe8ec810608 |
Signed-off-by: Andriy Redko <[email protected]>
Signed-off-by: Andriy Redko <[email protected]>
… enable commit-on-flush Signed-off-by: Andriy Redko <[email protected]>
…erty Signed-off-by: Andriy Redko <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!! LGTM!
Enables merge on refresh and merge on commit in Opensearch by way of two new index options: index.merge_on_flush.max_full_flush_merge_wait_time and index.merge_on_flush.enabled. Default merge_on_flush is disabled and wait time is 10s. Signed-off-by: Andriy Redko <[email protected]> (cherry picked from commit 908682d)
) Enables merge on refresh and merge on commit in OpenSearch by way of two new index options: index.merge_on_flush.max_full_flush_merge_wait_time and index.merge_on_flush.enabled. Default merge_on_flush is disabled and wait time is 10s. Signed-off-by: Andriy Redko <[email protected]> (cherry picked from commit 908682d)
Signed-off-by: Andriy Redko [email protected]
Description
Enable merge on refresh and merge on commit on Opensearch. Luckily, Apache Lucene 9.1 introduces
MergeOnFlushMergePolicy
which we could use from now on, activated byindex.max_full_flush_merge
(expert) index setting. The experiments I have done using [1] andpmc
track shows significant reduction of the segments for a price of more merges:Baseline: 2.0.0-SNAPSHOT
Contender: 2.0.0-SNAPSHOT + "index.max_full_flush_merge": "10s"
[1] https://github.com/opensearch-project/opensearch-benchmark
Issues Resolved
Closes #1345
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.