Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable merge on refresh and merge on commit on Opensearch #2535

Merged
merged 4 commits into from
Mar 25, 2022

Conversation

reta
Copy link
Collaborator

@reta reta commented Mar 21, 2022

Signed-off-by: Andriy Redko [email protected]

Description

Enable merge on refresh and merge on commit on Opensearch. Luckily, Apache Lucene 9.1 introduces MergeOnFlushMergePolicy which we could use from now on, activated by index.max_full_flush_merge (expert) index setting. The experiments I have done using [1] and pmc track shows significant reduction of the segments for a price of more merges:

./rally race    --track pmc      --target-hosts localhost:9200     --pipeline benchmark-only     --kill-running-processes --exclude-tasks="force-merge,refresh-after-force-merge,wait-until-merges-finish"

Baseline: 2.0.0-SNAPSHOT
Contender: 2.0.0-SNAPSHOT + "index.max_full_flush_merge": "10s"

Metric Task Baseline Contender Diff Unit Diff %
...
Cumulative merge time of primary shards 33.7391 63.6844 29.9452 min +88.76%
Cumulative merge count of primary shards 58 180 122 +210.34%
...
Segment count 133 43 -90 -67.67%
...
Metric Task Baseline Contender Diff Unit Diff %
...
Cumulative merge time of primary shards 37.8788 61.7903 23.9114 min +63.13%
Cumulative merge count of primary shards 65 175 110 +169.23%
...
Segment count 129 42 -87 -67.44%
...

[1] https://github.com/opensearch-project/opensearch-benchmark

Issues Resolved

Closes #1345

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@reta reta requested a review from a team as a code owner March 21, 2022 18:51
@opensearch-ci-bot
Copy link
Collaborator

❌   Gradle Check failure a8c3de1084d485807d9a91cecbac567c3261006f
Log 3597

Reports 3597

@reta
Copy link
Collaborator Author

reta commented Mar 21, 2022

x Gradle Check failure a8c3de1 Log 3597

Reports 3597

#1561 :(

Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in the new settings constant ...FLASH_MERGE.

So I'm sure this MergePolicy will be promoted to lucene core but for now it's still sandboxed. Which means there are no bwc guarantees so we might consider feature flagging until it's a first class citizen?

/cc @mikemccand @msokolov any input on how much this policy might change that would give caution promoting it in OpenSearch as a first class citizen for 2.0?

@reta
Copy link
Collaborator Author

reta commented Mar 21, 2022

@nknize that is correct, it is sandboxed, I don't think we have a dedicated Property.Sandboxed scope for settings but we could introduce it to unlock experimental / sandboxed features.

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 37fb5b1ddd1b6d16c09aa2680d01ceee3fe65937
Log 3609

Reports 3609

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 7b0725cd18ffdb72129755fe935a9ce0d7c17439
Log 3616

Reports 3616

@msokolov
Copy link

The only planned change I'm aware of is to replace the tunable ( smallSegmentThresholdMB) and instead to derive that from pre-existing configs: https://issues.apache.org/jira/browse/LUCENE-10078. This is a newish addition, so it's possible there could be other changes, but so long as there isn't a lot of user-configuration provided, I don't think that should block this PR. I would be a little leery of exposing the time parameter, as opposed to a single boolean enable/disable parameter. Are users really going to know how to pick a good value for this?

@reta
Copy link
Collaborator Author

reta commented Mar 21, 2022

Are users really going to know how to pick a good value for this?

Thanks @msokolov ! We consider that to be an expert setting, exposing boolean instead would also be an issue - how to pick the "right" value for all possible index configurations? At least, with time value, there is no magic, the users could experiment. What do you think? Thank you!

@nknize nknize self-requested a review March 23, 2022 12:45
Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least, with time value, there is no magic, the users could experiment.

I think we should split the difference. Not many users will experiment with a setting like this; and it's only enabled if a time value is set. Which means most users won't use it (and reap the performance gains) unless we keep it simple. So (as much as I don't like increasing API surface area) let's do the following?:

  1. change index.max_full_flush_merge to index.merge_on_flush.max_full_flush_merge_wait_time (or something shorter?)
  2. add a new boolean setting index.merge_on_flush.enabled

Users can enable and reap the performance gains simply by setting index.merge_on_flush.enabled : true and the index.merge_on_flush.max_full_flush_merge_wait_time setting defaults to? 10s? 5s? (benchmark or test value?)
Users that want to experiment with index.merge_on_flush.max_full_flush_merge_wait_time can do so but we should probably document the tradeoffs.

Suggestions added inline

*/
public static final Setting<TimeValue> INDEX_MAX_FULL_FLUSH_MERGE = Setting.timeSetting(
"index.max_full_flush_merge",
new TimeValue(0, TimeUnit.MILLISECONDS),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You benchmarked with 10s; should this be a reasonable default? Or 5s like the tests? /cc @msokolov

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nknize the only heuristic which comes to mind is that it should not be larger than index.refresh_interval: may be we could cap it to be max(half of the refresh interval, 10s) by default (if not specified), wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a reasonable default. Let's add that to this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 will do that shortly

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original idea was to avoid creating lots of tiny segments while not interfering with the normal merge policy merges or other operations. The idea of the timeout was that we didn't want to block other operations for too long, since this merge-on-commit or merge-on-refresh is blocking the corresponding operation from completing. I don't know what refresh interval typically is set to - 1 minute maybe? That would lead to default of 10s. We've tested this with 30, and it works ok for our indexing workload, but YMMV. 10s seems safe to me

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks a lot @msokolov , 10s it be @nknize ?

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 56350d992dadbaee3b49864baa8c6fe8ec810608
Log 3698

Reports 3698

@opensearch-ci-bot
Copy link
Collaborator

✅   Gradle Check success 1058c75
Log 3757

Reports 3757

@nknize nknize self-requested a review March 25, 2022 18:41
@nknize nknize added enhancement Enhancement or improvement to existing feature or request Indexing & Search v3.0.0 Issues and PRs related to version 3.0.0 labels Mar 25, 2022
Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!! LGTM!

@nknize nknize merged commit 908682d into opensearch-project:main Mar 25, 2022
@nknize nknize added the backport 2.x Backport to 2.x branch label Mar 25, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 25, 2022
Enables merge on refresh and merge on commit in Opensearch by
way of two new index options:
index.merge_on_flush.max_full_flush_merge_wait_time and
index.merge_on_flush.enabled. Default merge_on_flush is disabled and
wait time is 10s.

Signed-off-by: Andriy Redko <[email protected]>
(cherry picked from commit 908682d)
nknize pushed a commit that referenced this pull request Mar 28, 2022
)

Enables merge on refresh and merge on commit in OpenSearch by
way of two new index options:
index.merge_on_flush.max_full_flush_merge_wait_time and
index.merge_on_flush.enabled. Default merge_on_flush is disabled and
wait time is 10s.

Signed-off-by: Andriy Redko <[email protected]>
(cherry picked from commit 908682d)
@reta reta added the v2.1.0 Issues and PRs related to version 2.1.0 label May 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Indexing & Search v2.1.0 Issues and PRs related to version 2.1.0 v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable merge on refresh and merge on commit on Opensearch
5 participants