
[Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap #14964

Draft: wants to merge 5 commits into base: main

@nipunbatra8 commented Jul 17, 2025

Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap

This draft PR introduces a prototype BandwidthCappedMergeScheduler, which extends ConcurrentMergeScheduler to enforce a single bandwidth cap across all active merges within an IndexWriter. It is inspired by the discussion in lucene#14148 and by feedback from @mikemccand and others. The goal is a simple, global cap on merge IO, which is especially useful during "update storms" or "war time" scenarios where aggressive merging can cause page faults.

Implementation

  • Introduces BandwidthCappedMergeScheduler in org.apache.lucene.index
  • Extends ConcurrentMergeScheduler to reuse merge management and threading
  • Implements a global bandwidth cap by dividing a configurable MB/s limit among all active merges
  • Overrides updateMergeThreads() to dynamically adjust per-merge IO rates
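
The even split described in the list above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class EvenBandwidthSplit and its method are hypothetical names, and the sketch deliberately avoids any Lucene types so it runs on its own.

```java
/** Hypothetical helper (not part of Lucene): splits one global cap evenly across active merges. */
final class EvenBandwidthSplit {
  private EvenBandwidthSplit() {}

  /**
   * Returns the MB/s each active merge may use so that the sum of all
   * per-merge rates does not exceed globalCapMBPerSec.
   */
  static double perMergeMBPerSec(double globalCapMBPerSec, int activeMerges) {
    if (globalCapMBPerSec <= 0) {
      throw new IllegalArgumentException("cap must be positive");
    }
    if (activeMerges < 0) {
      throw new IllegalArgumentException("activeMerges must be >= 0");
    }
    if (activeMerges == 0) {
      // Nothing is running, so there is nothing to divide; the next merge gets the full cap.
      return globalCapMBPerSec;
    }
    return globalCapMBPerSec / activeMerges;
  }
}
```

In the scheduler itself, a value like this would presumably be recomputed inside updateMergeThreads() whenever a merge starts or finishes and then applied to each merge thread's rate limiter. That wiring is an assumption about the design, not something shown here.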

Initially, I experimented with skipping merges in the scheduler (by aborting or refusing to run merges that would exceed the bandwidth cap). However, this approach proved problematic:

  • Resource Leaks: By the time the scheduler can skip a merge, resources/files may already be allocated, leading to leaks or inconsistent state.
  • Merge Policy Coordination: Skipping merges late in the scheduler can result in better merges being blocked or waiting indefinitely, and it’s too late to make an informed decision without communicating with the IndexWriter or the MergePolicy.
  • Complexity: Trying to abort merges at the scheduler level introduces complexity and potential for subtle bugs, especially around resource cleanup and merge queue management.

How Does This Differ from the CMS IO Rate Limiter?

  • ConcurrentMergeScheduler (CMS): Applies adaptive IO throttling to each merge thread individually, so total bandwidth may exceed a set limit when multiple merges run.
  • BandwidthCappedMergeScheduler: Divides a single global bandwidth cap among all active merges, so that their combined IO rate is intended to stay within the configured maximum.

Should We Integrate with CMS IO Rate Limiter?

One open question: should we integrate CMS's adaptive IO rate logic into this scheduler as well, or switch to the global cap only during "war time"? For example:

  • Use CMS’s adaptive per-merge throttling during "peace time" (normal operation).
  • Switch to the global bandwidth cap when the system detects "war time" (e.g., max merges/threads reached, or backlog detected).
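
To make the peace-time/war-time idea concrete, here is a small sketch of how the mode decision might look. Everything in it is hypothetical: the class, the enum, and the backlogThreshold parameter are illustrative names, and the detection rule simply mirrors the two example signals mentioned above (merge threads saturated, or a backlog of pending merges).

```java
/** Hypothetical sketch (not part of Lucene): picks a throttling mode from simple load signals. */
final class ThrottleModeSketch {
  enum Mode { ADAPTIVE_PER_MERGE, GLOBAL_CAP }

  private ThrottleModeSketch() {}

  /**
   * "War time" is assumed when all merge threads are busy or when the number
   * of pending merges exceeds backlogThreshold; otherwise "peace time".
   */
  static Mode chooseMode(
      int runningMerges, int maxMergeThreads, int pendingMerges, int backlogThreshold) {
    boolean threadsSaturated = runningMerges >= maxMergeThreads;
    boolean backlogDetected = pendingMerges > backlogThreshold;
    return (threadsSaturated || backlogDetected) ? Mode.GLOBAL_CAP : Mode.ADAPTIVE_PER_MERGE;
  }
}
```

A real implementation would likely need some hysteresis so the scheduler does not flip between modes on every merge start or finish; that is left out here to keep the sketch short.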

Future Improvements

For now, the implementation is intentionally simple. However, there are several ways to make it more efficient and fair:

  1. Sort merges by size: Prioritize smaller merges or allocate bandwidth proportionally to merge size.
  2. Priority-based throttling: Give more bandwidth to urgent merges (e.g., forceMerge, high delete reclaim).
  3. Dynamic feedback: Adjust the cap or per-merge rates based on observed system load and the bandwidth merges actually consume.
  4. Merge policy integration: Communicate bandwidth usage back to the MergePolicy to influence which merges are selected, or even skip merges earlier in the process.
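
As one possible reading of item 1, bandwidth could be allocated in proportion to merge size instead of evenly. The sketch below assumes merge sizes are known in bytes; the class and method names are hypothetical and nothing here is taken from the PR's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not part of Lucene): splits a global cap in proportion to merge size. */
final class ProportionalBandwidthSplit {
  private ProportionalBandwidthSplit() {}

  /**
   * Returns one MB/s rate per merge, in input order. Each merge receives a
   * share of the cap equal to its fraction of the total bytes being merged.
   */
  static List<Double> allocate(double globalCapMBPerSec, List<Long> mergeSizesBytes) {
    long totalBytes = 0;
    for (long size : mergeSizesBytes) {
      if (size < 0) {
        throw new IllegalArgumentException("merge size must be >= 0");
      }
      totalBytes += size;
    }
    List<Double> rates = new ArrayList<>(mergeSizesBytes.size());
    for (long size : mergeSizesBytes) {
      // Fall back to an even split if every merge reports zero bytes.
      double share =
          totalBytes == 0 ? 1.0 / mergeSizesBytes.size() : (double) size / totalBytes;
      rates.add(globalCapMBPerSec * share);
    }
    return rates;
  }
}
```

With a 100 MB/s cap and two merges of 100 and 300 bytes, this yields 25 MB/s and 75 MB/s. A purely proportional split can leave very small merges with almost no bandwidth, so a practical version would probably add a per-merge floor; prioritizing smaller merges, the other option in item 1, would invert the weighting.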

Testing

  • I have tested this scheduler by replacing CMS in LuceneTestCase.java and running the full test suite.
  • Next, I plan to edit NRTPerfTest to simulate update storms and then generate segment traces to visualize bandwidth usage and merge behavior. The hope is that the segment tracing graph will be less spiky.
  • Request for feedback: Are there other suggestions for tests or benchmarks to monitor this?

Next Steps

  • Discuss with the community
  • Run the benchmarks described in the Testing section and visualize the results
  • Efficiency improvements: Explore more advanced bandwidth allocation strategies as described above in future improvements.
  • Merge policy integration: Investigate ways to skip or prioritize merges earlier, possibly by editing IndexWriter or implementing a BandwidthAwareTieredMergePolicy that can respond to bandwidth constraints in real time. Thoughts on this?

Thanks for reviewing! Looking forward to feedback, suggestions, and further discussion. This PR is opened to propose and evaluate a bandwidth-capped merge scheduler design, and to gather feedback for further development.

Related to #14148

@nipunbatra8 changed the title from "[Draft] Add BandwidthCappedMergeScheduler: A Global Bandwidth-Limiting MergeScheduler" to "[Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap" on Jul 21, 2025