Skip to content

Conversation

@tarun11Mavani
Copy link
Contributor

@tarun11Mavani tarun11Mavani commented Oct 29, 2025

This PR introduces a new configurable property minNumSegmentsPerTask for the UpsertCompactMerge task, providing users with fine-grained control over segment merging behavior.

What Changed

  • Added new configuration property: minNumSegmentsPerTask with a default value of 2
  • Enhanced task filtering logic: Tasks now only execute when the minimum segment threshold is met
  • Maintained backward compatibility: Default behavior remains unchanged (minimum 2 segments)

Key Benefits

1. Flexible Task Behavior

  • Single-segment processing: Set minNumSegmentsPerTask=1 to enable UpsertCompactMerge to function as a replacement for UpsertCompaction task
  • Resource optimization: Set higher values (e.g., 5+) to prevent task execution unless sufficient segments are available for merging

Configuration

# Example table configuration
{
  "tableName": "myTable",
  "taskTypeConfigsMap": {
    "UpsertCompactMergeTask": {
       ...,
      "minNumSegmentsPerTask": "3"  // Only merge when 3+ segments available
    }
  }
}

@yashmayya yashmayya added upsert minion Configuration Config changes (addition/deletion/change in behavior) labels Oct 29, 2025
@tarun11Mavani tarun11Mavani force-pushed the feature-MIN_NUM_SEGMENTS_PER_TASK_KEY branch from 44c34f6 to a648945 Compare October 31, 2025 05:55
@codecov-commenter
Copy link

codecov-commenter commented Oct 31, 2025

Codecov Report

❌ Patch coverage is 38.88889% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.19%. Comparing base (ceb883a) to head (b383390).

Files with missing lines Patch % Lines
...tcompactmerge/UpsertCompactMergeTaskGenerator.java 38.88% 11 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17104      +/-   ##
============================================
- Coverage     63.19%   63.19%   -0.01%     
  Complexity     1428     1428              
============================================
  Files          3104     3104              
  Lines        183487   183495       +8     
  Branches      28126    28126              
============================================
+ Hits         115953   115954       +1     
- Misses        58575    58578       +3     
- Partials       8959     8963       +4     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.14% <38.88%> (-0.02%) ⬇️
java-21 63.16% <38.88%> (+0.01%) ⬆️
temurin 63.19% <38.88%> (-0.01%) ⬇️
unittests 63.18% <38.88%> (-0.01%) ⬇️
unittests1 56.17% <ø> (-0.03%) ⬇️
unittests2 33.59% <38.88%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@tarun11Mavani tarun11Mavani force-pushed the feature-MIN_NUM_SEGMENTS_PER_TASK_KEY branch from a648945 to b383390 Compare October 31, 2025 16:36
@tarun11Mavani tarun11Mavani marked this pull request as ready for review October 31, 2025 16:41
/**
* default minimum number of segments to process in a single task.
* Keeping this default to 2 means that we won't run this task if there is only one segment which can be merged.
* If this is set to 1, this task can act as UpsertCompact task as well.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a check in config-validator that if this is set to 1, UpsertCompactionTask config should not be present to avoid in-deterministic behaviour of newly created compacted segments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Configuration Config changes (addition/deletion/change in behavior) minion upsert

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants