Add predownload functionality to Pinot #14686

Open. lnbest0707-uber wants to merge 3 commits into master.

lnbest0707-uber (Contributor) commented Dec 19, 2024

feature
performance
#14592

Add a pre-download feature to enable "graceful" node replacement in Pinot. During node replacement, admins replace the old node (ON) with a new node (NN) that uses the same instance ID. Normally the ON must be brought down before the NN starts up, because two nodes cannot hold the same Helix ID at the same time. With this feature, admins can instead:

  1. Start the NN in "pre-download" mode by adding one more parameter to StartServerCommand, for example:
PropertiesConfiguration properties = CommonsConfigurationUtils.fromPath(<config_path>);
PredownloadScheduler predownloadScheduler = new PredownloadScheduler(properties);
predownloadScheduler.start();
  2. Wait for the NN "pre-download" to complete, which ends when one of the following conditions is met:
    - the pre-download fully succeeds
    - the pre-download partially succeeds and has retried enough times
    - the pre-download fails in a non-retriable way
    - the maximum wait period has elapsed
  3. Bring down the ON
  4. Start the NN in normal mode
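
The stop conditions in the waiting step can be sketched as a single predicate. This is only an illustration under stated assumptions, not code from this PR: the class `PredownloadWaitSketch`, the `Status` enum, and the `maxRetries` and `maxWait` parameters are hypothetical names chosen for the example.

```java
import java.time.Duration;

// Hypothetical sketch: none of these names come from the Pinot code base.
class PredownloadWaitSketch {

    // Assumed states that a pre-download run could report.
    enum Status { IN_PROGRESS, SUCCEEDED, PARTIALLY_SUCCEEDED, NON_RETRIABLE_FAILURE }

    // Returns true when the admin should stop waiting and move on to
    // bringing down the old node (ON).
    static boolean shouldStopWaiting(Status status, int retries, int maxRetries,
                                     Duration waited, Duration maxWait) {
        if (status == Status.SUCCEEDED) {
            return true; // pre-download fully succeeded
        }
        if (status == Status.PARTIALLY_SUCCEEDED && retries >= maxRetries) {
            return true; // partial success, retry budget used up
        }
        if (status == Status.NON_RETRIABLE_FAILURE) {
            return true; // retrying cannot help
        }
        // Otherwise keep waiting until the maximum wait period has elapsed.
        return waited.compareTo(maxWait) >= 0;
    }

    public static void main(String[] args) {
        Duration maxWait = Duration.ofMinutes(30); // assumed limit, for illustration
        System.out.println(shouldStopWaiting(
            Status.IN_PROGRESS, 0, 3, Duration.ofMinutes(5), maxWait));
        System.out.println(shouldStopWaiting(
            Status.PARTIALLY_SUCCEEDED, 3, 3, Duration.ofMinutes(5), maxWait));
    }
}
```

Whichever condition ends the wait, the remaining steps are the same; a pre-download that did not fully succeed only means the NN has more left to fetch once it starts in normal mode.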

After a successful pre-download, Pinot node replacement behaves the same way as a node restart. We observed a significant reduction in downtime, measured as the time for the Helix pending message count to drop to 0 (1h -> 10min in the attached case). When node load is not high, the replacement can sometimes be fully transparent.
image

codecov-commenter commented Dec 19, 2024

Codecov Report

Attention: Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.

Project coverage is 64.03%. Comparing base (59551e4) to head (73e77d2).
Report is 1490 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...t/local/segment/store/SegmentLocalFSDirectory.java | 0.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14686      +/-   ##
============================================
+ Coverage     61.75%   64.03%   +2.28%     
- Complexity      207     1608    +1401     
============================================
  Files          2436     2706     +270     
  Lines        133233   149167   +15934     
  Branches      20636    22861    +2225     
============================================
+ Hits          82274    95521   +13247     
- Misses        44911    46645    +1734     
- Partials       6048     7001     +953     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (+99.99%) ⬆️ |
| integration | 100.00% <ø> (+99.99%) ⬆️ |
| integration1 | 100.00% <ø> (+99.99%) ⬆️ |
| integration2 | 0.00% <ø> (ø) |
| java-11 | 64.02% <90.90%> (+2.31%) ⬆️ |
| java-21 | 56.12% <90.90%> (-5.51%) ⬇️ |
| skip-bytebuffers-false | 64.03% <90.90%> (+2.28%) ⬆️ |
| skip-bytebuffers-true | 56.10% <90.90%> (+28.37%) ⬆️ |
| temurin | 64.03% <90.90%> (+2.28%) ⬆️ |
| unittests | 64.03% <90.90%> (+2.28%) ⬆️ |
| unittests1 | 56.29% <90.90%> (+9.40%) ⬆️ |
| unittests2 | 34.50% <90.90%> (+6.76%) ⬆️ |

lnbest0707-uber force-pushed the upstream_fork_predownload branch from 5486191 to 73e77d2 on December 19, 2024 22:21