Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler: add conflict detect for grant hot leader scheduler #4903

Merged
merged 6 commits into from
Sep 25, 2024

Conversation

lhy1024
Copy link
Contributor

@lhy1024 lhy1024 commented May 7, 2022

Signed-off-by: lhy1024 [email protected]

What problem does this PR solve?

Issue Number: Ref #4399

Check List

Tests

  • No code

Release note

 None.

@ti-chi-bot
Copy link
Member

ti-chi-bot commented May 7, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • bufferflies

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added the release-note-none Denotes a PR that doesn't merit a release note. label May 7, 2022
@lhy1024 lhy1024 added the require-LGT1 Indicates that the PR requires an LGTM. label May 7, 2022
@lhy1024 lhy1024 requested a review from bufferflies May 7, 2022 09:25
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 7, 2022
@@ -329,7 +329,7 @@ func NewLabelSchedulerCommand() *cobra.Command {
func NewGrantHotRegionSchedulerCommand() *cobra.Command {
c := &cobra.Command{
Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
Short: "add a scheduler to grant hot region to fixed stores",
Short: "add a scheduler to grant hot region to fixed stores. Note: balance-hot-region-scheduler must be paused.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do some actual checks to prevent these two schedulers from working at the same time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

@codecov
Copy link

codecov bot commented May 7, 2022

Codecov Report

Attention: Patch coverage is 74.19355% with 8 lines in your changes missing coverage. Please review.

Project coverage is 77.66%. Comparing base (f3e9d9a) to head (1f7991c).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4903      +/-   ##
==========================================
+ Coverage   77.64%   77.66%   +0.02%     
==========================================
  Files         474      474              
  Lines       62112    62140      +28     
==========================================
+ Hits        48228    48264      +36     
+ Misses      10333    10321      -12     
- Partials     3551     3555       +4     
Flag Coverage Δ
unittests 77.66% <74.19%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

@lhy1024
Copy link
Contributor Author

lhy1024 commented May 16, 2022

/hold

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 16, 2022
@ti-chi-bot ti-chi-bot added the needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. label Apr 2, 2024
@lhy1024 lhy1024 changed the title scheduler: add comment for grant hot leader scheduler scheduler: add conflict detect for grant hot leader scheduler Sep 13, 2024
@ti-chi-bot ti-chi-bot bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Sep 13, 2024
@ti-chi-bot ti-chi-bot bot added dco-signoff: yes Indicates the PR's author has signed the dco. approved size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 13, 2024
@lhy1024 lhy1024 added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed status/LGT1 Indicates that a PR has LGTM 1. require-LGT1 Indicates that the PR requires an LGTM. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. approved labels Sep 13, 2024
@ti-chi-bot ti-chi-bot bot added the approved label Sep 13, 2024
@lhy1024 lhy1024 removed the approved label Sep 13, 2024
@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 13, 2024
Signed-off-by: lhy1024 <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 13, 2024
@lhy1024
Copy link
Contributor Author

lhy1024 commented Sep 23, 2024

/retest

Run: addSchedulerForGrantHotRegionCommandFunc,
Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
Short: `add a scheduler to grant hot region to fixed stores.
Note: If there is balance-hot-region-scheduler running, please remove it first, otherwise grant-hot-region-scheduler will not work.`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it ok if the hot scheduler is not work but exist, for example being pause status?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this pr, even though the hot-scheduler is paused, adding grant-hot-leader-scheduler is still not allowed.

If this is allowed, let's look at an example where I suspend the hot-scheduler for ten seconds and then add grant-hot-leader-scheduler, which is fine for ten seconds, but after ten seconds the hot-scheduler resumes scheduling and they will conflict again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, understand. BTW, the hot scheduler config still exist if I enable it again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the configuration still exists. If you think this pr is ok, I will cancel /hold.

@bufferflies
Copy link
Contributor

LGTM

@lhy1024
Copy link
Contributor Author

lhy1024 commented Sep 24, 2024

/unhold it still need an approve

@lhy1024 lhy1024 removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 24, 2024
@@ -488,12 +488,32 @@ func (suite *schedulerTestSuite) checkSchedulerConfig(cluster *pdTests.TestClust
})

// test grant hot region scheduler config
checkSchedulerCommand(re, cmd, pdAddr, []string{"-u", pdAddr, "scheduler", "add", "grant-hot-region-scheduler", "1", "1,2,3"}, map[string]bool{
// case 1: add grant-hot-region-scheduler when balance-hot-region-scheduler is running
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not belong to the scheduler config test.

Signed-off-by: lhy1024 <[email protected]>
return schedulers.([]string), nil
}

addr, ok := h.svr.GetServicePrimaryAddr(h.svr.Context(), constant.SchedulingServiceName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to query schedulers from the scheduling service?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we need to check whether hot scheduler is disabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PD has the same configuration as the scheduling service?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
Short: "add a scheduler to grant hot region to fixed stores",
Run: addSchedulerForGrantHotRegionCommandFunc,
Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
Copy link
Member

@rleungx rleungx Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, is it necessary to set store_leader_id again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I also think it is not necessary. I will modify it in another pr.

Signed-off-by: lhy1024 <[email protected]>
@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Sep 25, 2024
@lhy1024 lhy1024 removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Sep 25, 2024
Copy link
Contributor

ti-chi-bot bot commented Sep 25, 2024

@lhy1024: You cannot manually add or delete the reviewing state labels, only I and the tursted members have permission to do so.

In response to removing label named needs-1-more-lgtm.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Sep 25, 2024
@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Sep 25, 2024
Copy link
Contributor

ti-chi-bot bot commented Sep 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bufferflies, JmPotato, rleungx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [JmPotato,bufferflies,rleungx]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

ti-chi-bot bot commented Sep 25, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-09-25 08:10:14.74561145 +0000 UTC m=+1639884.486035389: ☑️ agreed by rleungx.
  • 2024-09-25 08:37:58.385691774 +0000 UTC m=+1641548.126115729: ☑️ agreed by JmPotato.

Copy link
Contributor

ti-chi-bot bot commented Sep 25, 2024

@lhy1024: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-integration-realcluster-test 1f7991c link false /test pull-integration-realcluster-test

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@ti-chi-bot ti-chi-bot bot merged commit 7907679 into tikv:master Sep 25, 2024
23 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved dco-signoff: yes Indicates the PR's author has signed the dco. lgtm release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants