scheduler: add conflict detect for grant hot leader scheduler #4903
@@ -329,7 +329,7 @@ func NewLabelSchedulerCommand() *cobra.Command {
 func NewGrantHotRegionSchedulerCommand() *cobra.Command {
     c := &cobra.Command{
         Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
-        Short: "add a scheduler to grant hot region to fixed stores",
+        Short: "add a scheduler to grant hot region to fixed stores. Note: balance-hot-region-scheduler must be paused.",
Could we do some actual checks to prevent these two schedulers from working at the same time?
Good idea.
Codecov Report
@@ Coverage Diff @@
## master #4903 +/- ##
==========================================
+ Coverage 77.64% 77.66% +0.02%
==========================================
Files 474 474
Lines 62112 62140 +28
==========================================
+ Hits 48228 48264 +36
+ Misses 10333 10321 -12
- Partials 3551 3555 +4
/hold
Signed-off-by: lhy1024 <[email protected]>
/retest
Run: addSchedulerForGrantHotRegionCommandFunc,
Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
Short: `add a scheduler to grant hot region to fixed stores.
Note: If there is balance-hot-region-scheduler running, please remove it first, otherwise grant-hot-region-scheduler will not work.`,
Is it OK if the hot scheduler exists but is not working, for example when it is paused?
In this PR, adding grant-hot-leader-scheduler is still not allowed even when the hot scheduler is paused.
If it were allowed, consider this example: I pause the hot scheduler for ten seconds and then add grant-hot-leader-scheduler. That is fine for ten seconds, but afterwards the hot scheduler resumes scheduling and the two conflict again.
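The reasoning above can be illustrated with a minimal Go sketch. It is not the code from this PR: `schedulerState`, `errConflict`, and `checkGrantHotRegionConflict` are hypothetical names, and the registry is assumed to be a plain map keyed by scheduler name. The point is only that the check looks at whether the hot scheduler is registered and ignores its paused flag.

```go
package main

import (
	"errors"
	"fmt"
)

// Scheduler name as it appears in the discussion above.
const balanceHotRegionName = "balance-hot-region-scheduler"

// errConflict is a hypothetical error used only in this sketch.
var errConflict = errors.New("balance-hot-region-scheduler must be removed before adding grant-hot-region-scheduler")

// schedulerState is a hypothetical, simplified view of a registered scheduler.
type schedulerState struct {
	paused bool
}

// checkGrantHotRegionConflict rejects the add request whenever the hot
// scheduler is registered. The paused flag is deliberately ignored: a pause
// expires, after which both schedulers would be running at the same time.
func checkGrantHotRegionConflict(registered map[string]schedulerState) error {
	if _, ok := registered[balanceHotRegionName]; ok {
		return errConflict
	}
	return nil
}

func main() {
	registered := map[string]schedulerState{
		balanceHotRegionName: {paused: true},
	}
	// Still rejected although the hot scheduler is paused.
	fmt.Println(checkGrantHotRegionConflict(registered))

	// Accepted once the hot scheduler has been removed.
	delete(registered, balanceHotRegionName)
	fmt.Println(checkGrantHotRegionConflict(registered))
}
```

Checking for existence rather than running state trades some flexibility for a guarantee that does not depend on when a pause expires.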
OK, understood. BTW, does the hot scheduler config still exist if I enable it again?
Yes, the configuration still exists. If you think this PR is OK, I will cancel the /hold.
LGTM
/unhold
It still needs an approval.
@@ -488,12 +488,32 @@ func (suite *schedulerTestSuite) checkSchedulerConfig(cluster *pdTests.TestClust
})

// test grant hot region scheduler config
checkSchedulerCommand(re, cmd, pdAddr, []string{"-u", pdAddr, "scheduler", "add", "grant-hot-region-scheduler", "1", "1,2,3"}, map[string]bool{
// case 1: add grant-hot-region-scheduler when balance-hot-region-scheduler is running
It should not belong to the scheduler config test.
server/api/scheduler.go
Outdated
return schedulers.([]string), nil
}

addr, ok := h.svr.GetServicePrimaryAddr(h.svr.Context(), constant.SchedulingServiceName)
Why do we need to query schedulers from the scheduling service?
Because we need to check whether hot scheduler is disabled.
Does PD have the same configuration as the scheduling service?
done
Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
Short: "add a scheduler to grant hot region to fixed stores",
Run: addSchedulerForGrantHotRegionCommandFunc,
Use: "grant-hot-region-scheduler <store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>",
BTW, is it necessary to set store_leader_id again?
Sure, I also think it is not necessary. I will modify it in another pr.
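To make the redundancy concrete, here is a small Go sketch of how the two arguments in the usage string could be parsed. It is an illustration under assumptions, not pd-ctl's actual code: `parseGrantHotRegionArgs` is a hypothetical helper, and the rule that the leader ID must also appear in the comma-separated list is inferred only from the usage string `<store_leader_id> <store_leader_id,store_peer_id_1,store_peer_id_2>`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGrantHotRegionArgs is a hypothetical helper that parses
// "<store_leader_id>" and "<store_leader_id,store_peer_id_1,...>".
// It assumes the leader ID is expected to appear in the list as well.
func parseGrantHotRegionArgs(leaderArg, storesArg string) (uint64, []uint64, error) {
	leaderID, err := strconv.ParseUint(leaderArg, 10, 64)
	if err != nil {
		return 0, nil, fmt.Errorf("invalid store_leader_id %q: %w", leaderArg, err)
	}

	var storeIDs []uint64
	leaderListed := false
	for _, field := range strings.Split(storesArg, ",") {
		id, err := strconv.ParseUint(strings.TrimSpace(field), 10, 64)
		if err != nil {
			return 0, nil, fmt.Errorf("invalid store id %q: %w", field, err)
		}
		if id == leaderID {
			leaderListed = true
		}
		storeIDs = append(storeIDs, id)
	}

	// The leader ID is supplied twice, so the two copies can disagree.
	if !leaderListed {
		return 0, nil, fmt.Errorf("store_leader_id %d is missing from the store list", leaderID)
	}
	return leaderID, storeIDs, nil
}

func main() {
	leader, stores, err := parseGrantHotRegionArgs("1", "1,2,3")
	fmt.Println(leader, stores, err)

	_, _, err = parseGrantHotRegionArgs("4", "1,2,3")
	fmt.Println(err)
}
```

Because the same ID is passed in both arguments, the parser needs an extra consistency check; dropping the duplicate from the list, as suggested above, would remove that failure mode.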
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bufferflies, JmPotato, rleungx.
Signed-off-by: lhy1024 [email protected]
What problem does this PR solve?
Issue Number: Ref #4399
Check List
Tests
Release note