-
Notifications
You must be signed in to change notification settings - Fork 720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: support backoff mechanism for memberLoop #6978
Conversation
Signed-off-by: husharp <[email protected]>
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
Skipping CI for Draft Pull Request. |
9ad1517
to
f4541bf
Compare
f4541bf
to
2df1ffd
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## release-6.5 #6978 +/- ##
===============================================
- Coverage 75.75% 75.64% -0.12%
===============================================
Files 329 330 +1
Lines 33600 33627 +27
===============================================
- Hits 25453 25436 -17
- Misses 5988 6015 +27
- Partials 2159 2176 +17
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
2df1ffd
to
9cd68d6
Compare
9cd68d6
to
870664c
Compare
3970df8
to
850e3e9
Compare
Signed-off-by: husharp <[email protected]>
850e3e9
to
23318b6
Compare
c82d267
to
70f016d
Compare
Signed-off-by: husharp <[email protected]>
70f016d
to
6405f53
Compare
Signed-off-by: husharp <[email protected]>
} | ||
|
||
// Only used for test. | ||
var testBackOffExecuteFlag = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can it be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
want to make sure back off executed
Signed-off-by: husharp <[email protected]>
Signed-off-by: husharp <[email protected]>
mustExec([]string{"-u", pdAddr, "scheduler", "config", "balance-hot-region-scheduler"}, &conf1) | ||
// After upgrading, we should not use query. | ||
expected1["read-priorities"] = []interface{}{"query", "byte"} | ||
re.NotEqual(expected1, conf1) | ||
expected1["read-priorities"] = []interface{}{"key", "byte"} | ||
re.Equal(expected1, conf1) | ||
mustExec([]string{"-u", pdAddr, "scheduler", "config", "balance-hot-region-scheduler"}, &conf1) | ||
re.Equal(conf1["read-priorities"], []interface{}{"key", "byte"}) | ||
// cannot set qps as write-peer-priorities | ||
echo = mustExec([]string{"-u", pdAddr, "scheduler", "config", "balance-hot-region-scheduler", "set", "write-peer-priorities", "query,byte"}, nil) | ||
re.Contains(echo, "query is not allowed to be set in priorities for write-peer-priorities") | ||
mustExec([]string{"-u", pdAddr, "scheduler", "config", "balance-hot-region-scheduler"}, &conf1) | ||
re.Equal(expected1, conf1) | ||
re.Equal(conf1["write-peer-priorities"], []interface{}{"byte", "key"}) | ||
|
||
// test remove and add | ||
mustExec([]string{"-u", pdAddr, "scheduler", "remove", "balance-hot-region-scheduler"}, nil) | ||
mustExec([]string{"-u", pdAddr, "scheduler", "add", "balance-hot-region-scheduler"}, nil) | ||
re.Equal(expected1, conf1) | ||
echo = mustExec([]string{"-u", pdAddr, "scheduler", "remove", "balance-hot-region-scheduler"}, nil) | ||
re.Contains(echo, "Success") | ||
echo = mustExec([]string{"-u", pdAddr, "scheduler", "add", "balance-hot-region-scheduler"}, nil) | ||
re.Contains(echo, "Success") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cp #5847 to make TestScheduler stable
Signed-off-by: husharp <[email protected]>
39b2043
to
7fc588f
Compare
/merge |
@nolouch: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests You only need to trigger
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
This pull request has been accepted and is ready to merge. Commit hash: 7fc588f
|
Signed-off-by: husharp [email protected]<!--
Thank you for working on PD! Please read PD's CONTRIBUTING document BEFORE filing this PR.
PR Title Format:
-->
What problem does this PR solve?
Issue Number: ref #6556
What is changed and how does it work?
Check List
Tests
PR Summary
reconnectMemberLoop
callupdateMember
periodically. When callingScheduleCheckMemberChanged
channel, we need to wait for the goroutine to update members until ready or timeout.expo
function can be used to backoff to sleep when an error is encountered.Reproduce Step
enable fail point, like gRPC is throttling, cannot read from etcd.
curl -X PUT -d 'return(10)' http://tc-pd-1.tc-pd-peer.csn-simulator-big-cluster-vd62g.svc:2379/pd/api/v1/fail/github.com/tikv/pd/pkg/etcdutil/SlowEtcdKVGet
simulate pd lost leader
curl -X PUT -d 'return("2346857576170797299")' http://tc-pd-1.tc-pd-peer.csn-simulator-big-cluster-vd62g.svc:2379/pd/api/v1/fail/github.com/tikv/pd/server/exitCampaignLeader
Reproduce Result
Grpc request
GetMember
keeps high:TiKV side show
PR Effect
The Grpc
GetMember
call was reduced from 3.2k to 170, which is relative to the TiDB numbers and client requests for triaging checkLeader.For 20 * tidb 3 * PD 50 * TiKV
170 = (50 * 3 / 3 / 3[TiKV side] + 20 * 2 [TiDB side]) * 3[PD Num]
And more tests are necessary to ensure that no further issues arise.
Release note