resource_manager: record the max RU per second #7936
Conversation
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review. The full list of commands accepted by this bot can be found here. Reviewers can indicate their review by submitting an approval review.
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #7936      +/-   ##
==========================================
+ Coverage   73.48%   73.60%   +0.11%
==========================================
  Files         436      436
  Lines       48376    48425      +49
==========================================
+ Hits        35550    35644      +94
+ Misses       9768     9721      -47
- Partials     3058     3060       +2

Flags with carried forward coverage won't be shown. Click here to find out more.
Signed-off-by: nolouch <[email protected]>
PTAL @glorv
Do we need to update metrics.json?
rest LGTM
if maxPerSecTrackers[name] == nil {
	maxPerSecTrackers[name] = newMaxPerSecCostTracker(name, defaultCollectIntervalSec)
}
maxPerSecTrackers[name].Observe(rruSum[name], wruSum[name])
Suggested change:

if maxPerSecTrackers[name] == nil && (rruSum[name]+wruSum[name] > 0.0) {
	maxPerSecTrackers[name] = newMaxPerSecCostTracker(name, defaultCollectIntervalSec)
}
if maxPerSecTrackers[name] != nil {
	maxPerSecTrackers[name].Observe(rruSum[name], wruSum[name])
}
Maybe skipping the inactive groups here would be better.
A group may be inactive for a period of time and then become active again. To reflect that trend of change, the cost of observing every group is acceptable.
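To make the trade-off concrete, here is a minimal sketch of how a per-group max-per-second tracker could work. The struct fields, package name, and flush step below are assumptions for illustration, not the code added by this PR:

```go
package resourcemanager // hypothetical package name, for illustration only

// maxPerSecCostTracker is a sketch of a tracker that records the largest
// per-second RU consumption seen within a collection window.
// Observe is assumed to be called once per second with the cumulative
// RRU/WRU sums of a resource group; the first sample is treated naively here.
type maxPerSecCostTracker struct {
	name             string
	intervalSec      int
	cnt              int
	lastRRU, lastWRU float64 // cumulative sums at the previous observation
	maxRRU, maxWRU   float64 // max per-second consumption in the current window
}

func newMaxPerSecCostTracker(name string, intervalSec int) *maxPerSecCostTracker {
	return &maxPerSecCostTracker{name: name, intervalSec: intervalSec}
}

// Observe records one per-second sample and flushes the window maximum once
// the collection interval elapses.
func (t *maxPerSecCostTracker) Observe(rruSum, wruSum float64) {
	deltaRRU, deltaWRU := rruSum-t.lastRRU, wruSum-t.lastWRU
	t.lastRRU, t.lastWRU = rruSum, wruSum
	if deltaRRU > t.maxRRU {
		t.maxRRU = deltaRRU
	}
	if deltaWRU > t.maxWRU {
		t.maxWRU = deltaWRU
	}
	if t.cnt++; t.cnt >= t.intervalSec {
		// Here the window maximum would be exported to the metrics system
		// (e.g. a gauge labelled with t.name) and then reset.
		t.cnt, t.maxRRU, t.maxWRU = 0, 0, 0
	}
}
```

Under this sketch, observing a temporarily inactive group costs only a couple of float comparisons per second, which is why keeping such groups in the map is cheap.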
Signed-off-by: nolouch <[email protected]>
ptal @glorv
LGTM
/merge
@nolouch: It seems you want to merge this PR, so I will help you trigger all the tests: /run-all-tests. You only need to trigger it once.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
This pull request has been accepted and is ready to merge. Commit hash: 23db662
In response to a cherrypick label: new pull request created to branch
/run-cherry-picker
In response to a cherrypick label: new pull request created to branch
close tikv#7908 Signed-off-by: ti-chi-bot <[email protected]>
close #7908 resource_manager: record the max RU per second Signed-off-by: nolouch <[email protected]> Co-authored-by: nolouch <[email protected]> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
close #7908 resource_manager: record the max RU per second Signed-off-by: ti-chi-bot <[email protected]> Signed-off-by: nolouch <[email protected]> Co-authored-by: ShuNing <[email protected]> Co-authored-by: nolouch <[email protected]>
What problem does this PR solve?
Issue Number: Close #7908
When I only run workload A:
the RU Avg looks OK because the workload is stable.
But when I manually run a big query SQL like:
the monitoring is not very accurate (it only shows the average), which led me to mistakenly believe that I was far from triggering RC flow control.
But from the slow query:
it cost 8000+ RU, which made some queries wait in the RC queue.
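To illustrate why an averaged panel hides such a spike (the numbers below are hypothetical, not taken from the cluster above): a single large burst spread over a reporting window barely moves the average, while the per-second maximum captures it.

```go
package main

import "fmt"

func main() {
	// Hypothetical numbers: a steady background of 200 RU/s plus one
	// 8000 RU burst inside a 60-second reporting window.
	const (
		windowSec  = 60.0
		background = 200.0
		burstRU    = 8000.0
	)

	avgPerSec := (background*windowSec + burstRU) / windowSec // what an averaged panel shows
	maxPerSec := background + burstRU                         // what a max-per-second panel shows

	fmt.Printf("avg RU/s: %.0f, max RU/s: %.0f\n", avgPerSec, maxPerSec)
	// Output: avg RU/s: 333, max RU/s: 8200
}
```

An avg-only panel would report roughly 333 RU/s even though the burst momentarily consumed over 8000 RU, which is why recording the max RU per second is useful.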
What is changed and how does it work?
Check List
Tests
Release note