statistics: get region info via core cluster inside RegionStatistics #6804

Merged
merged 9 commits into from
Jul 17, 2023

Conversation

JmPotato
Copy link
Member

@JmPotato JmPotato commented Jul 13, 2023

What problem does this PR solve?

Issue Number: Close #6560.

What is changed and how does it work?

Instead of maintaining its own region info cache, `RegionStatistics` now
fetches the region info from the core cluster to ensure consistency.
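
A minimal sketch of the idea, not the actual PD code: the statistics structure records only region IDs per check type and resolves them through the core cluster on every read. The names `regionInfo`, `regionGetter`, `fakeCore`, `statType`, and `getRegionsByType` are hypothetical stand-ins chosen for illustration.

```go
package main

import "fmt"

// regionInfo is a hypothetical stand-in for PD's core.RegionInfo.
type regionInfo struct {
	ID    uint64
	Peers int
}

// Clone returns a copy so callers cannot mutate the region held by core.
func (r *regionInfo) Clone() *regionInfo {
	c := *r
	return &c
}

// regionGetter is an assumed interface for the part of the core cluster used here.
type regionGetter interface {
	GetRegion(id uint64) *regionInfo
}

// fakeCore is an in-memory regionGetter used only for this sketch.
type fakeCore map[uint64]*regionInfo

func (c fakeCore) GetRegion(id uint64) *regionInfo { return c[id] }

type statType int

const missPeer statType = iota

// regionStatistics keeps region IDs only; region info always comes from core.
type regionStatistics struct {
	core  regionGetter
	stats map[statType]map[uint64]struct{}
}

// getRegionsByType resolves each recorded ID through the core cluster.
func (s *regionStatistics) getRegionsByType(typ statType) []*regionInfo {
	res := make([]*regionInfo, 0, len(s.stats[typ]))
	for id := range s.stats[typ] {
		res = append(res, s.core.GetRegion(id).Clone())
	}
	return res
}

func main() {
	core := fakeCore{22: {ID: 22, Peers: 1}}
	s := &regionStatistics{
		core:  core,
		stats: map[statType]map[uint64]struct{}{missPeer: {22: {}}},
	}
	// An update in core is visible on the next read; there is no second cache to refresh.
	core[22] = &regionInfo{ID: 22, Peers: 3}
	for _, r := range s.getRegionsByType(missPeer) {
		fmt.Println(r.ID, r.Peers)
	}
}
```

Because the statistics side holds no region copies, there is only one source of truth for what `pd-ctl region check` and `pd-ctl region <id>` return.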

Check List

Tests

  • Unit test
  • Integration test
  • Manual test

Run pd-ctl region check miss-peer:

[
  {
      "id": 22,
      "start_key": "7480000000000000FF5C00000000000000F8",
      "end_key": "748000FFFFFFFFFFFFF900000000000000F8",
      "epoch": {
        "conf_ver": 1,
        "version": 56
      },
      "peers": [
        {
          "id": 23,
          "store_id": 1,
          "role_name": "Voter"
        }
      ],
      "leader": {
        "id": 23,
        "store_id": 1,
        "role_name": "Voter"
      },
      "cpu_usage": 0,
      "written_bytes": 204,
      "read_bytes": 0,
      "written_keys": 4,
      "read_keys": 0,
      "approximate_size": 1,
      "approximate_keys": 0
  }
]

Run pd-ctl region 22:

{
  "id": 22,
  "start_key": "7480000000000000FF5C00000000000000F8",
  "end_key": "748000FFFFFFFFFFFFF900000000000000F8",
  "epoch": {
    "conf_ver": 1,
    "version": 56
  },
  "peers": [
    {
      "id": 23,
      "store_id": 1,
      "role_name": "Voter"
    }
  ],
  "leader": {
    "id": 23,
    "store_id": 1,
    "role_name": "Voter"
  },
  "cpu_usage": 0,
  "written_bytes": 204,
  "read_bytes": 0,
  "written_keys": 4,
  "read_keys": 0,
  "approximate_size": 1,
  "approximate_keys": 0
}

Release note

None.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 13, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • bufferflies
  • rleungx

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-triage-completed release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-cherry-pick-release-5.3 Type: Need cherry pick to release-5.3 needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. and removed do-not-merge/needs-triage-completed labels Jul 13, 2023
- for _, r := range r.stats[typ] {
-     res = append(res, r.RegionInfo.Clone())
+ for regionID := range r.stats[typ] {
+     res = append(res, r.core.GetRegion(regionID).Clone())
Contributor

Do you need to check that the region's status still satisfies the required type?
For example, a region may be recorded with a down peer in the stats but be healthy in the core.

Member Author

@JmPotato JmPotato Jul 13, 2023

I think it's unnecessary: (*RegionStatistics).Observe is called right after the region info is updated in *core.RegionsInfo, so with the current code there will be no inconsistency.
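
For illustration only, the re-check the reviewer asks about could look like the sketch below. This is not what the PR does, and the author argues above that it is unnecessary; it is a defensive variant that skips regions missing from core or no longer matching the check. The names `region`, `coreCluster`, and `regionsStillMatching` are hypothetical.

```go
package main

import "fmt"

// region is a hypothetical, simplified region record.
type region struct {
	ID        uint64
	DownPeers int
}

// coreCluster is an assumed in-memory stand-in for the core cluster.
type coreCluster map[uint64]*region

func (c coreCluster) GetRegion(id uint64) *region { return c[id] }

// regionsStillMatching returns copies of the recorded regions whose current
// state in core still satisfies check. Regions absent from core are skipped.
func regionsStillMatching(core coreCluster, ids map[uint64]struct{}, check func(*region) bool) []*region {
	var res []*region
	for id := range ids {
		r := core.GetRegion(id)
		if r == nil || !check(r) {
			continue
		}
		c := *r
		res = append(res, &c)
	}
	return res
}

func main() {
	core := coreCluster{
		1: {ID: 1, DownPeers: 1},
		2: {ID: 2, DownPeers: 0}, // recorded earlier, healthy now
	}
	// Region 3 is recorded in the stats but no longer exists in core.
	recorded := map[uint64]struct{}{1: {}, 2: {}, 3: {}}
	hasDownPeer := func(r *region) bool { return r.DownPeers > 0 }
	for _, r := range regionsStillMatching(core, recorded, hasDownPeer) {
		fmt.Println(r.ID)
	}
}
```

The trade-off is an extra predicate evaluation per recorded region on each read, in exchange for not relying on the stats and core being updated in lockstep.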

@codecov
Copy link

codecov bot commented Jul 13, 2023

Codecov Report

Merging #6804 (510a9c8) into master (58eb48b) will increase coverage by 0.13%.
The diff coverage is 100.00%.

❗ Current head 510a9c8 differs from pull request most recent head b729f30. Consider uploading reports for the commit b729f30 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6804      +/-   ##
==========================================
+ Coverage   74.13%   74.26%   +0.13%     
==========================================
  Files         413      413              
  Lines       43413    43304     -109     
==========================================
- Hits        32183    32160      -23     
+ Misses       8354     8285      -69     
+ Partials     2876     2859      -17     
Flag Coverage Δ
unittests 74.26% <100.00%> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

- for _, r := range r.offlineStats[typ] {
-     res = append(res, r.Clone())
+ for regionID := range r.offlineStats[typ] {
+     res = append(res, r.core.GetRegion(regionID).Clone())
Member

Will it affect the heartbeat?

Member Author

Since it is only called manually through the HTTP API and pd-ctl, I don't think it will affect the heartbeat much.

Member

What if we have some stores that are offline and each of them has many regions?

Member Author

What about keeping the regions with offline peers recorded in RegionStatistics as before, to get better performance in this case?

Member Author

I changed this part of the logic to a mixture of old and new. PTAL.

pkg/statistics/region_collection.go Outdated Show resolved Hide resolved
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 13, 2023
@ti-chi-bot ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2023
@ti-chi-bot ti-chi-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 17, 2023
Signed-off-by: JmPotato <[email protected]>
@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 17, 2023
@JmPotato
Copy link
Member Author

/merge

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 17, 2023

@JmPotato: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 17, 2023

This pull request has been accepted and is ready to merge.

Commit hash: b729f30

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 17, 2023
@ti-chi-bot ti-chi-bot bot merged commit 40eaa35 into tikv:master Jul 17, 2023
@JmPotato JmPotato deleted the fix_region_statis branch July 17, 2023 06:36
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-5.3: #6814.

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-5.4: #6816.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Jul 17, 2023
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-6.1: #6817.

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #6818.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Jul 17, 2023
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #6819.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Jul 17, 2023
rleungx pushed a commit to rleungx/pd that referenced this pull request Dec 1, 2023
…ikv#6804)

close tikv#6560

Instead of maintaining its own region info cache, use the core cluster to
fetch the region info inside `RegionStatistics` to make sure consistency.

Signed-off-by: JmPotato <[email protected]>
Labels
needs-cherry-pick-release-5.3 Type: Need cherry pick to release-5.3 needs-cherry-pick-release-5.4 Should cherry pick this PR to release-5.4 branch. needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Abnormal output from "get regions by state"
5 participants