From fdd1d68de5c49f6f9d33faf9f9da117035044814 Mon Sep 17 00:00:00 2001
From: Ti Chi Robot
Date: Wed, 11 Dec 2024 15:57:21 +0800
Subject: [PATCH] pd: add patrol-region-worker-count (#19600) (#19652)

---
 dynamic-config.md        |  3 ++-
 pd-configuration-file.md | 11 ++++++++++-
 pd-control.md            | 10 ++++++++--
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/dynamic-config.md b/dynamic-config.md
index 0c19ecfb88a77..93878ef0e6187 100644
--- a/dynamic-config.md
+++ b/dynamic-config.md
@@ -280,7 +280,8 @@ The following PD configuration items can be modified dynamically:
 | `cluster-version` | The cluster version |
 | `schedule.max-merge-region-size` | Controls the size limit of `Region Merge` (in MiB) |
 | `schedule.max-merge-region-keys` | Specifies the maximum numbers of the `Region Merge` keys |
-| `schedule.patrol-region-interval` | Determines the frequency at which `replicaChecker` checks the health state of a Region |
+| `schedule.patrol-region-interval` | Determines the frequency at which the checker inspects the health state of a Region |
+| `scheduler.patrol-region-worker-count` | Controls the number of concurrent operators created by the checker when inspecting the health state of a Region |
 | `schedule.split-merge-interval` | Determines the time interval of performing split and merge operations on the same Region |
 | `schedule.max-snapshot-count` | Determines the maximum number of snapshots that a single store can send or receive at the same time |
 | `schedule.max-pending-peer-count` | Determines the maximum number of pending peers in a single store |
diff --git a/pd-configuration-file.md b/pd-configuration-file.md
index fae926046fa51..8ab97d9b0385d 100644
--- a/pd-configuration-file.md
+++ b/pd-configuration-file.md
@@ -278,9 +278,18 @@ Configuration items related to scheduling
 
 ### `patrol-region-interval`
 
-+ Controls the running frequency at which `replicaChecker` checks the health state of a Region. The smaller this value is, the faster `replicaChecker` runs. Normally, you do not need to adjust this parameter.
++ Controls the running frequency at which the checker inspects the health state of a Region. The smaller this value is, the faster the checker runs. Normally, you do not need to adjust this configuration.
 + Default value: `10ms`
 
+### `patrol-region-worker-count` New in v8.5.0
+
+> **Warning:**
+>
+> Setting this configuration item to a value greater than 1 enables concurrent checks. This is an experimental feature. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/tikv/pd/issues) on GitHub.
+
++ Controls the number of concurrent [operators](/glossary.md#operator) created by the checker when inspecting the health state of a Region. Normally, you do not need to adjust this configuration.
++ Default value: `1`
+
 ### `split-merge-interval`
 
 + Controls the time interval between the `split` and `merge` operations on the same Region. That means a newly split Region will not be merged for a while.
diff --git a/pd-control.md b/pd-control.md
index 32b5f1faaadc4..253a6bbb26019 100644
--- a/pd-control.md
+++ b/pd-control.md
@@ -232,10 +232,16 @@ Usage:
     config set region-score-formula-version v2
     ```
 
-- `patrol-region-interval` controls the execution frequency that `replicaChecker` checks the health status of Regions. A shorter interval indicates a higher execution frequency. Generally, you do not need to adjust it.
+- `patrol-region-interval` controls the execution frequency that the checker inspects the health status of Regions. A shorter interval indicates a higher execution frequency. Generally, you do not need to adjust it.
 
     ```bash
-    config set patrol-region-interval 10ms // Set the execution frequency of replicaChecker to 10ms
+    config set patrol-region-interval 10ms // Set the execution frequency of the checker to 10ms
+    ```
+
+- `patrol-region-worker-count` controls the number of concurrent [operators](/glossary.md#operator) created by the checker when inspecting the health state of a Region. Normally, you do not need to adjust this configuration. Setting this configuration item to a value greater than 1 enables concurrent checks. Currently, this feature is experimental, and it is not recommended that you use it in the production environment.
+
+    ```bash
+    config set patrol-region-worker-count 2 // Set the checker concurrency to 2
     ```
 
 - `max-store-down-time` controls the time that PD decides the disconnected store cannot be restored if exceeded. If PD does not receive heartbeats from a store within the specified period of time, PD adds replicas in other nodes.