Add VictoriaMetrics switch guide for TiUP cluster #20335

Merged: 34 commits, Jun 10, 2025

Commits
46f5976
Add VM switch guide for TiUP cluster
nolouch May 13, 2025
824a308
Merge remote-tracking branch 'origin/master' into add-vm-switch
nolouch May 13, 2025
11b259f
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
55982de
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
5c114de
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
06e2d88
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
65e7109
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
52dd4e0
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
6d73df2
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
fb25746
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
5c114db
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
9785d42
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
bd98e23
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
f929ea8
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
05c648f
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
cc6d63c
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
131cfc9
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
05e75c0
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
2643e8a
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
4bbc5aa
Update maintain-tidb-using-tiup.md
nolouch May 26, 2025
377fa97
Remove unnecessary copyable snippet
lilin90 May 28, 2025
93b53ab
Update format
lilin90 May 28, 2025
ec83583
Add a necessary space for body heading
lilin90 May 28, 2025
28a15e6
Update maintain-tidb-using-tiup.md
lilin90 May 28, 2025
0f5560a
Keep the order of content consistent with en
lilin90 May 28, 2025
8cc6729
Fix format
lilin90 May 30, 2025
d299d9c
Update maintain-tidb-using-tiup.md
lilin90 Jun 10, 2025
35847bf
Update maintain-tidb-using-tiup.md
nolouch Jun 10, 2025
5cf9f20
Update maintain-tidb-using-tiup.md
nolouch Jun 10, 2025
312e18b
Update maintain-tidb-using-tiup.md
nolouch Jun 10, 2025
f29b53c
Update maintain-tidb-using-tiup.md
nolouch Jun 10, 2025
7f9da39
Update maintain-tidb-using-tiup.md
nolouch Jun 10, 2025
f40272e
Update maintain-tidb-using-tiup.md
nolouch Jun 10, 2025
14c6b97
Update wording
lilin90 Jun 10, 2025
156 changes: 155 additions & 1 deletion maintain-tidb-using-tiup.md
@@ -6,7 +6,7 @@ summary: TiUP is a tool for managing TiDB clusters that can be used to view the cluster list

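# TiUP Common Operations

This document describes the common operations for maintaining a TiDB cluster using TiUP, including viewing the cluster list, starting a cluster, checking the cluster status, modifying configuration parameters, stopping a cluster, and destroying a cluster.
This document describes the common operations for maintaining a TiDB cluster using TiUP.

## View the cluster list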

@@ -272,3 +272,157 @@ tiup cluster clean ${cluster-name} --all --ignore-node 172.16.13.12
```bash
tiup cluster destroy ${cluster-name}
```

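## Switch from Prometheus to VictoriaMetrics

In large clusters, Prometheus might hit performance bottlenecks when it handles a large number of instances. Starting from TiUP 1.16.3, TiUP supports switching the metrics monitoring component from Prometheus to VictoriaMetrics (VM), which provides better scalability, higher performance, and lower resource consumption.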
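
Because the switch requires TiUP 1.16.3 or later, it can help to compare the installed version against that minimum before editing any configuration. The following is a minimal sketch, not an official TiUP check: `version_ge` is a hypothetical helper, the sample version strings are made up, and the comparison assumes that `sort -V` (version sort, as in GNU coreutils) is available.

```bash
# Returns success (exit status 0) when version $1 is at least version $2.
# An optional leading "v" is stripped from both arguments before comparing.
# Assumption: `sort -V` (version sort) is available on this machine.
version_ge() {
  have=${1#v}
  want=${2#v}
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Hypothetical version strings; substitute the version that your TiUP reports.
for v in 1.16.3 1.17.0 1.15.2; do
  if version_ge "$v" 1.16.3; then
    echo "$v: VictoriaMetrics switch is supported"
  else
    echo "$v: upgrade TiUP first"
  fi
done
```

How you obtain the installed version is left open here. One option is to read it from the output of `tiup --version`, but the exact output format is an assumption that you should confirm on your own installation.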

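### Enable VictoriaMetrics in a new deployment

By default, TiUP uses Prometheus as the metrics monitoring component. To use VictoriaMetrics instead of Prometheus in a new deployment, configure the topology file as follows: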

```yaml
# Monitoring server configuration
monitoring_servers:
  # IP address of the monitoring server
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
    enable_prom_agent_mode: true

# Grafana server configuration
grafana_servers:
  # IP address of the Grafana server
  - host: ip_address
    ...
    use_vm_as_datasource: true
```
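
Before deploying, you might want to confirm that all three settings are actually present and uncommented in the topology file. The following sketch does this with a plain text match. `has_vm_settings` and the file name `topology.yaml` are hypothetical, and because the function is not a YAML parser, it cannot tell which server entry a key belongs to.

```bash
# Succeeds only if every listed key is set to true on an uncommented line
# of the given file. Lines that start with "#" do not match.
has_vm_settings() {
  file=$1
  shift
  for key in "$@"; do
    if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*true[[:space:]]*(#.*)?\$" "$file"; then
      echo "missing or not set to true: ${key}" >&2
      return 1
    fi
  done
}

# Example call (hypothetical file name; use your own topology file):
# has_vm_settings topology.yaml \
#   prom_remote_write_to_vm enable_prom_agent_mode use_vm_as_datasource
```

The same check can be reused later in the migration, for example to confirm the settings before removing old Prometheus data.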

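### Migrate an existing deployment to VictoriaMetrics

You can complete the migration without interrupting service. TiUP keeps the existing metrics data in Prometheus and writes new metrics data to VictoriaMetrics.

#### Enable remote write from Prometheus to VictoriaMetrics

1. Edit the cluster configuration: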

```bash
tiup cluster edit-config ${cluster-name}
```

2. Under `monitoring_servers`, add `prom_remote_write_to_vm: true`:

```yaml
monitoring_servers:
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
```

3. Reload the configuration to apply the change:

```bash
tiup cluster reload ${cluster-name} -R prometheus
```

#### Switch the default Grafana data source to VictoriaMetrics

1. Edit the cluster configuration:

```bash
tiup cluster edit-config ${cluster-name}
```

2. Under `grafana_servers`, add `use_vm_as_datasource: true`:

```yaml
grafana_servers:
  - host: ip_address
    ...
    use_vm_as_datasource: true
```

3. Reload the configuration to apply the change:

```bash
tiup cluster reload ${cluster-name} -R grafana
```

#### View historical metrics from before the switch (optional)

Metrics generated before the switch remain in Prometheus. To view them, switch the Grafana data source back to Prometheus as follows:

1. Edit the cluster configuration:

```bash
tiup cluster edit-config ${cluster-name}
```

2. Comment out `use_vm_as_datasource` under `grafana_servers`:

```yaml
grafana_servers:
  - host: ip_address
    ...
    # use_vm_as_datasource: true
```

3. Reload the configuration to apply the change:

```bash
tiup cluster reload ${cluster-name} -R grafana
```

4. To switch back to VictoriaMetrics, repeat the steps in [Switch the default Grafana data source to VictoriaMetrics](#switch-the-default-grafana-data-source-to-victoriametrics).
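
Since the data source can be toggled back and forth, it is easy to lose track of which one the configuration currently selects. The following sketch reads a configuration on standard input and reports the result. `grafana_datasource` is a hypothetical helper and the host address in the example is made up. It only reflects the configuration text, not whether the change has already been reloaded.

```bash
# Reads a cluster configuration on standard input and prints which data
# source Grafana would use according to the use_vm_as_datasource setting.
# A commented-out or missing setting is treated as the Prometheus default.
grafana_datasource() {
  if grep -Eq '^[[:space:]]*use_vm_as_datasource:[[:space:]]*true[[:space:]]*(#.*)?$'; then
    echo victoriametrics
  else
    echo prometheus
  fi
}

# Example with an inline configuration fragment (hypothetical host address).
# The setting is commented out here, so this prints "prometheus".
printf 'grafana_servers:\n  - host: 10.0.1.5\n    # use_vm_as_datasource: true\n' | grafana_datasource
```

To check a real cluster, you could pipe the output of `tiup cluster show-config ${cluster-name}` into the function, assuming that command prints the cluster configuration as YAML in your TiUP version.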

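### Clean up old metrics and services

After you confirm that the old metrics have expired, you can remove the redundant services and files as follows. This does not affect the normal operation of the cluster.

#### Set Prometheus to agent mode

1. Edit the cluster configuration: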

```bash
tiup cluster edit-config ${cluster-name}
```

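2. Set agent mode and make sure that the related parameters are configured correctly.

Under `monitoring_servers`, set `enable_prom_agent_mode` to `true`. Also make sure that `prom_remote_write_to_vm` under `monitoring_servers` and `use_vm_as_datasource` under `grafana_servers` are set to `true`:

```yaml
monitoring_servers:
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
    enable_prom_agent_mode: true

grafana_servers:
  - host: ip_address
    ...
    use_vm_as_datasource: true
```

3. Reload the configuration to apply the change: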

```bash
tiup cluster reload ${cluster-name} -R prometheus
```

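#### Delete the old Prometheus data directory

1. In the configuration file, find the data directory path `data_dir` of the monitoring server:

```yaml
monitoring_servers:
  - host: ip_address
    ...
    data_dir: "/tidb-data/prometheus-8249"
```

2. Delete the data directory: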

```bash
rm -rf /tidb-data/prometheus-8249
```
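
Because `rm -rf` is irreversible, you might prefer to wrap the deletion in a few sanity checks. The following is a minimal sketch under stated assumptions: `remove_prom_data_dir` is a hypothetical helper, it is meant to run on the monitoring server itself, and the path you pass in is the `data_dir` value from your own configuration.

```bash
# Removes a Prometheus data directory after basic sanity checks.
# Refuses an empty path, the root directory, and relative paths.
remove_prom_data_dir() {
  dir=$1
  case "$dir" in
    ""|/)
      echo "refusing to remove '$dir'" >&2
      return 1
      ;;
    /*)
      ;;  # absolute path: continue with the checks below
    *)
      echo "expected an absolute path, got '$dir'" >&2
      return 1
      ;;
  esac
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  du -sh "$dir"        # show how much data is about to be deleted
  rm -rf -- "$dir"
}

# Example call, using the data_dir value from the configuration above:
# remove_prom_data_dir /tidb-data/prometheus-8249
```

The checks only guard against obvious mistakes such as an unset variable. They do not verify that the old metrics have expired, so that confirmation still has to happen first.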