OVN DB Recovery
oilbeater edited this page Jun 27, 2022 · 9 revisions
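The Chinese documentation in this wiki is no longer maintained. Please visit our latest Chinese documentation site for up-to-date documentation.
An ovn-central instance fails to start, and its log shows: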
* ovn-northd is not running
ovsdb-server: ovsdb error: error reading record 2739 from OVN_Northbound log: record 2739 advances commit index to 6308 but last log index is 6307
* Starting ovsdb-nb
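If this node previously suffered from clock desynchronization or a full disk, the database file can be assumed to be corrupted.
Depending on whether the error names OVN_Northbound or OVN_Southbound, find the leader node of that database and run the following steps there.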
kubectl get ep -n kube-system
ovn-nb 10.0.128.61:6641 2d1h
ovn-sb 10.0.128.61:6642 2d1h
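The choice between the two databases described above can be sketched as a small shell helper. This is only an illustration: the function names `db_from_error` and `endpoint_for_db` are hypothetical, and the endpoint names are taken from the `kubectl get ep` output shown above.

```shell
#!/bin/sh
# Print the database named in an ovsdb-server error line.
# Returns non-zero if the line mentions neither database.
db_from_error() {
  case "$1" in
    *OVN_Northbound*) echo OVN_Northbound ;;
    *OVN_Southbound*) echo OVN_Southbound ;;
    *) return 1 ;;
  esac
}

# Print the kube-system endpoint whose address points at the leader
# of the given database (names as listed by `kubectl get ep`).
endpoint_for_db() {
  case "$1" in
    OVN_Northbound) echo ovn-nb ;;
    OVN_Southbound) echo ovn-sb ;;
    *) echo "unknown database: $1" >&2; return 1 ;;
  esac
}
```

For the log line in this example, `db_from_error` yields `OVN_Northbound`, so the node to work on is the one behind the `ovn-nb` endpoint.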
Exec into the corresponding Pod and check the current state of the database cluster:
root@VM-128-61-centos:/kube-ovn# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
9182
Name: OVN_Northbound
Cluster ID: e75f (e75fa340-49ed-45ab-990e-26cb865ebc85)
Server ID: 9182 (9182e8dd-b5b0-4dd8-8518-598cc1e374f3)
Address: tcp:[10.0.128.61]:6643
Status: cluster member
Role: leader
Term: 1454
Leader: self
Vote: self
Last Election started 1732603 ms ago, reason: timeout
Last Election won: 1732587 ms ago
Election timer: 1000
Log: [7332, 12512]
Entries not yet committed: 1
Entries not yet applied: 1
Connections: ->f080 <-f080 <-e631 ->e631
Disconnections: 1
Servers:
f080 (f080 at tcp:[10.0.129.139]:6643) next_index=12512 match_index=12510 last msg 63 ms ago
9182 (9182 at tcp:[10.0.128.61]:6643) (self) next_index=10394 match_index=12510
e631 (e631 at tcp:[10.0.131.173]:6643) next_index=12512 match_index=0
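In the output above, the member `e631` stands out because its `match_index` is 0 and it has no recent `last msg`. A minimal sketch for spotting such members automatically follows. The function name `stale_members` is hypothetical, and `match_index=0` is only a heuristic, since a member that has just joined may briefly show the same value.

```shell
#!/bin/sh
# Read `cluster/status` output on stdin and print the short server ID
# (first field) of every line that reports match_index=0.
stale_members() {
  awk '/match_index=0([^0-9]|$)/ { print $1 }'
}

# Example (inside the leader Pod):
#   ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | stale_members
```

Any ID printed this way is worth checking by hand against the node's own logs before kicking it.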
Kick the abnormal node out of the cluster (here e631, whose match_index is 0):
ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/kick OVN_Northbound e631
Back on the abnormal node, remove the corresponding database file (here it is moved to /tmp rather than deleted):
mv /etc/origin/ovn/ovnnb_db.db /tmp
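Delete the corresponding ovn-central Pod so that it restarts and recovers.
If ovn-central nodes cannot start or their databases are damaged to the point that a healthy majority can no longer be guaranteed, the ovn-central cluster can be recovered with the steps below.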
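The commands above cover the Northbound database only. A sketch of the same kick step for either database is given below. The Southbound control socket path `/var/run/ovn/ovnsb_db.ctl` is assumed by analogy with the Northbound one, and the function names and the `OVS_APPCTL` override variable are hypothetical, introduced only so the sketch can be dry-run.

```shell
#!/bin/sh
# Print the on-host database file for the given database.
db_file_for() {
  case "$1" in
    OVN_Northbound) echo /etc/origin/ovn/ovnnb_db.db ;;
    OVN_Southbound) echo /etc/origin/ovn/ovnsb_db.db ;;
    *) echo "unknown database: $1" >&2; return 1 ;;
  esac
}

# Kick a member out of the Raft cluster. Run this inside the leader Pod.
# Usage: kick_member OVN_Northbound e631
kick_member() {
  db=$1
  server_id=$2
  case "$db" in
    OVN_Northbound) ctl=/var/run/ovn/ovnnb_db.ctl ;;
    OVN_Southbound) ctl=/var/run/ovn/ovnsb_db.ctl ;;  # assumed path
    *) echo "unknown database: $db" >&2; return 1 ;;
  esac
  # OVS_APPCTL is a hypothetical override, e.g. OVS_APPCTL=echo for a dry run.
  "${OVS_APPCTL:-ovs-appctl}" -t "$ctl" cluster/kick "$db" "$server_id"
}
```

After the kick, the file printed by `db_file_for` is the one to move aside on the abnormal node before its Pod is deleted.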
- Record the current number of ovn-central replicas, then stop ovn-central to prevent further database changes
kubectl scale deployment -n kube-system ovn-central --replicas=0
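- Restore the database files on the node listed first in NODE_IPS. If the database files on that node are corrupted, copy the files from /etc/origin/ovn on another machine to the first machine, then run the following commands to restore the database files.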
docker run -it -v /etc/origin/ovn:/etc/ovn kubeovn/kube-ovn:v1.10.0 bash
cd /etc/ovn/
ovsdb-tool cluster-to-standalone ovnnb_db_standalone.db ovnnb_db.db
ovsdb-tool cluster-to-standalone ovnsb_db_standalone.db ovnsb_db.db
- Exit the container and move the clustered database files out of the way on every ovn-central node
mv /etc/origin/ovn/ovnnb_db.db /tmp
mv /etc/origin/ovn/ovnsb_db.db /tmp
- Put the converted database files in place on the first node
mv /etc/origin/ovn/ovnnb_db_standalone.db /etc/origin/ovn/ovnnb_db.db
mv /etc/origin/ovn/ovnsb_db_standalone.db /etc/origin/ovn/ovnsb_db.db
- Start the ovn-central containers again
kubectl scale deployment -n kube-system ovn-central --replicas={previous replica count}
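The first and last steps above depend on remembering the replica count, and the conversion step is easier to trust with a quick format check. The sketch below shows one possible way to do both. The function names and the `KUBECTL` and `OVSDB_TOOL` override variables are hypothetical. The check uses `ovsdb-tool db-is-standalone`, which should exit with status 0 for a standalone-format file; verify it against the ovsdb-tool version shipped in your image.

```shell
#!/bin/sh
# Print the replica count currently configured for ovn-central.
current_replicas() {
  "${KUBECTL:-kubectl}" get deployment -n kube-system ovn-central \
    -o jsonpath='{.spec.replicas}'
}

# Scale ovn-central to the given replica count.
scale_central() {
  "${KUBECTL:-kubectl}" scale deployment -n kube-system ovn-central --replicas="$1"
}

# Succeed only if the given file is a standalone-format OVSDB database.
# Run inside the container, e.g. on /etc/ovn/ovnnb_db_standalone.db.
is_standalone() {
  "${OVSDB_TOOL:-ovsdb-tool}" db-is-standalone "$1"
}

# Intended order of use (not executed here):
#   replicas=$(current_replicas)
#   scale_central 0
#   ... convert and move the database files as described above ...
#   scale_central "$replicas"
```

Saving the count in a variable before scaling to 0 avoids having to guess the value for the final scale command.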