BFD session is not starting from a bgp peer after restart frrouting #17751

Closed
2 tasks done
ne-vlezay80 opened this issue Jan 3, 2025 · 8 comments

Comments

@ne-vlezay80
Contributor

Description

A BFD session configured on a BGP peer does not start after FRRouting is restarted.

Version

FRRouting 10.2 (neigh) on Linux(6.1.0-28-amd64).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--prefix=/usr' '--localstatedir=/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--libdir=/usr/lib/frr' '--with-moduledir=/usr/lib/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'CC=gcc' 'CXX=g++' 'PYTHON=python3'

How to reproduce

  1. Configure BFD on a BGP peer
  2. Restart FRR
  3. Run sh bfd peer

Expected behavior

BFD sessions configured on BGP peers start again after FRRouting is restarted.

Actual behavior

The BFD session does not start after an FRRouting restart when it is configured on a BGP peer.

Additional context

BGP peer config:

neighbor 172.30.255.24 remote-as 4200220005
 neighbor 172.30.255.24 bfd
 neighbor 172.30.255.24 update-source 172.30.255.25
 neighbor 172.30.255.24 timers 1 4

BFD peer before the restart:

BFD Peers:
        peer 172.30.255.24 local-address 172.30.255.25 vrf default interface qt-swep0
                ID: 1005410387
                Remote ID: 2310190463
                Active mode
                Status: up
                Uptime: 8 minute(s), 32 second(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: dynamic
                RTT min/avg/max: 0/0/0 usec
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms
                        Echo transmission interval: disabled
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: 50ms

Restarting frr:

~ # Stopped watchfrr
Stopped ospfd
Stopped bgpd
Stopped staticd
Stopped bfdd
Stopped ospf6d
Stopped zebra
Stopped mgmtd
Starting watchfrr with command: '  /usr/lib/frr/watchfrr  -d  -F traditional   zebra mgmtd bgpd ospfd ospf6d staticd bfdd'
2025/01/03 04:06:26 WATCHFRR: [YDG3W-JND95] FD Limit set: 1048576 is stupidly large.  Is this what you intended?  Consider using --limit-fds also limiting size to 100000
Started watchfrr

[1]+  Done                       service frr restart

BFD peer list is empty after the restart:

neigh# show bfd peer
BFD Peers:
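
The symptom above (a neighbor with bfd configured but no matching entry in the BFD peer list) can be checked with a small script. This is only a minimal sketch and not part of FRR: the function names bfd_neighbors, bfd_sessions and missing_bfd_sessions are made up for illustration, and the parsing assumes the config and output formats shown in this report.

```shell
#!/bin/bash
# Hypothetical helpers (not part of FRR): report BGP neighbors that have
# "bfd" configured but have no session in the "sh bfd peer" output.

# Read config text on stdin, print addresses from "neighbor <addr> bfd" lines.
bfd_neighbors() {
    awk '$1 == "neighbor" && $3 == "bfd" { print $2 }' | sort -u
}

# Read "sh bfd peer" output on stdin, print addresses from "peer <addr> ..." lines.
bfd_sessions() {
    awk '$1 == "peer" { print $2 }' | sort -u
}

# missing_bfd_sessions <config-text> <bfd-peer-text>
# Prints every BFD-enabled neighbor that has no BFD session.
missing_bfd_sessions() {
    _bfd_have=$(printf '%s\n' "$2" | bfd_sessions)
    printf '%s\n' "$1" | bfd_neighbors | while read -r addr; do
        printf '%s\n' "$_bfd_have" | grep -qxF -- "$addr" || printf '%s\n' "$addr"
    done
}
```

On a live router it could be called as missing_bfd_sessions "$(vtysh -c 'sh ru bgpd')" "$(vtysh -c 'sh bfd peer')", using the same vtysh commands that appear in this report. Any address it prints is a neighbor affected by the problem described here.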

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@ne-vlezay80 ne-vlezay80 added the triage Needs further investigation label Jan 3, 2025
@ne-vlezay80
Contributor Author

Workaround script for the BFD bug:

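#!/bin/bash
# Workaround: remove and re-add "neighbor <addr> bfd" for every
# BFD-enabled BGP neighbor after FRR has been restarted.

# Addresses of all neighbors that have bfd configured in bgpd.
bfd_bgp_peers=$(vtysh -c 'sh ru bgpd'|grep bfd|grep neighbor|awk '{print $2}')

vtysh_cmd=("conf t" "router bgp 4200220004")
for a in ${bfd_bgp_peers}; do
        vtysh_cmd+=("no neighbor ${a} bfd")
        vtysh_cmd+=("neighbor ${a} bfd")
done

# Send the commands to vtysh, one per line.
printf '%s\n' "${vtysh_cmd[@]}"|vtysh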

@ne-vlezay80 ne-vlezay80 changed the title BFD is not starting from a bgp peer after restart frrouting BFD session is not starting from a bgp peer after restart frrouting Jan 3, 2025
@ne-vlezay80
Contributor Author

If the network interface used by the BFD peer is recreated, the BFD session goes to the init state. BFD packets are not being sent from my own host.

@ton31337
Member

ton31337 commented Jan 3, 2025

Could you show the logs?

debug bfd peer
debug bgp bfd

@ne-vlezay80
Contributor Author

ne-vlezay80 commented Jan 3, 2025

After recreating the interface:

~ # tail /var/log/messages
2025-01-03T07:52:59.150409+00:00 neigh zebra[1128]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (110[91/111]) into the kernel
2025-01-03T07:52:59.150414+00:00 neigh zebra[1128]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (113[]) into the kernel
2025-01-03T07:52:59.150416+00:00 neigh zebra[1128]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (112[93/113]) into the kernel
2025-01-03T07:52:59.156598+00:00 neigh bfdd[1133]: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025-01-03T07:53:00.100709+00:00 neigh bgpd[1138]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Socket not connected
2025-01-03T07:53:00.200776+00:00 neigh bgpd[1138]: [VHWDJ-C5186][EC 33554453] 172.30.255.24: No local IPv6 address, and zebra does not support V6 routing with v4 nexthops, BGP routing for V6 will not work
2025-01-03T07:53:00.200841+00:00 neigh bgpd[1138]: [MVZKX-EG443][EC 33554452] bgp_process_packet: BGP OPEN receipt failed for peer: 172.30.255.24
2025-01-03T07:53:02.398580+00:00 neigh bgpd[1138]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 172.30.255.24 in vrf default
2025-01-03T07:53:02.398586+00:00 neigh bgpd[1138]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv6 Unicast from 172.30.255.24 in vrf default
2025-01-03T07:53:33.485364+00:00 neigh : -- MARK --

Restart:

2025-01-03T05:33:21.958492+00:00 neigh bfdd[1085]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2025-01-03T05:33:21.977279+00:00 neigh watchfrr[1067]: [QDG3Y-BY5TN] bfdd state -> up : connect succeeded
2025-01-03T05:53:29.903708+00:00 neigh zebra[1080]: [N5M5Y-J5BPG][EC 4043309121] Client 'bfd' (session id 0) encountered an error and is shutting down.
2025-01-03T05:53:29.904335+00:00 neigh zebra[1080]: [JPSA8-5KYEA] client 29 disconnected 0 bfd routes removed from the rib
2025-01-03T05:53:29.904356+00:00 neigh zebra[1080]: [S929C-NZR3N] client 29 disconnected 0 bfd nhgs removed from the rib
2025-01-03T05:53:29.925867+00:00 neigh bgpd[1090]: [MNSF9-KVB43] _bfd_sess_send: BFD session 172.30.255.25 -> 172.30.255.24 interface qt-swep0 VRF default(0) was not uninstalled
2025-01-03T05:53:33.136261+00:00 neigh watchfrr[424]: [ZCJ3S-SPH5S] bfdd state -> down : initial connection attempt failed
2025-01-03T05:53:33.349022+00:00 neigh bfdd[442]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2025-01-03T05:53:33.368379+00:00 neigh watchfrr[424]: [QDG3Y-BY5TN] bfdd state -> up : connect succeeded
2025-01-03T06:40:54.723122+00:00 neigh /etc/init.d/fixbfdbug[963]: start-stop-daemon: no matching processes found
2025-01-03T06:40:54.898743+00:00 neigh zebra[437]: [N5M5Y-J5BPG][EC 4043309121] Client 'bfd' (session id 0) encountered an error and is shutting down.
2025-01-03T06:40:54.900728+00:00 neigh zebra[437]: [JPSA8-5KYEA] client 29 disconnected 0 bfd routes removed from the rib
2025-01-03T06:40:54.900758+00:00 neigh zebra[437]: [S929C-NZR3N] client 29 disconnected 0 bfd nhgs removed from the rib
2025-01-03T06:40:54.934614+00:00 neigh bgpd[447]: [MNSF9-KVB43] _bfd_sess_send: BFD session 172.30.255.25 -> 172.30.255.24 interface qt-swep0 VRF default(0) was not uninstalled
2025-01-03T06:40:55.848078+00:00 neigh watchfrr[1115]: [ZCJ3S-SPH5S] bfdd state -> down : initial connection attempt failed
2025-01-03T06:40:56.264772+00:00 neigh bfdd[1133]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2025-01-03T06:40:56.321655+00:00 neigh watchfrr[1115]: [QDG3Y-BY5TN] bfdd state -> up : connect succeeded
2025-01-03T07:50:16.405974+00:00 neigh bfdd[1133]: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025-01-03T07:50:16.413629+00:00 neigh bfdd[1133]: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025-01-03T07:52:59.147224+00:00 neigh bfdd[1133]: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025-01-03T07:52:59.156598+00:00 neigh bfdd[1133]: [S5HNB-1XW3Z] ipv4-new: failed to bind port: Address not available
2025-01-03T07:56:48.424368+00:00 neigh /etc/init.d/fixbfdbug[1414]: start-stop-daemon: no matching processes found
2025-01-03T07:56:48.566194+00:00 neigh bgpd[1138]: [MNSF9-KVB43] _bfd_sess_send: BFD session 172.30.255.25 -> 172.30.255.24 interface qt-swep0 VRF default(0) was not uninstalled
2025-01-03T07:56:48.584012+00:00 neigh zebra[1128]: [N5M5Y-J5BPG][EC 4043309121] Client 'bfd' (session id 0) encountered an error and is shutting down.
2025-01-03T07:56:48.588317+00:00 neigh zebra[1128]: [JPSA8-5KYEA] client 29 disconnected 0 bfd routes removed from the rib
2025-01-03T07:56:48.588349+00:00 neigh zebra[1128]: [S929C-NZR3N] client 29 disconnected 0 bfd nhgs removed from the rib
2025-01-03T07:56:49.530674+00:00 neigh watchfrr[1572]: [ZCJ3S-SPH5S] bfdd state -> down : initial connection attempt failed
2025-01-03T07:56:49.747229+00:00 neigh bfdd[1590]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2025-01-03T07:56:49.764947+00:00 neigh watchfrr[1572]: [QDG3Y-BY5TN] bfdd state -> up : connect succeeded
2025-01-03T07:58:32.043066+00:00 neigh /etc/init.d/fixbfdbug[1699]: start-stop-daemon: no matching processes found
2025-01-03T07:58:32.177015+00:00 neigh zebra[1585]: [N5M5Y-J5BPG][EC 4043309121] Client 'bfd' (session id 0) encountered an error and is shutting down.
2025-01-03T07:58:32.179758+00:00 neigh zebra[1585]: [JPSA8-5KYEA] client 29 disconnected 0 bfd routes removed from the rib
2025-01-03T07:58:32.179792+00:00 neigh zebra[1585]: [S929C-NZR3N] client 29 disconnected 0 bfd nhgs removed from the rib
2025-01-03T07:58:32.197491+00:00 neigh bgpd[1595]: [MNSF9-KVB43] _bfd_sess_send: BFD session 172.30.255.25 -> 172.30.255.24 interface qt-swep0 VRF default(0) was not uninstalled
2025-01-03T07:58:33.157461+00:00 neigh watchfrr[1857]: [ZCJ3S-SPH5S] bfdd state -> down : initial connection attempt failed
2025-01-03T07:58:33.372598+00:00 neigh bfdd[1875]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
2025-01-03T07:58:33.386196+00:00 neigh watchfrr[1857]: [QDG3Y-BY5TN] bfdd state -> up : connect succeeded

show debugging:

MGMT debugging status:

Zebra debugging status:

OSPF debugging status:

OSPF6 debugging status:

BGP debugging status:
  BGP BFD library debugging is on


Staticd debugging status

BFD debugging status:
  Peer events debugging is on.
  Zebra events debugging is on.
  Network layer debugging is on.

BFD BGP Logs:

~ # cat  /tmp/frr.log|grep bfd
2025/01/03 08:17:07 BGP: [Z4783-22SWG] zclient_bfd_command: Suppressing BFD peer reg/dereg messages
2025/01/03 08:17:07 BGP: [MNSF9-KVB43] _bfd_sess_send: BFD session 172.30.255.25 -> 172.30.255.24 interface qt-swep0 VRF default(0) was not uninstalled
2025/01/03 08:17:07 ZEBRA: [N5M5Y-J5BPG][EC 4043309121] Client 'bfd' (session id 0) encountered an error and is shutting down.
2025/01/03 08:17:07 ZEBRA: [JPSA8-5KYEA] client 29 disconnected 0 bfd routes removed from the rib
2025/01/03 08:17:07 ZEBRA: [S929C-NZR3N] client 29 disconnected 0 bfd nhgs removed from the rib
2025/01/03 08:17:08 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 0.0.0.0/32 -> 172.30.255.24/32 VRF default(0) (CPI bit no): Down
2025/01/03 08:17:08 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.30.255.24 vrf default(0) bfd state Unknown -> Down
2025/01/03 08:17:08 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
2025/01/03 08:17:08 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 0.0.0.0/32 -> 172.30.255.24/32 VRF default(0) (CPI bit no): Down
2025/01/03 08:17:08 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.30.255.24 vrf default(0) bfd state Unknown -> Down
2025/01/03 08:17:08 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
2025/01/03 08:17:12 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 0.0.0.0/32 -> 172.30.255.24/32 (interface qt-swep0) VRF default(0) (CPI bit no): Up
2025/01/03 08:17:12 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.30.255.24 vrf default(0) bfd state Down -> Up
2025/01/03 08:17:12 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
2025/01/03 08:17:12 BGP: [VQY9X-CQZKG] bgp_peer_bfd_update_source: address [0.0.0.0->172.30.255.24] to [172.30.255.25->172.30.255.24]
2025/01/03 08:17:12 BGP: [X8BD9-8RKN4] bgp_peer_bfd_update_source: interface none to qt-swep0
2025/01/03 08:17:12 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 172.30.255.25/32 -> 172.30.255.24/32 (interface qt-swep0) VRF default(0) (CPI bit no): Down
2025/01/03 08:17:12 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.30.255.24 vrf default(0) bfd state Up -> Down
2025/01/03 08:17:12 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
2025/01/03 08:17:13 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 172.30.255.25/32 -> 172.30.255.24/32 (interface qt-swep0) VRF default(0) (CPI bit no): Up
2025/01/03 08:17:13 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.30.255.24 vrf default(0) bfd state Down -> Up
2025/01/03 08:17:13 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
2025/01/03 08:27:15 BGP: [Z4783-22SWG] zclient_bfd_command: Suppressing BFD peer reg/dereg messages
2025/01/03 08:27:15 BGP: [MNSF9-KVB43] _bfd_sess_send: BFD session 172.30.255.25 -> 172.30.255.24 interface qt-swep0 VRF default(0) was not uninstalled
2025/01/03 08:27:15 ZEBRA: [JPSA8-5KYEA] client 29 disconnected 0 bfd routes removed from the rib
2025/01/03 08:27:15 ZEBRA: [S929C-NZR3N] client 29 disconnected 0 bfd nhgs removed from the rib
2025/01/03 08:27:17 BGP: [Q4BCV-6FHZ5] zclient_bfd_session_update: 0.0.0.0/32 -> 172.30.255.24/32 VRF default(0) (CPI bit no): Down
2025/01/03 08:27:17 BGP: [MKVHZ-7MS3V] bfd_session_status_update: neighbor 172.30.255.24 vrf default(0) bfd state Unknown -> Down
2025/01/03 08:27:17 BGP: [QFMSE-NPSNN] zclient_bfd_session_update:   sessions updated: 1
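
To make the state changes in the log above easier to follow, the bfd_session_status_update lines can be reduced to one short line per transition. This is a rough sketch under the assumption that the log lines look exactly like the ones pasted here; the function name bfd_transitions is hypothetical and not an FRR tool.

```shell
#!/bin/bash
# Hypothetical helper (not part of FRR): print one line per BFD state change
# logged by bgpd, in the form "<date> <time> <neighbor> <old> -> <new>".
# Reads the files given as arguments, or stdin when called without arguments.
bfd_transitions() {
    sed -n -E \
        's/^([0-9\/]+ [0-9:]+) BGP: .*bfd_session_status_update: neighbor ([^ ]+) .* bfd state ([A-Za-z]+) -> ([A-Za-z]+).*$/\1 \2 \3 -> \4/p' \
        "$@"
}
```

Running bfd_transitions /tmp/frr.log on the log from this comment would keep only the Unknown -> Down, Down -> Up and Up -> Down changes for 172.30.255.24 and drop the zclient and zebra lines around them.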

Config:

router bgp 4200220004
 neighbor 172.30.255.24 remote-as 4200220005
 neighbor 172.30.255.24 bfd
 neighbor 172.30.255.24 update-source qt-swep0
 neighbor 172.30.255.24 timers 1 4

@ton31337
Member

ton31337 commented Jan 3, 2025

Is this happening only with 10.2, with 10.1 all good?

@ne-vlezay80
Contributor Author

ne-vlezay80 commented Jan 3, 2025

> Is this happening only with 10.2, with 10.1 all good?

10.2 is bad, 10.0 is good

@ton31337 ton31337 added bgp bfd and removed triage Needs further investigation labels Jan 3, 2025
@ton31337
Member

ton31337 commented Jan 3, 2025

Found the issue.

@ton31337
Member

ton31337 commented Jan 3, 2025

This backport #17754 should fix the issue.

@ton31337 ton31337 closed this as completed Jan 6, 2025

2 participants