Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BMP Adj-RIB-Out (RFC8671), ECMP Support and Loc-RIB missing features #14847

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

mxyns
Copy link
Contributor

@mxyns mxyns commented Nov 21, 2023

Global:

  • added peer type for all monitored ribs
  • added peer distinguisher for all monitored ribs
  • include mpls labels (when possible, no adj-in-pre)

Add support for Adj-RIB-Out monitoring in BMP:

  • pre- and post-policy
  • config knobs
  • sync
  • incremental updates
  • stats (when possible)
  • no timestamp (currently not recorded in adj-out)

Add ECMP support in BMP:

  • include addpath id in bmp in adj-in pre/post, loc-rib and adj-out pre/post
  • monitoring on all ecmp selected path (BGP_PATH_SELECTED | BGP_PATH_MULTIPATH)

Add Loc-RIB missing features:

  • peer up/down on vrf state update
  • missing stats

Missing:

  • ?

@frrbot frrbot bot added the bgp label Nov 21, 2023
@frrbot frrbot bot added the tests Topotests, make check, etc label Nov 24, 2023
@github-actions github-actions bot added rebase PR needs rebase labels Nov 24, 2023
@mxyns mxyns force-pushed the pr-bmp-riboutmon branch 4 times, most recently from 98a978c to c4572c5 Compare November 27, 2023 16:46
@frrbot frrbot bot added the documentation label Nov 28, 2023
@mxyns mxyns force-pushed the pr-bmp-riboutmon branch 5 times, most recently from 7ab2e3a to 4234476 Compare November 28, 2023 17:31
@mxyns
Copy link
Contributor Author

mxyns commented Nov 28, 2023

ci:rerun

Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@mxyns mxyns force-pushed the pr-bmp-riboutmon branch 2 times, most recently from 2597d3f to 4e8dca8 Compare November 29, 2023 10:17
@mxyns mxyns force-pushed the pr-bmp-riboutmon branch 3 times, most recently from 72ca5f7 to c0c8458 Compare December 1, 2023 15:35
@mxyns mxyns marked this pull request as ready for review December 2, 2023 10:58
@mxyns mxyns changed the title (WIP) BMP Adj-RIB-Out (RFC8671), ECMP Support and Loc-RIB missing features BMP Adj-RIB-Out (RFC8671), ECMP Support and Loc-RIB missing features Dec 2, 2023
@riw777 riw777 self-requested a review December 5, 2023 16:46
@ton31337 ton31337 self-requested a review December 5, 2023 16:52
@riw777
Copy link
Member

riw777 commented Aug 14, 2024

I guess we need to fix the conflicts now before we can get back to the failures ... rebasing might help with the ci failures anyway

Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@ton31337
Copy link
Member

@mxyns could you rebase and force push?

bgpd: changed bmp config to include bmp monitor rib-out
bgpd: added rib-out pre-policy & post-policy monitoring
bgpd: bmp sync for rib-out pre and post
bgpd: added missing out-pre hook bump bmp sessions on bmp reconfigure to run sync immediately
lib, bgpd: added pullwr_timeout, added bmp startup-delay command
bgpd: bmp added bmp session state to bmp show
doc: bmp updated documentation with new commands
bgpd: bmp add rib-out post-policy stats
bgpd: bmp add adj-rib-in stats
bgpd: bmp add loc-rib stats

bgpd: bmp multipath support for adj-rib-in pre-policy
bgpd: bmp multipath support for adj-rib-out post-policy
bgpd: bmp multipath support for adj-rib-in post-policy
bgpd: bmp multipath support for loc-rib
bgpd: bmp multipath support for rib-out-pre
bgpd: bmp add multipath syncing, fix startup-delay cmd
bgpd: bmp simplify bpi locking system, add show bmp locked
bgpd: remove locked path in bqe, fix withdraw flag, fix lbpi hash & cmp
bgpd: no addpath id in mpls vpn case
bgpd: peer types and peer distinguisher in all bmp messages

Signed-off-by: Maxence Younsi <[email protected]>
@github-actions github-actions bot removed the conflicts label Nov 2, 2024
@mxyns mxyns force-pushed the pr-bmp-riboutmon branch 3 times, most recently from 77b979d to 5b95fbb Compare November 3, 2024 10:42
@frrbot frrbot bot added the bugfix label Nov 3, 2024
@mxyns mxyns force-pushed the pr-bmp-riboutmon branch 2 times, most recently from 9b939ce to 16a4d73 Compare November 8, 2024 23:50
mxyns and others added 9 commits November 9, 2024 05:39
add multipath test and rename ribs
add rd statement to vrf configuration
now bmp cannot export information about a VRF without the RD being set (it is required for RD Instance Peer to have a Peer Distinguisher)

change expected output for bgp_bmp tests
also add dashes in ADJ_[IN|OUT]_[PRE|POST]_POLICY constants to have spaceless file names

add add-path to nlri parsing
add bmpserver exception printing
add 3rd peer in topotest for ecmp

Signed-off-by: Maxence Younsi <[email protected]>
fixed nits & styling issues
fixed bad conditions in bmp_bpi_(un)lock
added bgp_peer_get_send_holdtime
added bgp_peer_get_local_as

Signed-off-by: Maxence Younsi <[email protected]>
Signed-off-by: Maxence Younsi <[email protected]>
check for null stream in bmp_send_peerup_vrf (which can happen when RD is not found)

use bmp_send_all_safe instead of bmp_send_all in bmp_vrf_itf_state_changed

Signed-off-by: Maxence Younsi <[email protected]>
Signed-off-by: Maxence Younsi <[email protected]>
Signed-off-by: Maxence Younsi <[email protected]>
@mxyns
Copy link
Contributor Author

mxyns commented Nov 11, 2024

hi @riw777 @ton31337
i recently had a bit of time to rebase, fix the tests and issues. the only test failing now is unrelated apparently (missing some python packet on cleanup).
there is commit (9d795cb) in this PR that adds a safe guard to bgp_mpath_diff_insert but I believe there is an underlying issue that I put in an issue: #17377 so i can remove that commit if the issue is addressed there.
I will apply part of the styling from frrbot now. I think after that this should be good to merge, right?

Signed-off-by: Maxence Younsi <[email protected]>
@riw777
Copy link
Member

riw777 commented Nov 12, 2024

I don't think the lint issues are a problem ... we need to figure out the ci though. :-(

@mxyns
Copy link
Contributor Author

mxyns commented Nov 13, 2024

@riw777
ok for the lints

for the CI, i sadly can't fix it myself, no test runs on my machine using ASAN and everything else passes on my machine. do you know anybody that could have a look into it?

the bmp test fails only in the address sanitizer part not the normal one so maybe it is waiting time that is too low? there's also a leak found by asan which is related to "rfapiBiStartWithdrawTimer", not bmp so idk what to do?

bmp asan leak found
=================================================================
==28602==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7f1cff6b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x7f1cff22f096 in qcalloc lib/memory.c:106
    #2 0x55fa6e7980c2 in rfapiBiStartWithdrawTimer bgpd/rfapi/rfapi_import.c:2753
    #3 0x55fa6e79e437 in rfapiProcessPeerDownRt bgpd/rfapi/rfapi_import.c:4070
    #4 0x55fa6e79e6fb in rfapiProcessPeerDown bgpd/rfapi/rfapi_import.c:4125
    #5 0x55fa6e64b331 in bgp_clear_route_all bgpd/bgp_route.c:6267
    #6 0x55fa6e5bdcfb in bgp_fsm_change_status bgpd/bgp_fsm.c:1252
    #7 0x55fa6e7505e6 in peer_delete bgpd/bgpd.c:2777
    #8 0x55fa6e756116 in bgp_delete bgpd/bgpd.c:4142
    #9 0x55fa6e510f5e in bgp_exit bgpd/bgp_main.c:204
    #10 0x55fa6e510f5e in sigint bgpd/bgp_main.c:162
    #11 0x7f1cff2b01d4 in frr_sigevent_process lib/sigevent.c:117
    #12 0x7f1cff2dbc42 in event_fetch lib/event.c:1767
    #13 0x7f1cff20ffe7 in frr_run lib/libfrr.c:1231
    #14 0x55fa6e5128f5 in main bgpd/bgp_main.c:555
    #15 0x7f1cfed0c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7f1cff6b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x7f1cff22f096 in qcalloc lib/memory.c:106
    #2 0x55fa6e7980c2 in rfapiBiStartWithdrawTimer bgpd/rfapi/rfapi_import.c:2753
    #3 0x55fa6e79e437 in rfapiProcessPeerDownRt bgpd/rfapi/rfapi_import.c:4070
    #4 0x55fa6e79e6d2 in rfapiProcessPeerDown bgpd/rfapi/rfapi_import.c:4124
    #5 0x55fa6e64b331 in bgp_clear_route_all bgpd/bgp_route.c:6267
    #6 0x55fa6e5bdcfb in bgp_fsm_change_status bgpd/bgp_fsm.c:1252
    #7 0x55fa6e7505e6 in peer_delete bgpd/bgpd.c:2777
    #8 0x55fa6e756116 in bgp_delete bgpd/bgpd.c:4142
    #9 0x55fa6e510f5e in bgp_exit bgpd/bgp_main.c:204
    #10 0x55fa6e510f5e in sigint bgpd/bgp_main.c:162
    #11 0x7f1cff2b01d4 in frr_sigevent_process lib/sigevent.c:117
    #12 0x7f1cff2dbc42 in event_fetch lib/event.c:1767
    #13 0x7f1cff20ffe7 in frr_run lib/libfrr.c:1231
    #14 0x55fa6e5128f5 in main bgpd/bgp_main.c:555
    #15 0x7f1cfed0c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 64 byte(s) leaked in 2 allocation(s).

the other fail is also rfapi related but this time it's a cleanup test that fails. there's no error log that i found in this test and it only fails on a platform i can't test on.

****************************************************************************** Test Target Summary Pass Fail ****************************************************************************** FILE: scripts/add_routes.py 1 r1 Opened RFAPI 1 0 2 r1 Clean query 1 0 3 r1 Local registration +0.01 secs 1 0 4 r1 Query self 1 0 5 r3 Opened RFAPI 1 0 6 r3 Local registration +0.02 secs 1 0 7 r3 Self excluded 1 0 8 r3 Opened query only RFAPI 1 0 9 r3 See local 1 0 10 r4 Opened RFAPI 1 0 11 r4 Local registration +0.01 secs 1 0 12 r4 Query self 1 0 13 r4 Local registration +0.01 secs 1 0 14 r4 Query self MP 1 0 FILE: scripts/adjacencies.py 15 r1 PE->P2 (loopback) ping +14.65 secs 1 0 16 r3 PE->P2 (loopback) ping +0.00 secs 1 0 17 r4 PE->P2 (loopback) ping +0.01 secs 1 0 18 r2 Core adjacencies up +0.02 secs 1 0 19 r1 All adjacencies up +0.02 secs 1 0 20 r3 All adjacencies up +0.02 secs 1 0 21 r4 All adjacencies up +0.02 secs 1 0 22 r1 PE->PE3 (loopback) ping +0.00 secs 1 0 23 r1 PE->PE4 (loopback) ping +0.00 secs 1 0 FILE: scripts/check_routes.py 24 r1 See all registrations +1.06 secs 1 0 25 r3 See all registrations +0.01 secs 1 0 26 r4 See all registrations +0.02 secs 1 0 27 r1 VPN SAFI okay 1 0 28 r2 VPN SAFI okay 1 0 29 r3 VPN SAFI okay 1 0 30 r4 VPN SAFI okay 1 0 31 r1 Query R2s info 1 0 32 r1 Query R4s info 1 0 33 r3 Query R1s+R4s info 1 0 34 r3 Query R4s info 1 0 35 r4 Query R1s+R4s info 1 0 36 r4 Query R2s info 1 0 FILE: scripts/check_close.py 37 r1 Opened RFAPI 1 0 38 r1 Local registration +0.01 secs 1 0 39 r3 See registration +0.53 secs 1 0 40 r4 See registration +0.02 secs 1 0 41 r1 Closed RFAPI 1 0 42 r1 See cleanup +0.02 secs 1 0 43 r3 See cleanup +0.54 secs 1 0 44 r4 See cleanup +0.02 secs 1 0 45 r1 Out of holddown +0.02 secs 1 0 46 r3 Out of holddown +0.01 secs 1 0 47 r4 Out of holddown +0.01 secs 1 0 FILE: scripts/check_timeout.py 48 r1 Holddown factor not set -- skipping test 1 0 FILE: scripts/cleanup_all.py 49 r1 Local registration removed +0.02 secs 1 0 50 r1 Closed RFAPI 1 0 51 r3 Local registration removed +0.01 secs 1 0 52 r3 Closed RFAPI 1 0 53 r4 Local registration removed +0.02 secs 1 0 54 r4 Cleared NVEs 1 0 55 r1 All registrations cleared +0.01 secs 1 0 56 r3 All registrations cleared +0.01 secs 1 0 57 r4 All registrations cleared +0.01 secs 1 0 58 r1 VPN SAFI clear 0 1 59 r2 VPN SAFI clear 1 0 60 r3 VPN SAFI clear 1 0 61 r4 VPN SAFI clear 1 0 62 r1 No holddowns +0.01 secs 1 0 63 r3 No holddowns +0.01 secs 1 0 64 r4 No holddowns +0.01 secs 1 0 ****************************************************************************** Total 64 63 1 ******************************************************************************

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants