10/30/2020

BFD flaps in scale environment

 In a lab scenario, say between two routers, there are hundreds of subinterfaces, each carrying a BGP session with BFD. BFD flaps are then seen. The scale information is as follows:

  • 256 subinterfaces
  • 256 eBGP sessions enabled with BFD
  • BFD timers are 50ms x 3
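
The numbers above imply a per-session detection time and an aggregate packet rate that are worth writing down. As a back-of-envelope sketch (the constants are taken from the scale list above):

```python
# Back-of-envelope math for the scale described above: with 50 ms
# intervals and a detect multiplier of 3, each session's detection
# time is 150 ms, and 256 sessions each transmitting every 50 ms
# generate an aggregate of 5120 BFD control packets per second.

TX_INTERVAL_MS = 50   # BFD tx/rx interval from the lab setup
MULTIPLIER = 3        # BFD detect multiplier
SESSIONS = 256        # one eBGP + BFD session per subinterface

detection_time_ms = TX_INTERVAL_MS * MULTIPLIER
aggregate_pps = SESSIONS * (1000 // TX_INTERVAL_MS)

print(f"Detection time per session: {detection_time_ms} ms")
print(f"Aggregate BFD packet rate:  {aggregate_pps} pps")
```

So the peer must absorb and process thousands of BFD packets per second, and any path that drops them for more than 150 ms will flap a session.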

Addressing           Type                         Up        Init        Down    AdminDown
-------------------- -------------------- ------------- ----------- ----------- ---------
All                  All                     239 [0]       8 [0]       9 [0]        0 [0]
IPv4                 All                     239 [0]       8 [0]       9 [0]        0 [0]
    single hop       All                     239 [0]       8 [0]       9 [0]        0 [0]
                     normal                  239 [0]       8 [0]       9 [0]        0 [0]

From the above output, you can see 17 of 256 sessions are not up. HW BFD is enabled (by default in EOS), but it does not help.

ghb289#show bfd hardware utilization
Chip Name          Number Of HW Sessions*    Maximum Number Of HW Sessions*
--------------- ---------------------------- ------------------------------
Jericho0                                0                               200
Jericho1                              128                               200
Jericho2                              128                               200

"show cpu counter queue" indicates high # of drop of CoppSystemBfd class, which means the receiving BFD packets exceeds the b/w limit of Copp.

ghb289#sh cpu counters queue |nz |grep -i bfd
CoppSystemBfd              Et16/2                5584344          390904080                  0                  0
CoppSystemBfd              Et51/2              783529888        57981211712            3985376          294917824

After increasing the shape/bandwidth of copp-system-bfd, there are no BFD flaps anymore.
  Class-map: copp-system-bfd (match-any)
       shape : 25000 kbps
       bandwidth : 2500 kbps
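
A rough estimate shows why the aggregate BFD rate can overwhelm the class: assuming an on-wire BFD control packet of roughly 70 bytes (24 B BFD control plus UDP/IPv4/Ethernet overhead, which is an assumption, not a measured value), 256 sessions at 50 ms intervals work out to close to 2.9 Mbps, which sits above the 2500 kbps bandwidth guarantee shown in the class-map but comfortably within the 25000 kbps shape:

```python
# Estimate aggregate BFD control-plane load and compare it against the
# copp-system-bfd values shown in the class-map output above.
# PKT_BYTES is an assumed on-wire size, not a measured number.

SESSIONS = 256
TX_INTERVAL_S = 0.050      # 50 ms transmit interval per session
PKT_BYTES = 70             # assumed on-wire BFD packet size
BANDWIDTH_KBPS = 2500      # class-map bandwidth guarantee
SHAPE_KBPS = 25000         # class-map shape (hard cap)

pps = SESSIONS / TX_INTERVAL_S                # 5120 packets/s
rate_kbps = pps * PKT_BYTES * 8 / 1000        # ~2867 kbps

print(f"Estimated aggregate BFD rate: {rate_kbps:.0f} kbps")
print(f"Above {BANDWIDTH_KBPS} kbps bandwidth: {rate_kbps > BANDWIDTH_KBPS}")
print(f"Within {SHAPE_KBPS} kbps shape:        {rate_kbps < SHAPE_KBPS}")
```

This matches the observed behavior: with a tighter limit the class drops packets and sessions flap; once the shape/bandwidth is raised, the steady-state BFD rate fits and the flaps stop.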
