3/09/2018

Arista EOS - BGP, update wait-for-convergence

Feature Name: 
BGP Update Wait-for-convergence

Purpose: 
"basically prevents BGP from programming routes into hardware and from advertising routes until a convergence event is resolved". The benefit of this feature is reduced CPU and hardware-programming churn during a convergence event.

Reference:
http://aspiringnetworker.blogspot.com/2015/08/bgp-in-arista-data-center_90.html

Use case: 
For example, a spine router may have 128-way ECMP and multiple BGP sessions to neighbor routers. During a convergence event it can receive routing information piece by piece, depending on how fast and in what order the underlying physical interfaces come up. This feature holds BGP prefix programming until the control plane has converged. 

What counts as a convergence event:
  • Router reload, or BGP starting for the first time
  • BGP clear
  • Rib agent restart, etc.

How does it work?
  • Once BGP enters the convergence state, it exits on the first of 3 conditions:
    • ALL BGP peers have converged
    • Convergence timeout expires - default 5 min
    • Slow peer timeout expires - default 1 min 30 sec
  • A BGP peer is considered converged if:
    • The neighbor is established, and
    • An End-of-RIB has been received from it, or a BGP keepalive (KA) if graceful restart (GR) is not enabled
  • The following cases slow down or stall the convergence process, which is why the 2 timers are needed:
    • A neighbor has a big table to send
    • There is a dead neighbor: configured but never up
    • Dynamic peers are configured
  • Slow peer timeout:
    • Default is 1 min 30 sec, counted from when the first BGP neighbor comes up
  • BGP convergence timeout:
    • Default is 5 min
    • Can be changed with "bgp convergence time xxx (sec)"
R2.15:01:44(config-router-bgp)#sh bgp convergence
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:41 ago
Bgp convergence state : Timeout reached
   Time taken to converge 00:01:30
   First peer came up 00:01:41 ago
   Pending Peers:          1
       Total Peers:        3
       Established Peers:  2
       Disabled Peers:     0
   Peers that did not converge before local bgp convergence:
       IPv4 peers:
           24.24.24.4            (Session : Active)
       IPv6 peers:
           None
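
The exit conditions above can be sketched in a few lines of Python. This is only a literal reading of the notes, not how EOS implements it: the Peer class, the exit_reason function and the first two peer addresses are made up for illustration, and the exact interplay between the two timers is not modeled beyond what the notes say.

```python
from dataclasses import dataclass
from typing import List, Optional

CONVERGENCE_TIMEOUT = 300  # seconds, default 5 min per the notes above
SLOW_PEER_TIMEOUT = 90     # seconds, default 1 min 30 sec per the notes above


@dataclass
class Peer:
    """Hypothetical model of one configured BGP neighbor."""
    address: str
    established: bool
    got_eor_or_keepalive: bool  # End-of-RIB, or keepalive when GR is not enabled

    def converged(self) -> bool:
        return self.established and self.got_eor_or_keepalive


def exit_reason(peers: List[Peer],
                since_event: float,
                since_first_peer_up: Optional[float]) -> Optional[str]:
    """Return why the convergence state ends, or None while still pending."""
    if peers and all(p.converged() for p in peers):
        return "all peers converged"
    # The slow peer timer only starts once the first neighbor is up.
    if since_first_peer_up is not None and since_first_peer_up >= SLOW_PEER_TIMEOUT:
        return "slow peer timeout"
    if since_event >= CONVERGENCE_TIMEOUT:
        return "convergence timeout"
    return None


peers = [
    Peer("12.12.12.1", True, True),
    Peer("23.23.23.3", True, True),
    Peer("24.24.24.4", False, False),  # dead neighbor, session stuck in Active
]
print(exit_reason(peers, since_event=100, since_first_peer_up=89))  # None
print(exit_reason(peers, since_event=101, since_first_peer_up=90))  # slow peer timeout
```

With one dead neighbor the sketch stays pending until 1 min 30 sec after the first peer came up, which matches the "Time taken to converge 00:01:30" line in the output above.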


Limitation:
While this feature makes a lot of sense on spines, DO NOT enable it on TOR/leaf or first-hop switches. 

For example:
       [ebgp]
       /     \(uplink)
 [mlagA]=====[mlagB]
       \     /
     [server pool]


Take the common MLAG setup above:
  1. mlagB reboots, so its uplink, downlinks and peer-link all go down.
  2. The peer-link between mlagA and mlagB comes up first (quickly, before the MLAG reload delay expires).
  3. iBGP between mlagA and mlagB comes up, and B receives routes from A.
  4. At this point the eBGP session over the uplink is still down, so BGP has not converged and these iBGP prefixes are held!!
  5. The downlinks (MLAG port-channels) come up and the servers start to forward traffic to B, but the routes are not programmed yet, so all of that traffic is dropped on the floor.
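
To put rough numbers on the steps above, here is a minimal Python sketch of the drop window on mlagB. Every timestamp is hypothetical, and it assumes the worst case where the held prefixes are only released when the slow peer timeout fires; the blackhole_window helper is made up for illustration.

```python
def blackhole_window(downlink_up_at: float, routes_installed_at: float) -> float:
    """Seconds during which servers can send traffic to a switch with no routes."""
    return max(0.0, float(routes_installed_at - downlink_up_at))


# All times are hypothetical, in seconds after mlagB finishes booting.
ibgp_up_at = 20         # step 3: iBGP over the peer-link comes up first
slow_peer_timeout = 90  # default slow peer timeout from the notes above
downlink_up_at = 60     # step 5: MLAG port-channels start carrying traffic

# Step 4: with wait-for-convergence the eBGP uplink peer is still down, so the
# routes learned over iBGP stay held until the slow peer timer fires.
with_feature = blackhole_window(downlink_up_at, ibgp_up_at + slow_peer_timeout)

# Without the feature, routes are assumed to be installed as soon as they are
# learned over iBGP.
without_feature = blackhole_window(downlink_up_at, ibgp_up_at)

print(with_feature, without_feature)  # 50.0 0.0
```
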
Sample configuration:
router bgp 65500
   update wait-for-convergence
   update wait-install

Sample Topology:

[R1] <----1.1.1.0/24 prefix
 |
 | (ebgp)
 |
[R2]--(ibgp)--- [R4] <--- dead neighbor
 |
 | (ebgp)
 |
[R3]

1) Shut down the iBGP session between R2 and R4 to simulate a "dead" neighbor

! BGP prefix 1.1.1.0/24 is received and a best path is selected
R2.15:25:34(config-router-bgp)#sh ip bgp 1.1.1.0
BGP routing table information for VRF default
Router identifier 110.255.255.1, local AS number 2
BGP routing table entry for 1.1.1.0/24
 Paths: 1 available
  1
    12.12.12.1 from 12.12.12.1 (130.255.255.100)
      Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:01:55 ago, valid, external, best
      Rx SAFI: Unicast

! 1.1.1.0/24 is in the RIB
R2.15:25:49(config-router-bgp)#sh ip route 1.1.1.0
 B E    1.1.1.0/24 [200/0] via 12.12.12.1, Ethernet3/36/1

2) Now do a hard clear on R2

R2.15:26:58(config-router-bgp)#clear ip bgp *
! Clearing all IPv4 and IPv6 peering sessions

! BGP sessions are up except for the dead neighbor, 24.24.24.4
R2.15:27:06(config-router-bgp)#bas
BGP summary information for VRF default
Router identifier 110.255.255.1, local AS number 2
Neighbor Status Codes: m - Under maintenance
  Neighbor         V  AS           MsgRcvd   MsgSent  InQ OutQ  Up/Down State  PfxRcd PfxAcc
  12.12.12.1       4  1                 82        84    0    0 00:00:09 Estab  1      1
  23.23.23.3       4  3                 73        91    0    0 00:00:09 Estab  0      0
  24.24.24.4       4  2                 69        72    0    0 00:26:52 Active

! the BGP best path is present
R2.15:27:20(config-router-bgp)#sh ip bgp 1.1.1.0
BGP routing table entry for 1.1.1.0/24
 Paths: 1 available
  1
    12.12.12.1 from 12.12.12.1 (130.255.255.100)
      Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:00:23 ago, valid, external, best
      Rx SAFI: Unicast

! but the route is NOT in the RIB
R2.15:27:34(config-router-bgp)#sh ip route 1.1.1.0
Gateway of last resort is not set

3) After the slow peer timeout expires, the prefix shows up in the RIB

R2.15:31:32(config-router-bgp)#sh ip route 1.1.1.0
Gateway of last resort is not set

R2.15:31:34(config-router-bgp)#show bgp conv
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:29 ago
Bgp convergence state : Pending (Waiting for EORs/Keepalives from peer(s) and IGP convergence)
   Convergence timer running, will expire in 00:03:31
   Convergence timeout in use: 00:05:00
   Convergence slow peer timeout in use: 00:01:30
   First peer came up 00:01:29 ago
   All the expected peers are up: no
   All IGP protocols have converged: yes
   Outstanding EORs: 0, Outstanding Keepalives: 0
   Pending Peers:          1
       Total Peers:        3
       Established Peers:  2
       Disabled Peers:     0
   Peers that have not converged yet:
       IPv4 peers:
           24.24.24.4            (Session : Active)
       IPv6 peers:
           None

R2.15:31:35(config-router-bgp)#sh ip route 1.1.1.0
 B E    1.1.1.0/24 [200/0] via 12.12.12.1, Ethernet3/36/1

R2.15:31:37(config-router-bgp)#show bgp conv
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:34 ago
Bgp convergence state : Timeout reached
   Time taken to converge 00:01:30
   First peer came up 00:01:34 ago
   Pending Peers:          1
       Total Peers:        3
       Established Peers:  2
       Disabled Peers:     0
   Peers that did not converge before local bgp convergence:
       IPv4 peers:
           24.24.24.4            (Session : Active)
       IPv6 peers:
           None
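
If you want to check this state from a script rather than by eye, the text output can be scraped with a few regular expressions. A minimal Python sketch, assuming the output format is exactly as captured above (it may differ between EOS releases); parse_convergence is a made-up helper name, not an Arista API.

```python
import re
from typing import Dict, List

# Output copied from the last "show bgp conv" capture above.
SAMPLE = """\
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:34 ago
Bgp convergence state : Timeout reached
   Time taken to converge 00:01:30
   First peer came up 00:01:34 ago
   Pending Peers:          1
       Total Peers:        3
       Established Peers:  2
       Disabled Peers:     0
   Peers that did not converge before local bgp convergence:
       IPv4 peers:
           24.24.24.4            (Session : Active)
       IPv6 peers:
           None
"""


def parse_convergence(text: str) -> Dict[str, object]:
    """Pull the convergence state and the unconverged peers out of the text."""
    state = re.search(r"^Bgp convergence state\s*:\s*(.+)$", text, re.M)
    pending = re.search(r"Pending Peers:\s*(\d+)", text)
    # Peer lines look like "<address>   (Session : <state>)".
    stuck: List[str] = re.findall(r"^\s*(\S+)\s+\(Session\s*:\s*\w+\)", text, re.M)
    return {
        "state": state.group(1).strip() if state else None,
        "pending_peers": int(pending.group(1)) if pending else None,
        "unconverged_peers": stuck,
    }


info = parse_convergence(SAMPLE)
print(info)
```

A state of "Timeout reached" together with a non-empty peer list is the signature of the dead-neighbor case shown in this test.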
