BGP Update Wait-for-convergence
Purpose:
"basically prevents BGP from programming routes into hardware and from advertising routes until a convergence event is resolved". The benefit is less CPU and hardware-programming churn during a convergence event.
Reference:
http://aspiringnetworker.blogspot.com/2015/08/bgp-in-arista-data-center_90.html
Use case:
For example, spine routers have 128-way ECMP and multiple BGP sessions to neighbor routers. During a convergence event, a spine can receive routing information piece by piece, depending on how fast and in what order the underlying physical interfaces come up. This feature holds BGP prefix programming until the control plane has converged.
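To put a rough number on that churn, here is a toy Python model (not EOS code). It assumes, purely for illustration, that every newly learned next hop triggers one hardware write for the prefix when updates are not held, and that holding updates results in a single write once convergence is declared. The function name `hw_writes` and the next-hop addresses are made up for this sketch.

```python
def hw_writes(arrivals, wait_for_convergence):
    """Count hardware programming operations for a single prefix.

    arrivals: next hops in the order their paths are learned.
    Toy assumption: without waiting, each new next hop rewrites the ECMP group;
    with waiting, the final ECMP group is written once.
    """
    next_hops = set()
    writes = 0
    for nh in arrivals:
        if nh in next_hops:
            continue  # duplicate path, ECMP set unchanged
        next_hops.add(nh)
        if not wait_for_convergence:
            writes += 1  # ECMP group reprogrammed on every change
    if wait_for_convergence and next_hops:
        writes = 1  # one write after convergence
    return writes


# Hypothetical 128 next hops arriving one at a time
arrivals = [f"10.0.0.{i}" for i in range(1, 129)]
print(hw_writes(arrivals, wait_for_convergence=False))  # 128
print(hw_writes(arrivals, wait_for_convergence=True))   # 1
```

Real hardware may batch updates, so the absolute numbers mean little; the point is only that the write count stops scaling with the number of paths that trickle in.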
What counts as a convergence event:
- Router reload, or BGP starting for the first time
- BGP clear
- Rib agent restart, etc.
How does it work?
- Once BGP enters the convergence state, it exits on whichever of these 3 conditions is met first:
  - All BGP peers have converged
  - Convergence timeout (default 5 min)
  - Slow peer timeout (default 1 min 30 sec)
- A BGP peer counts as converged if
  - the neighbor is established, and
  - an End-of-RIB has been received from it, or a BGP keepalive if GR is not enabled
- The following cases slow down or stall convergence, which is why 2 timers are needed:
  - A neighbor has a big table to convey
  - A dead neighbor: configured but never up
  - A dynamic peer is configured
- Slow peer timeout:
  - Default is 1 min 30 sec, counted from when the first BGP neighbor comes up
- BGP convergence timeout:
  - Default is 5 min
  - Can be changed with "bgp convergence time xxx (sec)"
R2.15:01:44(config-router-bgp)#sh bgp convergence
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:41 ago
Bgp convergence state : Timeout reached
Time taken to converge 00:01:30
First peer came up 00:01:41 ago
Pending Peers: 1
Total Peers: 3
Established Peers: 2
Disabled Peers: 0
Peers that did not converge before local bgp convergence:
IPv4 peers:
24.24.24.4 (Session : Active)
IPv6 peers:
None
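The three exit conditions can be sketched as a small decision function. This is only a model of the behavior described in this post, not Arista's implementation: the `Peer` class, the `exit_reason` function and the idea of polling it with elapsed seconds are assumptions made for illustration. The defaults of 300 and 90 seconds are the ones quoted above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Peer:
    established: bool            # session is in Established state
    got_eor_or_keepalive: bool   # End-of-RIB, or a keepalive when GR is not enabled

    @property
    def converged(self) -> bool:
        return self.established and self.got_eor_or_keepalive


def exit_reason(peers: Sequence[Peer],
                since_event: float,
                since_first_peer_up: Optional[float],
                convergence_timeout: float = 300,
                slow_peer_timeout: float = 90) -> Optional[str]:
    """Return why BGP would leave the convergence state, or None if still pending.

    since_event: seconds since the convergence event (reload, clear, ...).
    since_first_peer_up: seconds since the first peer came up, None if no peer is up.
    """
    if peers and all(p.converged for p in peers):
        return "all peers converged"
    if since_first_peer_up is not None and since_first_peer_up >= slow_peer_timeout:
        return "slow peer timeout"
    if since_event >= convergence_timeout:
        return "convergence timeout"
    return None


# Two healthy peers plus one dead neighbor, as in the output above
peers = [Peer(True, True), Peer(True, True), Peer(False, False)]
print(exit_reason(peers, since_event=89, since_first_peer_up=89))  # None
print(exit_reason(peers, since_event=90, since_first_peer_up=90))  # slow peer timeout
```

With a dead neighbor the "all peers converged" condition can never be met, so the slow peer timer is what ends the wait, which matches the "Time taken to converge 00:01:30" line in the output above.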
While this feature makes a lot of sense on spines, DO NOT enable it on ToR/leaf or first-hop switches.
For example:
      [ebgp]
     /       \ (uplink)
[mlagA]=====[mlagB]
     \       /
   [server pool]
Take the common MLAG setup above.
- mlagB reboots, so its uplinks, downlinks and peer-link all go down
- The peer-link between mlagA and mlagB comes up first (quickly, before the MLAG reload delay expires)
- iBGP between mlagA and mlagB comes up, and B receives routes from A
- But at this point eBGP over the uplink is still down, so these iBGP prefixes are held!!
- The downlinks (MLAG port-channels) come up and servers start forwarding traffic to mlagB, which has not programmed the held routes yet, so all of that traffic is dropped on the floor
router bgp 65500
   update wait-for-convergence
   update wait-install
Sample Topology:
[R1] <----1.1.1.0/24 prefix
|
| (ebgp)
|
[R2]--(ibgp)--- [R4] <--- dead neighbor
|
| (ebgp)
|
[R3]
1) We shut down the iBGP session between R2 and R4 to simulate a "dead" neighbor
! BGP prefix 1.1.1.0/24 is received and the best path is selected
R2.15:25:34(config-router-bgp)#sh ip bgp 1.1.1.0
BGP routing table information for VRF default
Router identifier 110.255.255.1, local AS number 2
BGP routing table entry for 1.1.1.0/24
Paths: 1 available
1
12.12.12.1 from 12.12.12.1 (130.255.255.100)
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:01:55 ago, valid, external, best
Rx SAFI: Unicast
! 1.1.1.0/24 is in the RIB
R2.15:25:49(config-router-bgp)#sh ip route 1.1.1.0
B E 1.1.1.0/24 [200/0] via 12.12.12.1, Ethernet3/36/1
2) Now do a hard clear on R2
R2.15:26:58(config-router-bgp)#clear ip bgp *
! Clearing all IPv4 and IPv6 peering sessions
! BGP sessions are up except for the dead neighbor, 24.24.24.4
R2.15:27:06(config-router-bgp)#bas
BGP summary information for VRF default
Router identifier 110.255.255.1, local AS number 2
Neighbor Status Codes: m - Under maintenance
Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
12.12.12.1 4 1 82 84 0 0 00:00:09 Estab 1 1
23.23.23.3 4 3 73 91 0 0 00:00:09 Estab 0 0
24.24.24.4 4 2 69 72 0 0 00:26:52 Active
! the BGP best path is selected
R2.15:27:20(config-router-bgp)#sh ip bgp 1.1.1.0
BGP routing table entry for 1.1.1.0/24
Paths: 1 available
1
12.12.12.1 from 12.12.12.1 (130.255.255.100)
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:00:23 ago, valid, external, best
Rx SAFI: Unicast
! but the route is NOT in the RIB
R2.15:27:34(config-router-bgp)#sh ip route 1.1.1.0
Gateway of last resort is not set
3) After the slow peer timeout expires, the prefix shows up in the RIB
R2.15:31:32(config-router-bgp)#sh ip route 1.1.1.0
Gateway of last resort is not set
R2.15:31:34(config-router-bgp)#show bgp conv
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:29 ago
Bgp convergence state : Pending (Waiting for EORs/Keepalives from peer(s) and IGP convergence)
Convergence timer running, will expire in 00:03:31
Convergence timeout in use: 00:05:00
Convergence slow peer timeout in use: 00:01:30
First peer came up 00:01:29 ago
All the expected peers are up: no
All IGP protocols have converged: yes
Outstanding EORs: 0, Outstanding Keepalives: 0
Pending Peers: 1
Total Peers: 3
Established Peers: 2
Disabled Peers: 0
Peers that have not converged yet:
IPv4 peers:
24.24.24.4 (Session : Active)
IPv6 peers:
None
R2.15:31:35(config-router-bgp)#sh ip route 1.1.1.0
B E 1.1.1.0/24 [200/0] via 12.12.12.1, Ethernet3/36/1
R2.15:31:37(config-router-bgp)#show bgp conv
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:34 ago
Bgp convergence state : Timeout reached
Time taken to converge 00:01:30
First peer came up 00:01:34 ago
Pending Peers: 1
Total Peers: 3
Established Peers: 2
Disabled Peers: 0
Peers that did not converge before local bgp convergence:
IPv4 peers:
24.24.24.4 (Session : Active)
IPv6 peers:
None
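If you want to watch for this state from a script, the text output above is easy to scrape. The Python sketch below is a minimal example that relies only on the field labels visible in the output pasted in this post; the function name `parse_bgp_convergence` is made up, and other EOS releases may format the output differently, so treat the patterns as assumptions.

```python
import re


def parse_bgp_convergence(text: str) -> dict:
    """Extract a few fields from 'show bgp convergence' text output."""
    state = re.search(r"Bgp convergence state\s*:\s*(.+)", text)
    pending = re.search(r"Pending Peers:\s*(\d+)", text)
    total = re.search(r"Total Peers:\s*(\d+)", text)
    # Lines such as "24.24.24.4 (Session : Active)"
    peers = re.findall(r"^\s*(\S+) \(Session : (\w+)\)", text, flags=re.MULTILINE)
    return {
        "state": state.group(1).strip() if state else None,
        "pending_peers": int(pending.group(1)) if pending else None,
        "total_peers": int(total.group(1)) if total else None,
        "unconverged": peers,
    }


# Output copied from the last 'show bgp conv' above
SAMPLE = """\
BGP Convergence information for VRF: default
Configured convergence timeout: 00:05:00
Configured convergence slow peer timeout: 00:01:30
Convergence based update synchronization is enabled
Last Bgp convergence event 00:01:34 ago
Bgp convergence state : Timeout reached
Time taken to converge 00:01:30
First peer came up 00:01:34 ago
Pending Peers: 1
Total Peers: 3
Established Peers: 2
Disabled Peers: 0
Peers that did not converge before local bgp convergence:
IPv4 peers:
24.24.24.4 (Session : Active)
IPv6 peers:
None
"""

info = parse_bgp_convergence(SAMPLE)
print(info["state"])        # Timeout reached
print(info["unconverged"])  # [('24.24.24.4', 'Active')]
```

A state of "Timeout reached" together with a non-empty unconverged list is the signature of the dead-neighbor case shown in this walkthrough.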