- What the MLAG reload-delay is and why it is needed
- The difference between the two timers
- How to tune the timer values
After an MLAG peer boots up, all of its ports except the peer-link are placed in err-disabled state with the reason "mlag-issu". During the reload-delay, the MLAG agent syncs all MAC and ARP information with the active peer.
Another trigger of the MLAG reload-delay is a forwarding-plane agent restart. On some platforms, such as TH or T3, a port speed change requires a hitful agent restart, which forces the MLAG interfaces to transition and starts the reload-delay.
Starting with 4.15.2F, the default reload-delay timers differ per platform:
- All fixed systems: 300 sec
- 7500* (Arad/Jericho): 1800 sec (because of the long hardware initialization time)
- 7300* (Trident*/TH): 1200 sec
When does the reload timer start? The timers start ticking when the MLAG agent starts. In older releases, they were triggered by the Sysdb agent.
Can I lower the reload-delay timers? Yes, but be careful: if the interfaces leave the err-disabled state before the sync is done, the MLAG peer will blackhole traffic.
Which value should be used? It depends heavily on the system and its configuration. The log messages from a boot show how long each stage takes:
! MLAG agent is up and the timers start ticking
Apr 19 11:33:32 localhost Mlag: %AGENT-6-INITIALIZED: Agent 'Mlag' initialized; pid=3030
! LCs power on
Apr 19 11:34:08 R1 NorCalCard: %HARDWARE-6-CARD_POWERED_ON: Card Fabric3 has been powered on. model: 7512R-FM rev: 11.02 serial number: JPE16305615
....
Apr 19 11:34:24 R1 NorCalCard: %HARDWARE-6-CARD_POWERED_ON: Card Linecard11 has been powered on. model: 7500R-36Q-LC rev: 02.01 serial number: JPE16204252
! LC initialization
Apr 19 11:38:40 R1 SandFap: %SAND-6-INIT_SUCCEEDED: Initialization of Linecard12 switch asics succeeded.
....
Apr 19 11:43:12 R1 SandFap: %SAND-6-INIT_SUCCEEDED: Initialization of Linecard13 switch asics succeeded.
! interfaces up
Apr 19 11:43:01 R1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet12/26/3 (mlag.207_leaf_et9/3_et10/3=>et12/17/3), changed state to up
....
Apr 19 11:46:33 R1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet14/34/1 (peerLink=>mlagSec.et14/34/1.100g), changed state to up
These are the messages from a 7512N with 8 J/J+ linecards. As you can see, the last peer-link interface comes up 13 minutes after the MLAG agent starts. Allowing 5 minutes for the IGP/iBGP sessions and the corresponding hardware programming to finish, plus an additional 2-minute buffer, a 20-minute (1200 sec) reload-delay timer should be safe.
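The same arithmetic can be reproduced from the syslog timestamps. Below is a minimal Python sketch, not an Arista tool: the function names, the placeholder year (the syslog lines carry none) and the 300 s / 120 s defaults are assumptions taken from the worked example above, and the three sample lines are copied from the log in this post.

```python
from datetime import datetime

# Three lines copied from the boot log in this post.
LOG = [
    "Apr 19 11:33:32 localhost Mlag: %AGENT-6-INITIALIZED: Agent 'Mlag' initialized; pid=3030",
    "Apr 19 11:43:01 R1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet12/26/3 (mlag.207_leaf_et9/3_et10/3=>et12/17/3), changed state to up",
    "Apr 19 11:46:33 R1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet14/34/1 (peerLink=>mlagSec.et14/34/1.100g), changed state to up",
]


def parse_ts(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    The year is an arbitrary placeholder (assumption); only time
    differences within one boot are used.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def suggest_reload_delay(log_lines, convergence_s=300, buffer_s=120):
    """Return (elapsed_s, suggested_s) for one boot.

    elapsed_s is the time from the Mlag agent start to the last
    'changed state to up' message; suggested_s adds the routing
    convergence allowance and the safety buffer.
    """
    start = None
    last_up = None
    for line in log_lines:
        if "%AGENT-6-INITIALIZED" in line and "'Mlag'" in line:
            start = parse_ts(line)
        elif "%LINEPROTO-5-UPDOWN" in line and "changed state to up" in line:
            ts = parse_ts(line)
            if last_up is None or ts > last_up:
                last_up = ts
    if start is None or last_up is None:
        raise ValueError("log needs an Mlag agent start and at least one link-up line")
    elapsed = int((last_up - start).total_seconds())
    return elapsed, elapsed + convergence_s + buffer_s


elapsed, suggested = suggest_reload_delay(LOG)
print(elapsed, suggested)
```

For the log in this post the elapsed time is 781 seconds (13 min 1 sec) and the budget comes to 1201 seconds, which rounds to the 1200 seconds used above.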
There are two timers:
- MLAG reload-delay timer: this applies to all MLAG port-channel links. It can be changed with the CLI command "reload-delay mlag <seconds>".
- Non-MLAG reload-delay timer: most of the time, this is the timeout for the L3 uplinks. It can be changed with the CLI command "reload-delay non-mlag <seconds>".
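If the timer values are generated by a script, the two commands can be rendered as text. This is only a sketch: the helper name render_reload_delay is hypothetical, and placing the commands under an "mlag configuration" block with three-space indentation is an assumption about how the running-config is laid out, so check it against your EOS release before pushing anything.

```python
def render_reload_delay(mlag_s, non_mlag_s):
    """Render both reload-delay commands as a config snippet (hypothetical helper)."""
    for name, value in (("mlag", mlag_s), ("non-mlag", non_mlag_s)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} reload-delay must be a non-negative integer")
    return "\n".join([
        "mlag configuration",  # assumed parent config mode
        f"   reload-delay mlag {mlag_s}",
        f"   reload-delay non-mlag {non_mlag_s}",
    ])


print(render_reload_delay(1200, 900))
```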
Before discussing how to tune them, here is an analogy that may help:
- Imagine there is a house which has a front door (non-MLAG/uplinks to go out) and a back door (MLAG interfaces/to reach hosts/tenants)
- There is also a side door to your neighbor (the MLAG peer), who shares the same tenants/hosts.
- So the first thing to do is to communicate with your neighbor to have all the address information, to know who is where.
- With all that knowledge, which door should be opened first?
- The side door always opens first. (This is why the peer-link has no reload-delay, and why a BGP/IGP peering with the MLAG peer is required.)
- In the gap between the front door and the back door opening, traffic is not lost because it can exit through the side door.
- If the back door opens first, half of the south-north traffic comes in through it and has to leave through the side door. Likewise, if the front door opens first, half of the north-south traffic comes in and goes through the side door.
- Most of the time, the MLAG interfaces face servers, which means south-north traffic is much heavier than traffic in the opposite direction.
- So it is preferable to configure the non-MLAG reload-delay <= the MLAG reload-delay.
- The ONLY exception: if "reload-delay mode lacp-standby" is enabled, configure the non-MLAG timer > the MLAG timer. This feature keeps the LACP interfaces up to speed up hardware programming, so if north-south traffic arrives first, the router has to drop it because the MLAG port-channels are not really ready.
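The ordering rule from the last two bullets can be written as a small sanity check to run before applying timer values. This is a sketch of the rule as stated in this post, not an EOS feature, and the function name is made up for illustration.

```python
def timers_follow_guideline(mlag_s, non_mlag_s, lacp_standby=False):
    """Check the timer ordering suggested in this post.

    Without lacp-standby: non-MLAG timer <= MLAG timer.
    With "reload-delay mode lacp-standby": non-MLAG timer > MLAG timer.
    """
    if lacp_standby:
        return non_mlag_s > mlag_s
    return non_mlag_s <= mlag_s


print(timers_follow_guideline(1200, 1200))                     # True
print(timers_follow_guideline(1200, 1500))                     # False
print(timers_follow_guideline(1200, 1500, lacp_standby=True))  # True
```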