In the previous post, we saw that host1 and host2 can NOT ping vtep3's SVI (VLAN2000) within the same VLAN, but host3 can reach this SVI. Why does this happen?
This is because the Trident II ASIC doesn't support routing together with the overlay in a single pass. On some Arista switches, such as the 7050QX, a recirculation channel is needed to loop the VXLAN inter-VLAN traffic back into the pipeline for the routing lookup.
Bridging or Routing?
Based on the dstMAC, the ASIC determines whether an incoming packet goes to bridging or routing. The ping/ICMP packets from host2 to vtep3 first go through VXLAN decapsulation. Because their dstMAC is the routerMAC of vtep3, they must then be routed. T2 can't handle VXLAN decap and routing in one pass, so the ping fails.
But if host2 pings host3, the packets are simply bridged after VXLAN decap.
Similarly, when host3 pings vtep3, the packets don't go through VXLAN decap, so the ping works.
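The three cases above can be summarized in a small Python sketch. This is only a model of the reasoning in this post, not how the ASIC is actually programmed; the router MAC value, the Packet class, and the returned strings are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical router MAC for vtep3, for illustration only.
ROUTER_MAC = "00:1c:73:00:00:01"


@dataclass
class Packet:
    dst_mac: str          # destination MAC of the (inner) Ethernet frame
    vxlan_encap: bool     # True if the frame arrived inside a VXLAN tunnel


def forwarding_action(pkt, router_mac=ROUTER_MAC, has_recirc=False):
    """Model the bridging/routing decision described in the text."""
    needs_routing = pkt.dst_mac.lower() == router_mac.lower()
    if not needs_routing:
        # e.g. host2 -> host3: decap (if any), then plain bridging
        return "bridge"
    if not pkt.vxlan_encap:
        # e.g. host3 -> vtep3: no decap involved, routing works in one pass
        return "route"
    # e.g. host2 -> vtep3: decap AND routing are both required
    if has_recirc:
        return "decap, recirculate, then route"
    return "fails: decap and routing cannot happen in one pass"


if __name__ == "__main__":
    print(forwarding_action(Packet(ROUTER_MAC, vxlan_encap=True)))
    print(forwarding_action(Packet(ROUTER_MAC, vxlan_encap=True), has_recirc=True))
```

The only case that changes when a recirc-channel is configured is the last one, which is exactly the host1/host2 to SVI ping.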
Which platform needs recirculation?
Only Trident-2 or TH based platforms have this limitation. In the topology above, the other vteps, such as the Jericho and T2+ based ones, don't need this.
How do you tell the chip model? The best way is to ask the account engineer who serves your account. Another way is to run the following CLI (based on my own experience; if you know a better one, please comment here. Thanks!)
7280QR-C36-F(config)#sh platform fap
.....
Jericho0 !!! clearly this is a Jericho-based
7050QX-32-F#show platform fap
% Invalid input !!! FAP = Sand/Petra/Arad/Jericho, not supported
7050QX-32-F#show platform trident sys !! well this is a Trident
Slice Chip ModId GenId
----------------- ----------------- ----------- -----
FixedSystem Linecard0/0 1 1
------------------------------------------------------
Front panel vs internal ports
On a T2 system, recirculation can be done through front panel ports or internal ports, depending on the switch model. A T2 chip can support 32 x 40G ports. Some platforms, like the 7050TX-72/96, 7050SX-72/96 and 7050S-64, don't use all of them at the front panel, and the remaining ports are called internal ports. Using internal ports is definitely better than using front panel ports, because it doesn't reduce your switch's connectivity capacity.
So the next question is how to tell whether a switch has internal ports :-) Use the CLI "show inventory".
7050SX-64-F.10:36:59(config)#show inventory
System has 81 ports
Type Count
---------------- ----
Management 1
Switched 64
Unconnected 16 !!! has 16 unconnected ports
7050QX-32-F(config)#show inventory
System has 105 ports
Type Count
---------------- ----
Management 1
Switched 104 !!! No unconnected ports
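If you need to check many switches, the same test can be scripted. The Python sketch below only parses text that looks like the "show inventory" snippets above; the function name is my own, and how you collect the CLI output (eAPI, SSH, etc.) is left out on purpose.

```python
import re


def unconnected_port_count(show_inventory_output):
    """Return the count on the 'Unconnected' row, or 0 if there is no such row."""
    for line in show_inventory_output.splitlines():
        match = re.match(r"\s*Unconnected\s+(\d+)", line)
        if match:
            return int(match.group(1))
    return 0


if __name__ == "__main__":
    # Sample text copied from the 7050SX-64 output above.
    sample = """System has 81 ports
Type Count
---------------- ----
Management 1
Switched 64
Unconnected 16
"""
    count = unconnected_port_count(sample)
    print("internal ports available" if count else "front panel ports only", count)
```

A non-zero result means the switch has internal ports that can be exposed for recirculation.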
Configuration:
Step1: Expose all internal ports (only if the system has Unconnected ports under "show inventory"; we actually don't need this for vtep3)
mLeafB.cd631.Z(config)#service interface unconnected expose
mLeafB.cd631.Z(config)#switch scheduler oversubscribed
Step2: Configure Recirc-channel (if T2 system)
upp224.vtep3(config)#int recirc-Channel 1
upp224.vtep3(config-if-Re1)#switchport recirculation features vxlan
Step3: Assign physical (front panel or internal) ports to recirc-channel
upp224.vtep3(config-if-Re1)#int et34
upp224.vtep3(config-if-Et34)#traffic-loopback source system device mac
upp224.vtep3(config-if-Et34)#channel-group recirculation 1
upp224.vtep3(config-if-Et34)#
Step4: Verify
upp224.vtep3#sh int recirc-Channel 1
Recirc-Channel1 is up, line protocol is up (connected)
Hardware is Port-Channel, address is 2899.3a8b.e6fa
Ethernet MTU 9214 bytes , BW 10000000 kbit
Full-duplex, 10Gb/s
Active members in this channel: 1
... Ethernet34 , Full-duplex, 10Gb/s
Fallback mode is: off
Step5: ping from remote hosts
wa466.host2(vrf:host2)#ping 20.0.9.253
PING 20.0.9.253 (20.0.9.253) 72(100) bytes of data.
80 bytes from 20.0.9.253: icmp_seq=1 ttl=64 time=0.218 ms
80 bytes from 20.0.9.253: icmp_seq=2 ttl=64 time=0.150 ms
80 bytes from 20.0.9.253: icmp_seq=3 ttl=64 time=0.109 ms
80 bytes from 20.0.9.253: icmp_seq=4 ttl=64 time=0.107 ms
80 bytes from 20.0.9.253: icmp_seq=5 ttl=64 time=0.106 ms
--- 20.0.9.253 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.106/0.138/0.218/0.043 ms, ipg/ewma 0.195/0.175 ms
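The Step4 check can also be automated. Here is a rough Python sketch that looks for the two things I check by eye in the "show int recirc-Channel" output: the channel is up/up and it has at least one active member. The function name is hypothetical, and the regular expressions are written only against the output format shown above, so other EOS versions may need adjustments.

```python
import re


def recirc_channel_healthy(show_interface_output):
    """True if the recirc-channel is up/up and has at least one active member."""
    link_up = re.search(
        r"Recirc-Channel\d+ is up, line protocol is up", show_interface_output
    ) is not None
    members = re.search(
        r"Active members in this channel:\s*(\d+)", show_interface_output
    )
    active = int(members.group(1)) if members else 0
    return link_up and active > 0


if __name__ == "__main__":
    # Sample text copied from the Step4 output above.
    sample = """Recirc-Channel1 is up, line protocol is up (connected)
Hardware is Port-Channel, address is 2899.3a8b.e6fa
Active members in this channel: 1
... Ethernet34 , Full-duplex, 10Gb/s
"""
    print(recirc_channel_healthy(sample))
```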
