This post describes how Multi-Chassis Link Aggregation Group (MC-LAG) technology using virtual PortChannel (vPC) works in a VXLAN BGP EVPN fabric. I will first go through the vPC configuration with a short explanation, and then I'll show the Control and Data Plane operation from a VXLAN BGP EVPN perspective by using various show commands and packet captures. I am also going to explain the "Advertising VIP/PIP" options using an external connection. The example topology is shown in Figure 9-1. Complete configurations of vPC peer switches Leaf-102 and Leaf-103 (the Leaf-101 and Spine-11 configurations are the same as in the previous post) can be found in Appendix 1 at the end of the post.
Figure 9-1: VXLAN BGP EVPN vPC Example Topology and IP addressing
Virtual Port Channel
I'll start with the vPC configuration. We have two vPC VTEP switches, Leaf-102 and Leaf-103. The inter-switch links, vPC-related link terms, IP addressing, and PortChannel numbering are shown in Figure 9-2.
Figure 9-2: vPC domain
Step 1: Enable vPC and LACP features on both vPC VTEP switches.
feature vpc
feature lacp
|
Example 9-1: vPC and LACP features
Step 2: Configure Peer-Keepalive link.
The vPC Peer-Keepalive link is used as a heartbeat link between the vPC peers to make sure that both vPC peers are alive. I am using a dedicated VRF "VPC-Peer-Keepalive" for the vPC Peer-Keepalive link. Example 9-2 shows the configuration of vPC VTEP Leaf-102.
vrf context VPC-Peer-Keepalive
!
interface Ethernet1/6
no switchport
vrf member VPC-Peer-Keepalive
ip address 10.102.103.102/24
no shutdown
!
vpc domain 23
peer-keepalive destination 10.102.103.103 source 10.102.103.102 vrf VPC-Peer-Keepalive
|
Example 9-2: vPC Peer-Keepalive (Leaf-102)
Step 2.1: Verify Peer-Keepalive link operation
Leaf-102# show vpc peer-keepalive
vPC keep-alive status : peer is alive
--Peer is alive for : (685) seconds, (480) msec
--Send status : Success
--Last send at : 2018.08.11 09:38:44 791 ms
--Sent on interface : Eth1/6
--Receive status : Success
--Last receive at : 2018.08.11 09:38:45 314 ms
--Received on interface : Eth1/6
--Last update from peer : (0) seconds, (293) msec
vPC Keep-alive parameters
--Destination : 10.102.103.103
--Keepalive interval : 1000 msec
--Keepalive timeout : 5 seconds
--Keepalive hold timeout : 3 seconds
--Keepalive vrf : VPC-Peer-Keepalive
--Keepalive udp port : 3200
--Keepalive tos : 192
|
Example 9-3: vPC Peer-Keepalive (Leaf-102) status check
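The keepalive mechanics shown above are easy to internalize with a toy model. The following Python sketch (an illustration only, not NX-OS code) mimics a heartbeat with the parameters from Example 9-3: a UDP probe towards the peer on port 3200 every second, with the peer declared down after five silent seconds. The peer address is taken from the lab; the payload is an assumption for the illustration.

import socket, time

PEER = ("10.102.103.103", 3200)   # destination and UDP port from Example 9-3
INTERVAL, TIMEOUT = 1.0, 5.0      # keepalive interval and timeout from Example 9-3

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(INTERVAL)
last_heard = time.time()
while time.time() - last_heard < TIMEOUT:
    sock.sendto(b"keepalive", PEER)    # send a heartbeat probe
    try:
        sock.recvfrom(64)              # wait up to one interval for a reply
        last_heard = time.time()       # peer answered: refresh the timer
    except OSError:
        pass                           # no reply this round; keep counting
print("peer is not alive")
|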
Note! We created vPC domain 23 in step 2. The vPC peer switches automatically create a unique vPC system MAC address. The vPC system MAC address has a fixed part, 0023.04ee.bexx, where the two last digits (xx) are taken from the vPC domain ID. Our example vPC domain has ID 23, which is 17 in hex, so the vPC system MAC address in our example is 0023.04ee.be17. This can be verified on both switches. As can be seen from Examples 9-4 and 9-5, there is also a vPC local system-mac. The vPC system MAC is common to both vPC peer switches and is used when the vPC system, formed by the two vPC peer switches, presents itself as a single unit. The vPC local system-mac is unique per vPC peer switch and is used when the switch presents itself as an individual switch rather than as part of the vPC system; this is the case with Orphan ports, for example.
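The derivation of the vPC system MAC can be expressed in a couple of lines of Python. This is a minimal sketch of the rule described above, assuming the domain ID fits into the last octet (as domain 23 does):

def vpc_system_mac(domain_id: int) -> str:
    # Fixed prefix 00:23:04:ee:be, last octet = domain ID in hex.
    if not 0 < domain_id <= 0xFF:
        raise ValueError("this sketch only covers one-octet domain IDs")
    return f"00:23:04:ee:be:{domain_id:02x}"

print(vpc_system_mac(23))   # -> 00:23:04:ee:be:17
|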
Leaf-102# sh vpc role
vPC Role status
----------------------------------------------------
vPC role : primary
Dual Active Detection Status : 0
vPC system-mac : 00:23:04:ee:be:17
vPC system-priority : 32667
vPC local system-mac : 5e:00:00:01:00:07
vPC local role-priority : 32667
vPC local config role-priority : 32667
vPC peer system-mac : 5e:00:00:06:00:07
vPC peer role-priority : 32667
vPC peer config role-priority : 32667
|
Example 9-4: vPC system MAC Leaf-102
Leaf-103# sh vpc role
vPC Role status
----------------------------------------------------
vPC role : secondary
Dual Active Detection Status : 0
vPC system-mac : 00:23:04:ee:be:17
vPC system-priority : 32667
vPC local system-mac : 5e:00:00:06:00:07
vPC local role-priority : 32667
vPC local config role-priority : 32667
vPC peer system-mac : 5e:00:00:01:00:07
vPC peer role-priority : 32667
vPC peer config role-priority : 32667
|
Example 9-5: vPC system MAC Leaf-103
Step 3: Create vPC Peer-Link
The vPC Peer-Link is an 802.1Q trunk that carries vPC and non-vPC VLANs, Cisco Fabric Services messages (consistency checks, MAC address synchronization, advertisement of vPC member port status, STP management, and synchronization of HSRP and IGMP snooping), flooded traffic from the vPC peer, STP BPDUs, HSRP Hello messages, and IGMP updates. In our example, we create Port-Channel 23 for the vPC Peer-Link and use LACP as the channel protocol.
interface port-channel23
switchport mode trunk
spanning-tree port type network
vpc peer-link
!
interface Ethernet1/4
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
!
interface Ethernet1/5
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
|
Example 9-6: vPC Peer-Link on switch Leaf-102
Note! If the vPC Peer-Link goes down while the vPC Peer-Keepalive link is still up, the secondary switch suspends its vPC member ports and shuts down the SVIs associated with the vPC VLANs. When this failure happens, Orphan ports connected to the secondary switch are isolated. That is the reason for the recommendation to connect Orphan hosts to the primary switch.
|
Step 4: Configure vPC Member Ports
From the access device perspective (an Ethernet switch in our example), the uplink port is a classical EtherChannel, while from the vPC VTEP point of view the link towards the access device is attached to a vPC Member Port. It is recommended to use the Link Aggregation Control Protocol (LACP) as the channel protocol because it is a standards-based protocol with built-in misconfiguration protection and fast failure detection mechanisms.
Note! I am using Cisco VIRL, where the access device OS is vios_l2 with Experimental Version 15.2(20170321:233949). I did not manage to bring up the uplink Port-Channel on the vios_l2 switch. While trying to form a channel, the switch generates the syslog message "%EC-5-L3DONTBNDL2: Gi0/2 suspended: LACP currently not enabled on the remote port." This message might be related to the bug documented in CSCva22545 (https://bst.cloudapps.cisco.com/bugsearch/bug/CSCva22545), where there are two affected releases, one of which is 15.2(3.7.4)PIH19. That is why I am using manual (static) channel mode on both switches.
|
interface port-channel10
switchport mode trunk
vpc 10
!
interface Ethernet1/3
description ** Link to Ethernet SW **
switchport mode trunk
channel-group 10
|
Example 9-7: vPC Member Port on Leaf-102 and Leaf-103
Step 5: Verification of vPC operational status.
Leaf-102# sh vpc
Legend:
(*) - local vPC is down, forwarding via vPC peer-link
vPC domain id : 23
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : primary
Number of vPCs configured : 1
Peer Gateway : Enabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
Auto-recovery status : Disabled
Delay-restore status : Timer is off.(timeout = 30s)
Delay-restore SVI status : Timer is off.(timeout = 10s)
Operational Layer3 Peer-router : Disabled
vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ -------------------------------------------------
1 Po23 up 1,10,20,77
vPC status
----------------------------------------------------------------------------
Id Port Status Consistency Reason Active vlans
-- ------------ ------ ----------- ------ ---------------
10 Po10 up success success 1,10,20,77
|
Example 9-8: vPC verification
Step 6: Configure vPC peer-gateway under vpc domain configuration
Some devices, such as NAS appliances and load balancers, might not perform a standard ARP request for the IP of the default gateway during the boot process. Instead, they take the first source MAC address that they hear on the wire and bind that MAC address to the IP address of the default gateway. This kind of behavior might cause forwarding problems.
In Figure 9-3, we have two hosts: Host-A in VLAN 10 and Host-B in VLAN 20. Let's say that Host-B is a NAS device that binds the first source MAC that it hears to the default gateway IP. (1) vPC VTEP Leaf-102 sends some data towards Host-B. (2) Host-B has just booted up; it receives the frame sent by Leaf-102 and binds the source MAC address of the frame to the default gateway IP. (3) Host-B then starts sending data to Host-A, which is in VLAN 10, so the IP packets are sent to the default gateway. (4) The Ethernet switch where Host-B is connected receives the IP packet, runs the channel hash algorithm, and chooses the link towards vPC VTEP Leaf-103. (5) Leaf-103 receives the IP packet, and since the destination MAC address in the Ethernet header belongs to Leaf-102, the IP packet is sent over the vPC Peer-Link to vPC peer Leaf-102. (6) Now the loop prevention mechanism kicks in: data received from a vPC member port that has crossed the vPC Peer-Link is not allowed to be sent out of any vPC member port. So in our case, Leaf-102 drops the data packet.
Note! There is one exception to the loop prevention rule "frames received from a vPC member port that cross the vPC Peer-Link are not allowed to egress from a vPC member port": if the vPC member port between Leaf-103 and Host-A is down, the frame is allowed to egress from Leaf-102 port e1/3. A toy model of both this rule and the peer-gateway behavior is shown after Figure 9-3.
|
By using the vPC Peer-Gateway option, Leaf-103 is allowed to act as an active default gateway in VLAN 10 (and of course in VLAN 20) even when an IP packet received over a vPC member port has a destination MAC address that belongs to vPC peer Leaf-102. So in our example, Leaf-103 is allowed to route the data packet straight to Host-A without sending it to vPC peer Leaf-102.
Figure 9-3: vPC peer-gateway
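The two forwarding rules discussed above, the loop-prevention rule with its exception and the peer-gateway routing exception, can be condensed into a small decision model. The Python sketch below is an illustration of the described logic, not NX-OS code:

def may_egress_vpc_member_port(crossed_peer_link: bool,
                               peer_member_port_down: bool) -> bool:
    # A frame that ingressed on a vPC member port and crossed the
    # Peer-Link may egress a vPC member port only when the corresponding
    # member port on the peer switch is down.
    return (not crossed_peer_link) or peer_member_port_down

def routes_locally(dmac: str, my_rmac: str, peer_rmac: str,
                   peer_gateway: bool) -> bool:
    # With peer-gateway, a switch also routes packets whose destination
    # MAC belongs to its vPC peer, avoiding the Peer-Link detour.
    return dmac == my_rmac or (peer_gateway and dmac == peer_rmac)

# Step (6): Leaf-102 drops the frame (the peer's member port is up).
assert may_egress_vpc_member_port(True, False) is False
# The exception: Leaf-103's member port is down, so egress is allowed.
assert may_egress_vpc_member_port(True, True) is True
# With peer-gateway, Leaf-103 routes a packet destined to Leaf-102's MAC.
assert routes_locally("leaf102-mac", "leaf103-mac", "leaf102-mac", True) is True
|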
vpc domain 23
peer-gateway
|
Example 9-9: vPC peer-gateway configuration
Step 7: Configure ARP sync under vpc domain configuration
ARP sync is used to synchronize ARP table (IP-to-MAC binding) information in a recovery situation after the vPC Peer-Link has failed. Synchronization is done by using the Cisco Fabric Services (CFS) protocol. The direction is from the primary vPC peer (Leaf-102 in our lab) to the secondary vPC peer (Leaf-103).
vpc domain 23
ip arp synchronize
|
Example 9-10: vPC ARP sync configuration
Step 8: Tune vPC Delay Restore timers (optional)
With vPC delay restore, a recovering vPC peer switch holds down its vPC links and SVIs until the routing protocols have converged. This feature is enabled by default with timer values of 30 and 10 seconds (vPC link/SVI). We are using timers 240/80; appropriate values depend on the size of the network.
vpc domain 23
delay restore 240
delay restore interface-vlan 80
|
Example 9-11: vPC Delay Restore configuration
Some other considerations for vPC:
Since the primary subject of this post is to show how VXLAN BGP EVPN works with vPC, I am not going to show every vPC feature in detail, but here are some other considerations when implementing vPC.
The vPC role priority should be statically defined on both the primary and the secondary vPC peer switch; this way we know which one is the primary switch. Orphan hosts should be connected to the primary vPC peer switch, so that they are not cut off from the network in case of a vPC Peer-Link failure. If a First Hop Redundancy Protocol such as HSRP is used, place the STP root and the HSRP active router on the vPC primary switch.
At this point, we have done the following vPC-related configuration on switches Leaf-102 and Leaf-103.
feature vpc
!
vpc domain 23
peer-switch
peer-keepalive destination 10.102.103.103 source 10.102.103.102 vrf VPC-Peer-Keepalive
delay restore 240
peer-gateway
delay restore interface-vlan 80
ip arp synchronize
!
interface port-channel10
vpc 10
!
interface port-channel23
vpc peer-link
!
interface Ethernet1/6
no switchport
vrf member VPC-Peer-Keepalive
ip address 10.102.103.102/24
no shutdown
!
interface Ethernet1/4
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
!
interface Ethernet1/5
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
!
interface Ethernet1/3
description ** Link to Ethernet SW **
switchport mode trunk
channel-group 10
|
Example 9-12: vPC configuration so far
VTEP redundancy with vPC
When vPC is implemented in a VXLAN fabric, both vPC VTEP peers start using a Virtual IP address (VIP) as the source address instead of their physical IP address (PIP). This also means that BGP EVPN starts advertising both Route Type 2 (MAC/IP Advertisement) and Route Type 5 (IP Prefix Route) with the VIP as the next-hop (the default behavior). In our example lab, there are two IP addresses configured on the Loopback 100 interface: the primary IP 192.168.100.102/32 (PIP) and the secondary IP 192.168.100.23/32 (VIP), as shown in Figure 9-4.
Figure 9-4: vPC PIP and VIP addressing
First, I will configure the same secondary IP address 192.168.100.23 under the Loopback 100 interface on both vPC VTEP switches. The example is taken from VTEP Leaf-102. At this phase, we are not going to do any other configuration.
interface loopback100
description ** VTEP/Overlay **
ip address 192.168.100.102/32
ip address 192.168.100.23/32 secondary
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
|
Example 9-12: Secondary IP address on Loopback 100
Now we attach host Cafe to the Ethernet switch and verify the Control Plane operation by capturing traffic from the wire to see the BGP EVPN MAC/IP advertisements. Then we verify the Data Plane operation by pinging from Cafe to Beef.
Phase-1: Host Cafe boots up and sends a Gratuitous ARP message to verify the uniqueness of its IP address and to announce the location of its MAC address. The channel hash algorithm selects interface g0/1, and the broadcast message is sent out of it.
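For reference, a Gratuitous ARP like the one Cafe sends in Phase-1 can be reproduced with the Scapy library (assumed to be installed). The MAC and IP values come from the example lab; the egress interface name is an assumption:

from scapy.all import ARP, Ether, sendp

garp = Ether(src="10:00:00:10:ca:fe", dst="ff:ff:ff:ff:ff:ff") / ARP(
    op=2,                        # unsolicited ARP reply
    hwsrc="10:00:00:10:ca:fe",   # host Cafe MAC (1000.0010.cafe)
    psrc="192.168.11.11",        # host Cafe IP
    hwdst="ff:ff:ff:ff:ff:ff",
    pdst="192.168.11.11",        # target IP equals source IP -> gratuitous
)
print(garp.summary())
# sendp(garp, iface="eth0")     # sending requires root; interface name is an assumption
|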
Phase-2: Leaf-102 receives the broadcast frame. Its L2FWDER component notices the incoming frame on interface Po10, with source MAC address 1000.0010.cafe. After updating the MAC address table, L2FWDER installs the MAC address into the L2RIB, from where it is passed to the BGP EVPN process. Since Leaf-102 has vPC peer switch Leaf-103, it synchronizes the MAC address table over the vPC Peer-Link with CFS. This way Leaf-103 learns that MAC address 1000.0010.cafe is located behind PortChannel 10. Since the destination of the frame is the broadcast address, the frame is also flooded over the vPC Peer-Link, and it is sent to the multicast group corresponding to VNI 10000 as well.
Note! Detailed descriptions of the update process of the MAC address table, L2RIB, BGP table, and RIB, as well as the BGP EVPN process, can be found in my previous post "VXLAN Part VII: VXLAN BGP EVPN – Control Plane operation".
BUM (Broadcast, Unknown Unicast, and Multicast) traffic processing in VXLAN is explained in "VXLAN Part V: Flood and Learn".
|
Phase-3: At this phase, the L2FWDER component has sent the MAC address information from the L2RIB to the BGP EVPN process on both switches. They both send two BGP EVPN Route Type 2 Updates to the Spine switch Spine-11: the first one carries the host Cafe MAC address and the second one the MAC/IP information. For simplicity, we concentrate only on the MAC address advertisements sent by vPC peer switches Leaf-102 and Leaf-103. The BGP EVPN Update messages can be seen in Capture 9-1 (Leaf-102) and Capture 9-2 (Leaf-103) right after Figure 9-5. From these captures, we can see that the MP_REACH_NLRI Path Attribute Next-Hop is set to 192.168.100.23. This information is not decoded in the capture, but it can be found in the HEX part as the value c0 a8 64 17 (192.168.100.23 in dotted decimal). Note that the EVPN NLRI Route Distinguisher includes the original sender's RID, which is how the Spine switch can differentiate these updates.
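Decoding those hex bytes back to dotted decimal is a one-liner with Python's standard ipaddress module:

import ipaddress

print(ipaddress.IPv4Address(bytes.fromhex("c0a86417")))   # -> 192.168.100.23
|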
Phase-4: Spine-11 sends these BGP EVPN Updates to Leaf-101 without modifying the Path Attributes; it just adds a Cluster List Path Attribute, which is used as a loop prevention mechanism (Spine-11 is a Route Reflector). If we compare the updates received from Leaf-102 and Leaf-103, the only notable difference in the BGP EVPN Updates is the Route Distinguisher (RD). By checking the RD value, Spine-11 knows that the updates are from different Leaf switches (a detailed explanation can be found in VXLAN Part VII: VXLAN BGP EVPN – Control Plane operation).
Phase-5: From Capture 9-3, taken from Leaf-103 interface e1/1, we can see that the RR Spine-11 sends the BGP EVPN Update about the host Cafe MAC, originated by Leaf-102, to Leaf-103. This Update is blocked by Leaf-103 based on the Site of Origin (SoO) Extended Community 192.168.100.23:0 (the Route Origin field in the capture).
Figure 9-5: BGP EVPN Update
Capture 9-1: BGP EVPN Update from Leaf-102 to Spine-11
Capture 9-2: BGP EVPN Update from Leaf-103 to Spine-11
Capture 9-3: BGP EVPN Update originated by Leaf-102 and sent to Leaf-103 by Spine-11
From the output of Example 9-13, we can see that the host Cafe MAC and IP information is installed into the L2RIB by BGP, with the next-hop address being the VIP/Anycast address of vPC domain 23 switches Leaf-102 and Leaf-103. The same output also shows that the MAC-IP binding is sent to the ARP component, or more precisely to the ARP suppression cache.
Leaf-101# sh l2route evpn mac-ip evi 10 detail
Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv(D):Del Pending (S):Stale (C):Clear
(Ps):Peer Sync (Ro):Re-Originated
Topology Mac Address Prod Flags Seq No Host IP Next-Hops
----------- -------------- ------ ---------- --------------- ---------------
10 1000.0010.cafe BGP -- 0 192.168.11.11 192.168.100.23
Sent To: ARP
SOO: 775043377
10 1000.0010.beef HMM -- 0 192.168.11.12 Local
Sent To: BGP
L3-Info: 10077
|
Example 9-13: Leaf-101 L2RIB
Example 9-14 shows the ARP suppression cache.
Leaf-101# sh ip arp suppression-cache detail
Flags: + - Adjacencies synced via CFSoE
L - Local Adjacency
R - Remote Adjacency
L2 - Learnt over L2 interface
PS - Added via L2RIB, Peer Sync
RO - Dervied from L2RIB Peer Sync Entry
Ip Address Age Mac Address Vlan Physical-ifindex Flags Remote Vtep Addrs
192.168.11.12 00:04:30 1000.0010.beef 10 Ethernet1/4 L
192.168.11.11 01:28:40 1000.0010.cafe 10 (null) R 192.168.100.23
|
Example 9-14: ARP suppression-cache on Leaf-101.
A ping shows that we have IP connectivity between Cafe and Beef in VLAN 10.
Beef#ping 192.168.11.11
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.11.11, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 10/15/21 ms
|
Example 9-15: ping from host Beef to Host Cafe
From Capture 9-4, we can see that the destination address in the outer IP header of the VXLAN-encapsulated ICMP messages is correctly set to 192.168.100.23.
Capture 9-4: Ping from Beef to Cafe.
This is the basic operation when vPC is implemented in a VXLAN BGP EVPN fabric.
Advertising Primary IP address
In Figure 9-6, there is an external network behind vPC peer VTEP Leaf-103. This kind of setup might lead to a situation where the Spine switch Spine-11 sends a data flow destined to the external network to vPC peer switch Leaf-102, which has no route to the destination, and the data flow is black-holed. This can happen because, in a basic setup, both vPC peers send their BGP EVPN Updates to Spine-11 with the VIP/Anycast address as the next-hop. Note that vPC peer switches do not have a synchronization mechanism for Layer 3 prefix information.
The process is shown in Figure 9-6. (1) Router Ext-Ro02 sends a BGP Update about network 172.16.77.0/24 to Leaf-103, (2) which in turn forwards the Update to Spine-11 using its VIP/Anycast IP 192.168.100.23 as the next-hop. (3) Spine-11 sends the Update to its BGP RR clients Leaf-101 and Leaf-102. Leaf-102 ignores the Update (same SoO), and Leaf-101 installs the received information into the TENANT77-specific tables (BGP, RIB). That is the simplified Control Plane operation. Now for the Data Plane: (4) Host Beef sends data to a host located in network 172.16.77.0/24. It sends the IP packets to its default gateway Leaf-101, which knows that the destination network is reachable through next-hop address 192.168.100.23 (the VIP/Anycast address of vPC peers Leaf-102 and Leaf-103). (5) Leaf-101 sends the packet to Spine-11. Spine-11 has two equal-cost paths towards next-hop 192.168.100.23, via either Leaf-102 or Leaf-103. It might select the path to Leaf-102, which does not know how to route the packet to destination network 172.16.77.0/24.
Figure 9-6: BGP EVPN Update
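The coin-flip nature of step (5) can be illustrated with a toy two-way ECMP decision. Real Nexus hardware hashes more header fields, so this Python sketch only illustrates why roughly half of the flows towards the shared VIP can land on the peer that lacks the external route:

import hashlib

def ecmp_pick(src_ip, dst_ip, paths):
    # Hash the flow identifiers and pick one of the equal-cost paths.
    digest = hashlib.md5(f"{src_ip}->{dst_ip}".encode()).digest()
    return paths[digest[0] % len(paths)]

# Spine-11's choice for a flow from Leaf-101 towards the VIP.
print(ecmp_pick("192.168.100.101", "192.168.100.23", ["Leaf-102", "Leaf-103"]))
|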
From Example 9-16, we can see that the external network is advertised with the vPC peer switches' VIP/Anycast address 192.168.100.23 as the next-hop.
Leaf-101# sh ip bgp vrf TENANT77 172.16.77.0
BGP routing table information for VRF TENANT77, address family IPv4 Unicast
BGP routing table entry for 172.16.77.0/24, version 6
Paths: (1 available, best #1)
Flags: (0x8008041a) on xmit-list, is in urib, is best urib route, is in HW
vpn: version 6, (0x100002) on xmit-list
Advertised path-id 1, VPN AF advertised path-id 1
Path type: internal, path is valid, is best path, in rib
Imported from 192.168.77.103:3:[5]:[0]:[0]:[24]:[172.16.77.0]:[0.0.0.0]/224
AS-Path: 64577 , path sourced external to AS
192.168.100.23 (metric 81) from 192.168.77.11 (192.168.77.111)
Origin IGP, MED 0, localpref 100, weight 0
Received label 10077
Extcommunity: RT:65000:10077 ENCAP:8 Router MAC:5e00.0006.0007
Originator: 192.168.77.103 Cluster list: 192.168.77.111
<snipped>
|
Example 9-16: BGP table Leaf-101
This behavior can be changed so that, instead of advertising the VIP as the next-hop for external prefixes, the PIP (Primary/Physical IP) is used. This is achieved with the command advertise-pip under the BGP L2VPN EVPN address family, together with advertise virtual-rmac under the NVE interface; together they let BGP use the Primary IP address as the next-hop when advertising prefix-routes. These commands are enabled on both vPC peer switches.
router bgp 65000
address-family l2vpn evpn
advertise-pip
!
interface nve1
advertise virtual-rmac
|
Example 9-17: Advertise-pip and advertise virtual-rmac
After the changes, we can see on Leaf-101 that the next-hop is set to the Primary IP (PIP) of Leaf-103.
Leaf-101# sh ip bgp vrf TENANT77 172.16.77.0
BGP routing table information for VRF TENANT77, address family IPv4 Unicast
BGP routing table entry for 172.16.77.0/24, version 19
Paths: (1 available, best #1)
Flags: (0x8008041a) on xmit-list, is in urib, is best urib route, is in HW
vpn: version 21, (0x100002) on xmit-list
Advertised path-id 1, VPN AF advertised path-id 1
Path type: internal, path is valid, is best path, in rib
Imported from 192.168.77.103:3:[5]:[0]:[0]:[24]:[172.16.77.0]:[0.0.0.0]/224
AS-Path: 64577 , path sourced external to AS
192.168.100.103 (metric 81) from 192.168.77.11 (192.168.77.111)
Origin IGP, MED 0, localpref 100, weight 0
Received label 10077
Extcommunity: RT:65000:10077 ENCAP:8 Router MAC:5e00.0006.0007
Originator: 192.168.77.103 Cluster list: 192.168.77.111
VRF advertise information:
Path-id 1 not advertised to any peer
VPN AF advertise information:
Path-id 1 not advertised to any peer
|
Example 9-18: Advertise-pip and advertise virtual-rmac.
With the advertise-pip and advertise virtual-rmac commands, the next-hop behavior changes slightly. From Capture 9-5, we can see that the MAC Advertisement Route (Type-2) still uses the VIP as the next-hop (HEX 17 = DEC 23).
Capture 9-5: Route Type-2
The IP Prefix Route (Type-5), in contrast, uses the PIP as the next-hop address (HEX 67 = DEC 103).
Capture 9-6: Route Type-5
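The resulting next-hop behavior can be summarized in a small sketch (an illustration of the observed behavior, not NX-OS code): Route Type-2 keeps the shared VIP, while Route Type-5 carries the advertising switch's PIP.

VIP, PIP = "192.168.100.23", "192.168.100.103"   # addresses from the example lab

def evpn_next_hop(route_type: int, advertise_pip: bool = True) -> str:
    # With advertise-pip enabled, IP Prefix Routes (Type-5) are sent
    # with the PIP; MAC/IP routes (Type-2) keep the anycast VIP.
    return PIP if (route_type == 5 and advertise_pip) else VIP

assert evpn_next_hop(2) == VIP   # Capture 9-5
assert evpn_next_hop(5) == PIP   # Capture 9-6
|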
We can verify this also from the BGP table of Leaf-101. The host Cafe MAC and MAC/IP routes have 192.168.100.23 (VIP) as the next-hop, while external network 172.16.77.0/24 has 192.168.100.103 (PIP) as the next-hop address.
Leaf-101# show bgp l2vpn evpn
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 346, Local Router ID is 192.168.77.101
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 192.168.77.101:32777 (L2VNI 10000)
*>l[2]:[0]:[0]:[48]:[1000.0010.beef]:[0]:[0.0.0.0]/216
192.168.100.101 100 32768 i
* i[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
192.168.100.23 100 0 i
*>i 192.168.100.23 100 0 i
*>l[2]:[0]:[0]:[48]:[1000.0010.beef]:[32]:[192.168.11.12]/272
192.168.100.101 100 32768 i
* i[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[192.168.11.11]/272
192.168.100.23 100 0 i
*>i 192.168.100.23 100 0 i
Route Distinguisher: 192.168.77.102:3
*>i[5]:[0]:[0]:[24]:[192.168.11.0]:[0.0.0.0]/224
192.168.100.102 100 0 i
Route Distinguisher: 192.168.77.102:32777
*>i[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
192.168.100.23 100 0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[192.168.11.11]/272
192.168.100.23 100 0 i
Route Distinguisher: 192.168.77.103:3
*>i[5]:[0]:[0]:[24]:[172.16.77.0]:[0.0.0.0]/224
192.168.100.103 0 100 0 64577 i
*>i[5]:[0]:[0]:[24]:[192.168.11.0]:[0.0.0.0]/224
192.168.100.103 100 0 i
Route Distinguisher: 192.168.77.103:32777
*>i[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
192.168.100.23 100 0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[192.168.11.11]/272
192.168.100.23 100 0 i
Route Distinguisher: 192.168.77.101:3 (L3VNI 10077)
* i[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[192.168.11.11]/272
192.168.100.23 100 0 i
*>i 192.168.100.23 100 0 i
*>i[5]:[0]:[0]:[24]:[172.16.77.0]:[0.0.0.0]/224
192.168.100.103 0 100 0 64577 i
* i[5]:[0]:[0]:[24]:[192.168.11.0]:[0.0.0.0]/224
192.168.100.103 100 0 i
*>i 192.168.100.102 100 0 i
|
Example 9-18: Leaf-101 BGP L2VPN EVPN table
One more thing: if we take a look at the TENANT77 BGP table in Example 9-19, we see that local prefixes are also advertised using the PIP.
Leaf-101# sh ip bgp vrf TENANT77
BGP routing table information for VRF TENANT77, address family IPv4 Unicast
BGP table version is 25, Local Router ID is 192.168.11.1
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup
Network Next Hop Metric LocPrf Weight Path
*>i172.16.77.0/24 192.168.100.103 0 100 0 64577 i
* i192.168.11.0/24 192.168.100.103 100 0 i
*>i 192.168.100.102 100 0 i
* i192.168.11.11/32 192.168.100.23 100 0 i
*>i 192.168.100.23 100 0 i
|
Example 9-19: TENANT77 BGP table
Author: Toni Pasanen CCIE#28158
Published: 19-August 2018
Edited: 19-August 2018 | Toni Pasanen
-------------------------------------------------
References:
Building Data Centers with VXLAN BGP EVPN – A Cisco NX-OS Perspective
ISBN-10: 1-58714-467-0 – Lukas Krattiger, Shyam Kapadia, and David Jansen
NX-OS and Cisco Nexus Switching – Next-Generation Data Center Architectures
Second Edition
ISBN-10: 1-58714-304-6 – Ron Fuller, David Jansen, and Matthew McPherson
Design and Configuration Guide: Best Practices for Virtual Port Channels (vPC) on Cisco Nexus 7000 Series Switches - Revised: June 2016
LIST OF VPC BEST PRACTICES - Peter Welcher
https://www.netcraftsmen.com/vpc-best-practices-checklist/
Appendix 1.
Configuration of Leaf-102
Leaf-102# sh run
!Command: show running-config
!Time: Sat Aug 18 12:34:00 2018
version 7.0(3)I7(1)
hostname Leaf-102
vdc Leaf-102 id 1
limit-resource vlan minimum 16 maximum 4094
limit-resource vrf minimum 2 maximum 4096
limit-resource port-channel minimum 0 maximum 511
limit-resource u4route-mem minimum 128 maximum 128
limit-resource u6route-mem minimum 96 maximum 96
limit-resource m4route-mem minimum 58 maximum 58
limit-resource m6route-mem minimum 8 maximum 8
cfs eth distribute
nv overlay evpn
feature ospf
feature bgp
feature pim
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
feature lacp
feature vpc
feature nv overlay
username admin password 5 $5$r25DfmPc$EvUgSVebL3gCPQ8e1ngSTxeKYIk4yuuPIomJKa5Lp/3 role network-admin
ip domain-lookup
snmp-server user admin network-admin auth md5 0x713961e592dd5c2401317a7e674464ac priv 0x713961e592dd5c2401317a7e674464ac localizedkey
rmon event 1 description FATAL(1) owner PMON@FATAL
rmon event 2 description CRITICAL(2) owner PMON@CRITICAL
rmon event 3 description ERROR(3) owner PMON@ERROR
rmon event 4 description WARNING(4) owner PMON@WARNING
rmon event 5 description INFORMATION(5) owner PMON@INFO
fabric forwarding anycast-gateway-mac 0001.0001.0001
ip pim rp-address 192.168.238.1 group-list 238.0.0.0/24 bidir
ip pim ssm range 232.0.0.0/8
vlan 1,10,20,77
vlan 10
name L2VNI-for-VLAN10
vn-segment 10000
vlan 20
name L2VNI-for-VLAN20
vn-segment 20000
vlan 77
name TENANT77
vn-segment 10077
spanning-tree vlan 1-3967 priority 4096
vrf context TENANT77
vni 10077
rd auto
address-family ipv4 unicast
route-target both auto
route-target both auto evpn
vrf context VPC-Peer-Keepalive
vrf context management
hardware access-list tcam region racl 512
hardware access-list tcam region arp-ether 256 double-wide
vpc domain 23
peer-switch
peer-keepalive destination 10.102.103.103 source 10.102.103.102 vrf VPC-Peer-Keepalive
delay restore 240
peer-gateway
delay restore interface-vlan 80
ip arp synchronize
interface Vlan1
no shutdown
no ip redirects
no ipv6 redirects
interface Vlan10
no shutdown
vrf member TENANT77
no ip redirects
ip address 192.168.11.1/24
no ipv6 redirects
fabric forwarding mode anycast-gateway
interface Vlan20
no shutdown
vrf member TENANT77
no ip redirects
ip address 192.168.12.1/24
no ipv6 redirects
fabric forwarding mode anycast-gateway
interface Vlan77
no shutdown
vrf member TENANT77
no ip redirects
ip forward
no ipv6 redirects
interface port-channel10
switchport mode trunk
vpc 10
interface port-channel23
switchport mode trunk
spanning-tree port type network
vpc peer-link
interface nve1
no shutdown
host-reachability protocol bgp
advertise virtual-rmac
source-interface loopback100
member vni 10000
suppress-arp
mcast-group 238.0.0.10
member vni 10077 associate-vrf
member vni 20000
suppress-arp
mcast-group 238.0.0.10
interface Ethernet1/1
no switchport
medium p2p
ip unnumbered loopback0
ip ospf network point-to-point
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
no shutdown
interface Ethernet1/2
no switchport
medium p2p
ip unnumbered loopback0
ip ospf network point-to-point
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
no shutdown
interface Ethernet1/3
description ** Link to Ethernet SW **
switchport mode trunk
channel-group 10
interface Ethernet1/4
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
interface Ethernet1/5
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
interface Ethernet1/6
no switchport
vrf member VPC-Peer-Keepalive
ip address 10.102.103.102/24
no shutdown
interface mgmt0
vrf member management
interface loopback0
description ** RID/Underlay **
ip address 192.168.0.102/32
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
interface loopback77
description ** BGP peering **
ip address 192.168.77.102/32
ip router ospf UNDERLAY-NET area 0.0.0.0
interface loopback100
description ** VTEP/Overlay **
ip address 192.168.100.102/32
ip address 192.168.100.23/32 secondary
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
line console
line vty
router ospf UNDERLAY-NET
router-id 192.168.0.102
name-lookup
router bgp 65000
router-id 192.168.77.102
timers bgp 3 9
address-family ipv4 unicast
address-family l2vpn evpn
advertise-pip
neighbor 192.168.77.11
remote-as 65000
description ** Spine-11 BGP-RR **
update-source loopback77
address-family l2vpn evpn
send-community extended
vrf TENANT77
address-family ipv4 unicast
advertise l2vpn evpn
aggregate-address 192.168.11.0/24 summary-only
neighbor 10.102.77.1
remote-as 64577
description ** External Network - Ext-Ro01 **
update-source Ethernet1/4.77
address-family ipv4 unicast
send-community
send-community extended
route-map INCOMING_POLICIES_FROM_ExtRo01 in
route-map OUTGOING_POLICIES out
neighbor 10.102.78.2
remote-as 64577
description ** External Network - Ext-Ro02 **
update-source Ethernet1/3.78
address-family ipv4 unicast
send-community
send-community extended
route-map INCOMING_POLICIES_FROM_ExtRo02 in
route-map OUTGOING_POLICIES out
evpn
vni 10000 l2
rd auto
route-target import auto
route-target export auto
vni 20000 l2
rd auto
route-target import auto
route-target export auto
|
Configuration of Leaf-103
Leaf-103# sh run
!Command: show running-config
!Time: Sat Aug 18 12:35:18 2018
version 7.0(3)I7(1)
hostname Leaf-103
vdc Leaf-103 id 1
limit-resource vlan minimum 16 maximum 4094
limit-resource vrf minimum 2 maximum 4096
limit-resource port-channel minimum 0 maximum 511
limit-resource u4route-mem minimum 248 maximum 248
limit-resource u6route-mem minimum 96 maximum 96
limit-resource m4route-mem minimum 58 maximum 58
limit-resource m6route-mem minimum 8 maximum 8
cfs eth distribute
nv overlay evpn
feature ospf
feature bgp
feature pim
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
feature lacp
feature vpc
feature nv overlay
no password strength-check
username admin password 5 $5$.82HC6Bt$QEpUIVi292elRGmwWNLciK2xa2z13xVwsGhdp2BMU0D role network-admin
ip domain-lookup
snmp-server user admin network-admin auth md5 0x7f693b750cc7550144b8410e07eecf1d priv 0x7f693b750cc7550144b8410e07eecf1d localizedkey
rmon event 1 description FATAL(1) owner PMON@FATAL
rmon event 2 description CRITICAL(2) owner PMON@CRITICAL
rmon event 3 description ERROR(3) owner PMON@ERROR
rmon event 4 description WARNING(4) owner PMON@WARNING
rmon event 5 description INFORMATION(5) owner PMON@INFO
fabric forwarding anycast-gateway-mac 0001.0001.0001
ip pim rp-address 192.168.238.1 group-list 238.0.0.0/24 bidir
ip pim ssm range 232.0.0.0/8
vlan 1,10,20,77
vlan 10
name L2VNI-for-VLAN10
vn-segment 10000
vlan 20
name L2VNI-for-VLAN20
vn-segment 20000
vlan 77
name TENANT77
vn-segment 10077
spanning-tree vlan 1-3967 priority 4096
vrf context TENANT77
vni 10077
rd auto
address-family ipv4 unicast
route-target both auto
route-target both auto evpn
vrf context VPC-Peer-Keepalive
vrf context management
hardware access-list tcam region racl 512
hardware access-list tcam region arp-ether 256 double-wide
vpc domain 23
peer-switch
peer-keepalive destination 10.102.103.102 source 10.102.103.103 vrf VPC-Peer-Keepalive
delay restore 240
peer-gateway
delay restore interface-vlan 80
ip arp synchronize
interface Vlan1
no shutdown
no ip redirects
no ipv6 redirects
interface Vlan10
no shutdown
vrf member TENANT77
no ip redirects
ip address 192.168.11.1/24
no ipv6 redirects
fabric forwarding mode anycast-gateway
interface Vlan20
no shutdown
vrf member TENANT77
no ip redirects
ip address 192.168.12.1/24
no ipv6 redirects
fabric forwarding mode anycast-gateway
interface Vlan77
no shutdown
vrf member TENANT77
no ip redirects
ip forward
no ipv6 redirects
interface port-channel10
switchport mode trunk
vpc 10
interface port-channel23
switchport mode trunk
spanning-tree port type network
vpc peer-link
interface nve1
no shutdown
host-reachability protocol bgp
advertise virtual-rmac
source-interface loopback100
member vni 10000
suppress-arp
mcast-group 238.0.0.10
member vni 10077 associate-vrf
member vni 20000
suppress-arp
mcast-group 238.0.0.10
interface Ethernet1/1
no switchport
medium p2p
ip unnumbered loopback0
ip ospf network point-to-point
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
no shutdown
interface Ethernet1/2
no switchport
medium p2p
ip unnumbered loopback0
ip ospf network point-to-point
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
no shutdown
interface Ethernet1/3
description ** Link to Ethernet SW **
switchport mode trunk
channel-group 10
interface Ethernet1/4
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
interface Ethernet1/5
description ** Po23 member - vPC PEER-link **
switchport mode trunk
channel-group 23 mode active
interface Ethernet1/6
no switchport
vrf member VPC-Peer-Keepalive
ip address 10.102.103.103/24
no shutdown
interface Ethernet1/7
description ** to Ext-Ro02 **
no switchport
vrf member TENANT77
ip address 10.103.77.103/24
no shutdown
interface mgmt0
vrf member management
interface loopback0
description ** RID/Underlay **
ip address 192.168.0.103/32
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
interface loopback77
description ** BGP peering **
ip address 192.168.77.103/32
ip router ospf UNDERLAY-NET area 0.0.0.0
interface loopback100
description ** VTEP/Overlay **
ip address 192.168.100.103/32
ip address 192.168.100.23/32 secondary
ip router ospf UNDERLAY-NET area 0.0.0.0
ip pim sparse-mode
line console
line vty
router ospf UNDERLAY-NET
router-id 192.168.0.103
name-lookup
router bgp 65000
router-id 192.168.77.103
timers bgp 3 9
address-family ipv4 unicast
address-family l2vpn evpn
advertise-pip
neighbor 192.168.77.11
remote-as 65000
description ** Spine-11 BGP-RR **
update-source loopback77
address-family l2vpn evpn
send-community extended
vrf TENANT77
address-family ipv4 unicast
advertise l2vpn evpn
aggregate-address 192.168.11.0/24 summary-only
neighbor 10.103.77.2
remote-as 64577
description ** External Network - Ext-Ro02 **
address-family ipv4 unicast
send-community
send-community extended
route-map INCOMING_POLICIES_FROM_ExtRo02 in
route-map OUTGOING_POLICIES out
neighbor 10.103.78.1
remote-as 64577
description ** External Network - Ext-Ro01 **
update-source Ethernet1/4.78
address-family ipv4 unicast
send-community
route-map INCOMING_POLICIES_FROM_ExtRo01 in
route-map OUTGOING_POLICIES out
evpn
vni 10000 l2
rd auto
route-target import auto
route-target export auto
vni 20000 l2
rd auto
route-target import auto
route-target export auto
|
Comments
You are doing some beautiful work man! Fantastic!
Thanks for the encouraging words!
Hi!
Thanks. Could you post the configuration of the route-maps? What are you matching?
/Mohammed
Hi Mohammed! The route-maps actually belong to the "VXLAN Part VIII: VXLAN BGP EVPN – External Connection" configuration. The route-maps do not have any role in this chapter.
Hi Toni!
Thanks for your response! This helped me a lot to understand VXLAN BGP EVPN and I just want to thank you again.
I do have another question about how the traffic goes on the overlay:
Imagine you are on the Leaf-101 router (not the Beef machine) and want to ping 192.168.11.11. How will the traffic go, and do you get any response back from 192.168.11.11?
Second question:
Imagine you are on Leaf-103 (not the Cafe machine), the link e1/3 towards Cafe's switch is down, and you want to ping 192.168.11.11. What will happen there? Would you get any response from 192.168.11.11, and how?
The last question:
If I don't need to use multicast (PIM), are there advantages or disadvantages there? I also see you are using "hardware access-list tcam region racl 512". Do I need to free up TCAM if I am using a 93180YC-EX, or can this platform handle that without any TCAM allocation?
Thanks again.
Hi Mohammed!
Thanks for the excellent questions!
If we ping from Leaf-101 using the VLAN 10 Anycast Gateway (AGW) address 192.168.11.1 as the source towards host Cafe 192.168.11.11 in VLAN 10, which is located behind Leaf-102 and Leaf-103 (both of which have the same IP 192.168.11.1 for the VLAN 10 AGW): the source IP address used in the ICMP request packets is 192.168.11.1, and when host Cafe sends the ICMP reply, it sends it towards 192.168.11.1. The Ethernet switch may send the packet either to Leaf-102 or Leaf-103. Since both Leaf switches "own" the IP address 192.168.11.1, they will not send the ICMP reply back to the querier Leaf-101. So we will not get an ICMP reply for the ICMP request.
In the second scenario, where link e1/3 on Leaf-103 is down and we ping from it using the AGW 192.168.11.1 as the source towards host Cafe 192.168.11.11, the ICMP request is sent over the Peer-Link to Leaf-102, which in turn forwards the packets to host Cafe over link e1/3. Host Cafe sends the ICMP reply towards Leaf-102 (the only possible path) using destination IP 192.168.11.1. When Leaf-102 receives the packet, it will not forward it to Leaf-103 since it "owns" the destination IP 192.168.11.1. By the way, even when the link between Leaf-103 and the Ethernet switch is up, we might end up in the same situation, depending on which path is selected by the channel hash algorithm. So in the case of vPC, we are not able to predict whether the ping works or not.
If you only have a couple of switches, you could use ingress replication instead of Multicast in the Underlay network for BUM traffic.
Hi Toni!
Thanks for your response!
If I understand you right, it means the ping from the leafs doesn't work at all. Good. Is there any way to test a ping from the switch to some host connected to the remote leafs?
About "if you only have a couple of switches": are there any limitations on the number of switches? I have Leaf-1/2 vPC, Leaf-3/4 vPC, and two Spine switches.
I am also trying to use a Spine-border. Are there any disadvantages or advantages to using the Spines as a border?
Thanks
First, there is the Operations, Administration, and Maintenance (OAM) model, where we have advanced tools for monitoring and troubleshooting (NVE ping and path trace, among other things). I will write a post about OAM later (I have one topic before that).
Second, if the Spine switch is used as a border node, it will become a VTEP (in addition to being a Spine). There will naturally be external peers, and we might need additional Control Plane protocols such as MPLS LDP, some dynamic routing protocol, etc., on the Spine/Border switch. This adds complexity to the Spine (increasing OPEX). In turn, by using the Spine as a border node, you can get savings from a CAPEX point of view.
Hi Toni!
OK, thanks. I will just wait for the OAM post; it is fantastic to hear that you are planning this. I am very thankful for your work, thank you very much.
About "If you only have a couple of switches, you could use ingress replication instead of Multicast in the Underlay network for BUM traffic": are there any limitations on the number of switches? I have Leaf-1/2 vPC, Leaf-3/4 vPC, and two Spine switches.
Thank you very much for your time.
A little update: I was missing ip pim rp-address [anycast_rp_addr] on the Spines.
All is working perfectly now.
I have faced an interesting problem: when uplinks from Leaf-102 OR Leaf-103 (vPC pair) go down, spanning-tree on the switch with Po10 blocks the port-channel with the message "Desg BLK 4 128.67 P2p Dispute".
Has anyone seen this before?
Great posts Toni, really detailed and educational. Quick question: on VIRL, what kind of host are you using? I am using the Ubuntu server, but for some reason I cannot ping my default gateway. Am I missing something here? I have configured my default gateway and IP address on the server.
Thanks,
George
Hi George,
I am using a router as a client. For a quick check, you can verify that the STP root is on the VTEP switch.
Toni
Hi Toni, thanks for your reply. I have managed to add the Ubuntu hosts into the fabric, and now I can see that I have learnt both the MAC and IP host addresses. Quick one: did you notice on VIRL that the BGP sessions between the Leaf and Spines sometimes remain idle, so that I had to clear the sessions? Anyway, seriously, you have done a really great job here. I liked your explanation of ESI and the algorithm regarding the DF election.
Thanks again.
Thanks George for your kind comments. And yes, virtual devices on VIRL sometimes have to be rebooted to make things work. In my case the problems are related to the L2IOS VLAN database.
Cheers - Toni
Hi Toni,
Your blog is the lifespring providing me the Cisco VXLAN knowledge I need!!!
I actually read your blog over and over again in order to better understand it.
For vPC I noticed you are using a secondary IP (192.168.100.23) on loopback 100 on both vPC devices.
I believe the spine switch points to 192.168.100.23 as its BGP peer.
While I have some problems here:
1. Is this a Cisco-recommended config here, I mean using a secondary IP?
2. What MAC address is the spine switch using in order to forward packets to 192.168.100.23?
3. If it were me, I would use HSRP on vPC as it can be active-active as well.
4. I am actually confused by this config. Since eth1 is not a port-channel and loopback 100 is not HSRP, won't this config cause an IP conflict?
Yours sincerely
Michael
Hi Michael,
Excellent questions once again!
1) I am using a couple of Cisco documents as a source in this chapter, but since vPC+ (no peer/keepalive link required) is now available, I am not 100% sure what their current best practice is for doing this.
2) The MAC address that Spine-11 uses depends on the result of the ECMP hash. If the result points to Leaf-102, the MAC of its core interface is used, and if Leaf-103 is selected, then its core interface MAC is used. The Router-MAC BGP Extended Community Path Attribute (PA) carried in the BGP Update depends on the vPC configuration. If we are using "advertise virtual-rmac", then the virtual MAC is used instead of the vPC system MAC.
vPC peers use the same VIP. This way they are seen by remote leafs as one unit. This is the same kind of model as what HSRP shows to the LAN side. There is no IP conflict.
The BGP peering model does not change here; the BGP L2VPN EVPN AFI peering is still between the Loopback 77 interfaces (192.168.77.sw-id/32).
If I ever rewrite this chapter, I will include these points in it.
Thanks - Toni
Hi Toni,
I really appreciate your patience.
Sorry, I ignored the time when you wrote this post. At the time, vPC+ was not launched yet.
I have to sigh here. Technology advances at such a rapid speed. Moore's Law works in networking as well.
BTW, what software are you using to draw these diagrams? They look awesome!!!
Yours sincerely
Michael
Hi Toni,
Another question just came to my mind (the further you extend your knowledge scope, the more questions you will have!), and please allow me to trouble you again.
In this blog, you are using the same secondary IP on Leaf-102 and Leaf-103 to form a BGP peering with the spine, and when Cafe joins the network, both Leaf-102 and Leaf-103 will forward Type-2 routes, including Cafe's MAC and IP, to the spine.
Let us assume we are using vPC+ here; Leaf-102 and Leaf-103 will form an active/active HSRP. So in this case, when Cafe comes into the network, will both Leaf-102 and Leaf-103 send Type-2 routes to the spine, OR does only one of them send the update?
As far as I can understand, since vPC+ will use the VIP and VMAC for HSRP and include them in Type-2, one update should be enough. And if this is true, using vPC+ in this scenario should be better, as it avoids a second Type-2 being sent.
Regards
Michael
Hi Michael,
Knowledge sharing is the reason for this blog, so please feel welcome to ask questions (for sure I cannot answer them all, but I do my best). I haven't been able to test vPC+ with a real physical device yet, and it is not supported in the NX-OSv 9.3.1 that I am currently using. I have my assumptions about how it works, but as long as I do not have exact information, I'll be quiet :)
Hi Michael,
The colored figures are made with PowerPoint, and the black-and-white figures I am using in newer posts are made with MS Visio. The icons are self-made.
Cheers - Toni
Thanks for posting this lab, I'm about to deploy VXLAN. Coming from a vPC+ environment, I'd like to vPC 2 of my 4 leafs so I can connect downstream L2 switches for port density. My questions are:
1. Will HSRP bring any advantage vs anycast gateway?
2. Is the vPC primary responding to any ARP request from cafe/beef clients while the vPC secondary is just standing by?
Great posts Toni, thanks for sharing your awesome knowledge.
Thanks. I have not had time to write new posts because I have been quite busy with my VXLAN book project. Now that it is finally complete, I'll try to find time to start writing again :)
DeleteHello Toni,
ReplyDeleteThat was very educational. I am trying to impelement a similar topology. I use 2x 9396 in a vPC pair. I have an orphan host in LEAF_A and the external connectivity in LEAF_B. Between the vPC pair I have L3 underlay connectivity using a physical L3 port. I am trying to ping the external network from host in LEAF_A but I am getting "ttl expired in transit" message. The default route is installed to LEAF_B and is propagated to LEAF_A. Can you provide any insights on it?
Hi, It looks like you have routing loop in your environment. Have you checked this blog post https://nwktimes.blogspot.com/2018/09/vxlan-part-xi-using-vpc-peer-link-as.html that describes how VPC peer-link is used as Underlay Network backup path. Note that Cisco recommends that orphan hosts should be connected to primary VPC peer.
DeleteHi, Thank you for your answer. I actually read this article. In my case (using Cisco 9396 switches) I cannot find the command 'system nve infra-vlans'. Do you know if similar command exists for these switches? Also, I don't use vPC peer link for L3 connectivity. I am using physical interfaces. If you want, I can sent you command outputs and configuration sections. Thank you!
DeleteThis comment has been removed by the author.
ReplyDeletegreat info CISCO Meraki Switches Firewall
ReplyDeleteHello Tony,
ReplyDeleteI am Jignesh, I work in mobitv as Sr. Network Engg. I recently started reading your blog and I really like it (https://nwktimes.blogspot.com/). I also order book. I don't have any way to ask question to you. Do you have any info or answer for below question. If you already have answers in your blog than i have not read all blog yet and if you have not answer i would love to know all following things or if you can write articles for us. it would be great.
Question Follows:
In ARP supression Mode Under VXLAN:
A- Unicast ARP reply keep alive mechanisma between be dead to check host liveness (window use it)
B- how DHCP relay work in VXLAN and does it learn MAC and IP mapping in L2FWD VXLAN?
C- how duplicate IP for endhost will check in static and dynamic(dhcp).
D- do you think proxy arp will be disable automatically if you enable arp suppresssion?
d- it never needed beacuse of anycast gateway and spine-leaf architecture.
E- do you think Layer-2 Protocol like CDP, LLDP will work without any issue? Does it required any special Route type?
I already have drop message on your linkedln. I am sorry if i ask questions which look like dummy.
thank you!- Jignesh
Hi!
Great post. I am just wondering whether Leaf-102 and Leaf-103 need a BGP peering with each other, as they are in a vPC configuration (BGP backup session on SVI): https://www.cisco.com/c/en/us/support/docs/switches/nexus-9000-series-switches/214624-configure-system-nve-infra-vlans-in-vxla.html
Do you have an idea?
In vPC I don't think they need any BGP peering; they will act like a single switch towards the other VTEPs.
That's right, a BGP peering is not configured between vPC members. In a design where the peer-link is used as an underlay backup link, the routing protocol used in the underlay should be enabled on it (IGP or BGP). If we have a Multicast-enabled Underlay network, PIM should also be enabled.
DeleteVip full form
ReplyDeleteVip full form
Toni,
You are doing great work; I bought your books also. Thanks for your work. I have a question. I built a Spine-Leaf network, but my requirement is to have all L2VNIs (because I have a Cisco ASA firewall as the gateway for all my VLANs in the network). In that case, can I enable "suppress-arp" for ARP suppression? Cisco says you can only enable ARP suppression with an L3VNI where you have an Anycast Gateway. Is that true?
I'm not sure why they say that, because even when there is no AGW configured for the L2VN and the fabric is used only as an L2 transit network, there is still a MAC-IP NLRI carried in EVPN RT 2 (MAC Advertisement). I checked this from my book on page 234, where there is a BGP entry having both the MAC and IP address information of host abba in VLAN 30, which uses the fabric as L2 transit. However, I am not sure if that information is actually installed in the ARP suppression cache if there is no AGW for that L2VN. That might be the reason for not enabling ARP suppression for an L2 transit network. You could verify it by using the command "show ip arp suppression-cache detail" and checking if there is an ARP cache entry. Check also from the BGP table whether there are both MAC-only and MAC-IP entries for your hosts.
DeleteHi Tony, if the vPC border leafs also work as border gateway in multisite setup with point to point connections between the border gateways in two DCs then if the DCI on one of the border leaf go down then how an orphan port on DCI down border leaf will communicate with rest of the fabric?
ReplyDeleteThank you