Wednesday 19 June 2019

EVPN ESI Multihoming Part III: Data Flows and Link Failures


You can now also download my VXLAN book from Leanpub.com:

"Virtual Extensible LAN VXLAN - A Practical Guide to VXLAN Solution Part 1" (373 pages)

This chapter explains the EVPN ESI Multihoming data flows. The first section covers Intra-VNI (L2VNI) unicast traffic, and the second section introduces BUM traffic. Figure 1-1 shows the topology and addressing scheme used in this chapter. Complete configurations of Leaf-102 and Leaf-103 can be found at the end of the document.



Figure 1-1: Topology and addressing scheme.



Introduction

Examples 1-1, 1-2, and 1-3 show the MAC address tables of the leaf switches in a stable situation where all inter-switch links are up. To generate data flows in the network, hosts Abba and Beef send ICMP requests to all hosts (Abba/Beef/Bebe/Face/Cafe) and to the VLAN 10 AGW address every five seconds. Note that the MAC address table is updated based on the current data flows. For example, if host Abba has only one data flow at time T1, the LACP hash algorithm might place it on the link to Leaf-102 only. In that case, Leaf-103 learns the MAC address only via BGP from Spine-11.

The data path between hosts Abba and Beef uses an optimal path either via Leaf-102 or via Leaf-103, depending on the result of the LACP hash algorithm in ASW-104 and ASW-105. For the data path from Abba to orphan host Bebe, the optimal path is via Leaf-102 and the sub-optimal path is via Leaf-103 > Spine-11 > Leaf-102. Which one is used again depends on the result of the LACP hash algorithm. The same rule applies to the data paths from host Abba to Face and from Beef to either orphan host. Return traffic from the orphan hosts uses the optimal path. Data paths from host Cafe to Abba and Beef can be load-balanced per flow.
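The per-flow behaviour described above can be sketched in a few lines of Python. This is only an illustration of hash-based member-link selection: the CRC32 hash, the function name and the link labels are assumptions, not the algorithm that ASW-104 or ASW-105 actually runs.

```python
import zlib
from typing import List

def pick_member_link(src_mac: str, dst_mac: str, links: List[str]) -> str:
    """Pick one port-channel member link for a flow (illustrative hash only)."""
    key = f"{src_mac}-{dst_mac}".encode()
    return links[zlib.crc32(key) % len(links)]

uplinks = ["to Leaf-102", "to Leaf-103"]   # hypothetical labels for the two uplinks
flow = ("1000.0010.abba", "1000.0010.bebe")

# The same flow always lands on the same uplink, so a host with a single
# flow is seen in the data plane by only one of the two leaf switches.
first = pick_member_link(*flow, uplinks)
second = pick_member_link(*flow, uplinks)
print(first == second)  # True
```

Whether a given flow takes the optimal or the sub-optimal path therefore depends only on which uplink the hash selects.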

Leaf-102# sh sys int l2fwder mac | i abba|beef|bebe|face|cafe
*    10    1000.0010.cafe    static   -          F     F  nve-peer2 192.168.100.101
*    10    1000.0010.beef   dynamic   02:02:15   F     F      Po235
*    10    1000.0010.abba   dynamic   02:02:17   F     F      Po234
*    10    1000.0010.bebe   dynamic   02:02:17   F     F     Eth1/3
*    10    1000.0010.face    static   -          F     F  nve-peer1 192.168.100.103
Example 1-1: MAC address table of Leaf-102.

Leaf-103#  sh sys int l2fwder mac | i abba|beef|bebe|face|cafe
*    10    1000.0010.cafe    static   -          F     F  nve-peer2 192.168.100.101
*    10    1000.0010.beef   dynamic   02:03:10   F     F      Po235
*    10    1000.0010.abba   dynamic   02:03:12   F     F      Po234
*    10    1000.0010.bebe    static   -          F     F  nve-peer1 192.168.100.102
*    10    1000.0010.face   dynamic   02:03:11   F     F     Eth1/3
Example 1-2: MAC address table of Leaf-103.

Leaf-101# sh sys int l2fwder mac | i abba|beef|bebe|face|cafe
*    10    1000.0010.cafe   dynamic   02:27:47   F     F     Eth1/3
*    10    1000.0010.beef    static   -          F     F  nve-peer2 192.168.100.102
*    10    1000.0010.abba    static   -          F     F  nve-peer1 192.168.100.103
*    10    1000.0010.bebe    static   -          F     F  nve-peer2 192.168.100.102
*    10    1000.0010.face    static   -          F     F  nve-peer1 192.168.100.103
Example 1-3: MAC address table of Leaf-101.

The L2RIB of Leaf-102 shows that the MAC addresses of local hosts Abba (ES 1234) and Beef (ES 4321) are learned from the connected port-channels and, in addition, from the redundancy group peer switch Leaf-103 via BGP.

Leaf-102# sh l2route mac all | i abba|beef|bebe|face|cafe
10          1000.0010.abba Local  L,Dup,PF,     39         Po234
10          1000.0010.abba BGP    Dup,PF,SplRcv 38         192.168.100.103
10          1000.0010.bebe Local  L,            0          Eth1/3
10          1000.0010.beef Local  L,Dup,PF,     38         Po235
10          1000.0010.beef BGP    Dup,PF,SplRcv 37         192.168.100.103
10          1000.0010.cafe BGP    SplRcv        0          192.168.100.101
10          1000.0010.face BGP    SplRcv        0          192.168.100.103
Example 1-4: L2RIB on Leaf-102.



Intra-VNI (L2VNI): Unicast Traffic

Scenario 1: Link E1/4 down on Leaf-102

Figure 1-2: Link failure on Leaf-102.


When the inter-switch link e1/4 connected to ASW-105 goes down on Leaf-102 (Figure 1-2), Leaf-102 has to remove itself from the redundancy group used for ESI: 0301.0201.0302.3400.10e1 by withdrawing its Ethernet Segment Route (BGP EVPN Route-Type 4). This withdrawal is processed only by Leaf-103, which belongs to the same redundancy group (based on Route-Target). When Leaf-103 receives the message, it updates its Ethernet Segment information. Example 1-5 shows that Leaf-103 is now the Designated Forwarder (DF) for all active VLANs on ESI: 0301.0201.0302.3400.10e1, while Leaf-102 is still the DF for VLAN 10 on ESI: 0301.0201.0302.3400.04d2.
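The ESI values used in this chapter can be reproduced from the port-channel configuration at the end of the document. The sketch below assumes the ESI is a type 3 ESI as defined in RFC 7432 (one type byte, the 6-byte system MAC, and a 3-byte local discriminator) and that the ethernet-segment number is used as the local discriminator. The function name is illustrative.

```python
def build_esi_type3(system_mac: str, local_discriminator: int) -> str:
    """Return a type 3 ESI in the dotted format shown in the show outputs."""
    mac_hex = system_mac.replace(".", "").replace(":", "").lower()
    if len(mac_hex) != 12:
        raise ValueError("system MAC must be 6 bytes")
    if not 0 <= local_discriminator < 2**24:
        raise ValueError("local discriminator must fit in 3 bytes")
    # 1 byte type (0x03) + 6 bytes system MAC + 3 bytes local discriminator
    raw = "03" + mac_hex + f"{local_discriminator:06x}"
    return ".".join(raw[i:i + 4] for i in range(0, len(raw), 4))

print(build_esi_type3("0102.0103.0234", 1234))  # 0301.0201.0302.3400.04d2
print(build_esi_type3("0102.0103.0234", 4321))  # 0301.0201.0302.3400.10e1
```

Under these assumptions, ES 1234 (port-channel234) maps to the ESI ending in 04d2 and ES 4321 (port-channel235) to the ESI ending in 10e1, which matches the outputs in this chapter.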


Leaf-103# sh nve ethernet-segment
ESI: 0301.0201.0302.3400.04d2
   Parent interface: port-channel234
  ES State: Up
  Port-channel state: Up
  NVE Interface: nve1
   NVE State: Up
   Host Learning Mode: control-plane
  Active Vlans: 10-11
   DF Vlans: 11
   Active VNIs: 10000-10001
  CC failed for VLANs:
  VLAN CC timer: 0
  Number of ES members: 2
  My ordinal: 1
  DF timer start time: 00:00:00
  Config State: config-applied
  DF List: 192.168.100.102 192.168.100.103
  ES route added to L2RIB: True
  EAD/ES routes added to L2RIB: True
  EAD/EVI route timer age: not running
----------------------------------------

ESI: 0301.0201.0302.3400.10e1
   Parent interface: port-channel235
  ES State: Up
  Port-channel state: Up
  NVE Interface: nve1
   NVE State: Up
   Host Learning Mode: control-plane
  Active Vlans: 10-11
   DF Vlans: 10-11
   Active VNIs: 10000-10001
  CC failed for VLANs:
  VLAN CC timer: 0
  Number of ES members: 1
  My ordinal: 0
  DF timer start time: 00:00:00
  Config State: config-applied
  DF List: 192.168.100.103
  ES route added to L2RIB: True
  EAD/ES routes added to L2RIB: True
  EAD/EVI route timer age: not running
----------------------------------------
Leaf-103#
Example 1-5: ES information on Leaf-103.
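The DF values in Example 1-5 are consistent with the default DF election of RFC 7432, where the ES members are ordered by IP address and the DF for a VLAN is the member whose ordinal equals VLAN mod N. The following sketch assumes that this default procedure is what produces the output above. The function name is illustrative.

```python
import ipaddress
from typing import Dict, List

def elect_df(vtep_ips: List[str], vlans: List[int]) -> Dict[int, str]:
    """Map each VLAN to its DF using the RFC 7432 default (VLAN mod N) rule."""
    ordered = sorted(vtep_ips, key=ipaddress.ip_address)  # ordinal 0 = lowest IP
    return {vlan: ordered[vlan % len(ordered)] for vlan in vlans}

# ES with two members (ESI ...04d2 in Example 1-5)
print(elect_df(["192.168.100.103", "192.168.100.102"], [10, 11]))
# {10: '192.168.100.102', 11: '192.168.100.103'}

# ES with one member left after Leaf-102 withdrew its ES route (ESI ...10e1)
print(elect_df(["192.168.100.103"], [10, 11]))
# {10: '192.168.100.103', 11: '192.168.100.103'}
```

With two members, Leaf-103 (ordinal 1) is DF for VLAN 11 only. With one member, its ordinal is 0 and it becomes DF for both VLANs.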

In addition, Leaf-102 has to inform the remote leafs that it can no longer be used for load balancing towards destination MAC addresses that are advertised with ESI: 0301.0201.0302.3400.10e1. This is done with the Ethernet A-D ES route (BGP EVPN Route-Type 1). By using the Ethernet A-D EVI/ES route, Leaf-102 also withdraws all the MAC addresses in VNI 10000 (VLAN 10) and VNI 10001 (VLAN 11) that have the ESI: 0301.0201.0302.3400.10e1 value attached to them. This process is called mass withdrawal. As a reaction to the message, Leaf-101 updates the next-hop information.
Example 1-6 shows the BGP table of Leaf-101 before the withdrawal and Example 1-7 after it. ESI: 0301.0201.0302.3400.10e1 and host Beef are now advertised only by Leaf-103.


Leaf-101# sh bgp l2vpn evpn vni-id 10000
<snipped>
Route Distinguisher: 192.168.77.101:32777    (L2VNI 10000)
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i
*|i                   192.168.100.103                   100          0 i
*|i[1]:[0301.0201.0302.3400.10e1]:[0x0]/152
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.bebe]:[0]:[0.0.0.0]/216
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.beef]:[0]:[0.0.0.0]/216
                      192.168.100.102                   100          0 i
*>l[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
                      192.168.100.101                   100      32768 i
*>i[2]:[0]:[0]:[48]:[1000.0010.face]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*|i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.bebe]:[32]:[172.16.10.102]/272
                      192.168.100.102                   100          0 i
*|i[2]:[0]:[0]:[48]:[1000.0010.beef]:[32]:[172.16.10.105]/272
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i
*>l[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272
                      192.168.100.101                   100      32768 i
*>i[2]:[0]:[0]:[48]:[1000.0010.face]:[32]:[172.16.10.103]/272
                      192.168.100.103                 
Example 1-6: BGP table on Leaf-101 before withdrawal.


Leaf-101# sh bgp l2vpn evpn vni-id 10000
<snipped>
Route Distinguisher: 192.168.77.101:32777    (L2VNI 10000)
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i
*|i                   192.168.100.103                   100          0 i
*>i[1]:[0301.0201.0302.3400.10e1]:[0x0]/152
                      192.168.100.103                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.bebe]:[0]:[0.0.0.0]/216
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.beef]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*>l[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
                      192.168.100.101                   100      32768 i
*>i[2]:[0]:[0]:[48]:[1000.0010.face]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*|i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.bebe]:[32]:[172.16.10.102]/272
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.beef]:[32]:[172.16.10.105]/272
                      192.168.100.103                   100          0 i
*>l[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272
                      192.168.100.101                   100      32768 i
*>i[2]:[0]:[0]:[48]:[1000.0010.face]:[32]:[172.16.10.103]/272
                      192.168.100.103                   100          0 i
Example 1-7: BGP table on Leaf-101 after withdrawal.
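The effect of the withdrawal on a remote leaf can be modelled as a lookup through the ESI. This is a simplified conceptual sketch, not the NX-OS implementation: the two dictionaries and both function names are assumptions made for illustration, and only the addresses are taken from the examples above.

```python
from typing import Dict, Set

# Ethernet A-D per-ES routes as seen by the remote leaf: ESI -> advertising VTEPs
ead_es: Dict[str, Set[str]] = {
    "0301.0201.0302.3400.04d2": {"192.168.100.102", "192.168.100.103"},
    "0301.0201.0302.3400.10e1": {"192.168.100.102", "192.168.100.103"},
}
# MAC address -> ESI it is attached to
mac_to_esi: Dict[str, str] = {
    "1000.0010.abba": "0301.0201.0302.3400.04d2",
    "1000.0010.beef": "0301.0201.0302.3400.10e1",
}

def next_hops(mac: str) -> Set[str]:
    """VTEPs that can be used to reach a MAC behind an Ethernet Segment."""
    return set(ead_es[mac_to_esi[mac]])

def withdraw_ead_es(esi: str, vtep: str) -> None:
    """One withdrawal removes the VTEP for every MAC behind that ESI."""
    ead_es[esi].discard(vtep)

withdraw_ead_es("0301.0201.0302.3400.10e1", "192.168.100.102")
print(sorted(next_hops("1000.0010.beef")))  # ['192.168.100.103']
print(sorted(next_hops("1000.0010.abba")))  # ['192.168.100.102', '192.168.100.103']
```

The point of the model is that a single route withdrawal changes the reachability of every MAC address attached to the ESI, without a per-MAC update.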


As the last step, Leaf-102 updates its own MAC address table. Now host Beef is only learned from Leaf-103 via BGP.

Leaf-102# sh sys int l2fwd mac | i abba|beef|bebe|face|cafe
*    10    1000.0010.cafe    static   -          F     F  nve-peer2 192.168.100.101
*    10    1000.0010.beef    static   -          F     F  nve-peer1 192.168.100.103
*    10    1000.0010.abba   dynamic   03:12:26   F     F      Po234
*    10    1000.0010.bebe   dynamic   03:12:26   F     F     Eth1/3
*    10    1000.0010.face    static   -          F     F  nve-peer1 192.168.100.103
Example 1-8: MAC address table of Leaf-102 after link failure.

Scenario 2: Core link down on Leaf-102

Figure 1-3: Core link failure of Leaf-102.

When Leaf-102 loses all of its core links, it isolates itself from the network by shutting down all the links that participate in any Ethernet Segment. As a result, orphan host Bebe is also isolated from the network even though its uplink interface e1/3 stays up. Traffic between Abba, Beef, Face, and Cafe is now switched via Leaf-103.


Leaf-102# sh interface port-channel 234 | i 234
port-channel234 is down (NVE core link down)

Leaf-102# sh interface port-channel 235 | i 235
port-channel235 is down (NVE core link down)
Example 1-9: Core link failure on Leaf-102.
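The core-tracking behaviour can be summarised as a simple rule. The sketch below is a minimal model of what the text describes for interfaces configured with "evpn multihoming core-tracking". The function name and the data structures are assumptions for illustration.

```python
from typing import Dict, List

def es_ports_to_bring_down(core_links: Dict[str, bool], es_ports: List[str]) -> List[str]:
    """Return the ES port-channels to disable once every tracked core link is down."""
    if any(core_links.values()):
        return []            # at least one core link is up: keep the ES links up
    return list(es_ports)    # no path to the fabric: take all ES links down

es_ports = ["port-channel234", "port-channel235"]
print(es_ports_to_bring_down({"Ethernet1/1": True}, es_ports))   # []
print(es_ports_to_bring_down({"Ethernet1/1": False}, es_ports))  # ['port-channel234', 'port-channel235']
```

The orphan port Eth1/3 is not part of any Ethernet Segment, so it is not in the list and stays up, which is why host Bebe loses connectivity in this scenario.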

Examples 1-10 and 1-11 show that Leaf-103 has removed orphan host Bebe from both its MAC address table and its L2RIB.


Leaf-103# sh sys int l2fwd mac | i abba|beef|bebe|face|cafe
*    10    1000.0010.cafe    static   -          F     F  nve-peer2 192.168.100.101
*    10    1000.0010.beef   dynamic   00:03:51   F     F      Po235
*    10    1000.0010.abba   dynamic   00:02:04   F     F      Po234
*    10    1000.0010.face   dynamic   03:35:35   F     F     Eth1/3
Example 1-10: MAC address table of Leaf-103.


Leaf-103# sh l2route evpn mac all | i abba|beef|bebe|face|cafe
10          1000.0010.abba Local  L,            0          Po234
10          1000.0010.beef Local  L,            0          Po235
10          1000.0010.cafe BGP    SplRcv        0          192.168.100.101
10          1000.0010.face Local  L,            0          Eth1/3
Example 1-11: L2RIB of Leaf-103.


Intra-VNI (L2VNI): Broadcast, Unknown Unicast and Multicast (BUM) traffic


Scenario 1: Traffic flow from Designated Forwarder

Figure 1-4 illustrates the situation where host Abba in VLAN 10 sends an Ethernet frame with an unknown destination MAC address. The LACP hash algorithm of ASW-104 forwards the frame out of interface e1/1 to Leaf-102 (1). Leaf-102 forwards the frame out of e1/3 to orphan host Bebe (2). Because Leaf-102 is the Designated Forwarder (DF) for VLAN 10, it also forwards the frame to the Ethernet Segment with ESI: 0301.0201.0302.3400.10e1, where VLAN 10 is allowed (3). For BUM traffic, all leaf switches use multicast group 238.0.0.10, where Spine-11 is the Designated Router (DR). Leaf-102 encapsulates the Ethernet frame with new Ethernet/IP (238.0.0.10)/UDP/VXLAN headers and sends it to Spine-11 (4). Spine-11 receives the BUM frame with the destination IP 238.0.0.10 and forwards it, based on the Outgoing Interface List (OIL) of 238.0.0.10, to Leaf-101 and Leaf-103 (5). Leaf-101 forwards the frame to host Cafe. Leaf-103 receives the frame and decapsulates it. Based on the VNI value 10000 in the VXLAN header, Leaf-103 knows that the frame belongs to VLAN 10 and forwards it to host Face (6). VLAN 10 is active on the Ethernet Segments with ESI: 0301.0201.0302.3400.10e1 and ESI: 0301.0201.0302.3400.04d2. However, Leaf-103 does not forward frames received from a remote leaf to an ES where it is not the Designated Forwarder (7-8).


Figure 1-4: BUM from DF perspective.
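As a side note on the encapsulation step, the VXLAN header that carries VNI 10000 is only eight bytes long. The sketch below builds and parses it according to the header layout in RFC 7348. The function names are illustrative, and the outer Ethernet/IP/UDP headers are left out.

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header defined in RFC 7348."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # I flag set: the VNI field is valid
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", flags << 24, vni << 8)

def vni_from_header(header: bytes) -> int:
    """Extract the VNI, as the receiving leaf does before mapping it to a VLAN."""
    _, word2 = struct.unpack("!II", header[:8])
    return word2 >> 8

hdr = vxlan_header(10000)
print(hdr.hex())             # 0800000000271000
print(vni_from_header(hdr))  # 10000
```

The receiving leaf maps the extracted VNI back to a VLAN using its local vn-segment configuration (VNI 10000 to VLAN 10 in this chapter).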

Scenario 2: Traffic flow from non-Designated Forwarder

Figure 1-5 illustrates the situation where host Abba in VLAN 10 sends an Ethernet frame with an unknown destination MAC address. The LACP hash algorithm of ASW-104 now forwards the frame out of interface e1/2 to Leaf-103 (1). Leaf-103 forwards the frame out of e1/3 to orphan host Face (2). In addition, Leaf-103 forwards the frame received from the local ESI: 0301.0201.0302.3400.04d2 to the local ESI: 0301.0201.0302.3400.10e1, where VLAN 10 is allowed (3). Leaf-103 encapsulates the Ethernet frame with new Ethernet/IP (238.0.0.10)/UDP/VXLAN headers and sends it to Spine-11 (4). Spine-11 receives the BUM frame with the destination IP 238.0.0.10 and forwards it, based on the Outgoing Interface List (OIL) of 238.0.0.10, to Leaf-101 and Leaf-102 (5). Leaf-101 forwards the frame to host Cafe. Leaf-102 receives the frame and decapsulates it. Based on the VNI value 10000 in the VXLAN header, Leaf-102 knows that the frame belongs to VLAN 10 and forwards it to host Bebe (6). VLAN 10 is active on the Ethernet Segments with ESI: 0301.0201.0302.3400.10e1 and ESI: 0301.0201.0302.3400.04d2. However, even though Leaf-102 is the DF for VLAN 10, it does not forward the frame received from the remote leaf to either local ES (7-8).


Figure 1-5: BUM from non-DF perspective.


Scenario 3: Traffic flow from remote leaf

Figure 1-6 illustrates the situation where host Cafe in VLAN 10 sends an Ethernet frame with an unknown destination MAC address. Leaf-101 encapsulates the Ethernet frame with new Ethernet/IP (238.0.0.10)/UDP/VXLAN headers and sends it to Spine-11 (1). Spine-11 receives the BUM frame with the destination IP 238.0.0.10 and forwards it, based on the Outgoing Interface List (OIL) of 238.0.0.10, to Leaf-102 and Leaf-103 (2). Leaf-102 and Leaf-103 receive the frame, decapsulate it, and forward it to their local orphan hosts Bebe and Face (3, 6). VLAN 10 is active on the Ethernet Segments with ESI: 0301.0201.0302.3400.10e1 and ESI: 0301.0201.0302.3400.04d2. Leaf-102 is the DF for VLAN 10, so it forwards the frame received from the remote leaf to both ESs. Leaf-103 (non-DF for VLAN 10) does not forward the frame received from the remote leaf to either local ES (7-8).



Figure 1-6: BUM from remote leaf perspective.
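The three BUM scenarios can be condensed into one forwarding decision per leaf. The sketch below is a model that reproduces the behaviour described in this section, not the NX-OS implementation. The rule that a frame coming from a VTEP attached to the same ES is never forwarded to that ES is an assumption (it follows the local-bias split-horizon idea of RFC 8365), and the function name and data layout are illustrative. Orphan ports always receive a copy and are not modelled.

```python
from typing import Dict, List, Optional

ES_04D2 = "0301.0201.0302.3400.04d2"
ES_10E1 = "0301.0201.0302.3400.10e1"

def egress_segments(leaf: Dict[str, dict], vlan: int,
                    ingress_es: Optional[str] = None,
                    source_vtep: Optional[str] = None) -> List[str]:
    """Return the local Ethernet Segments that receive a copy of a BUM frame."""
    out = []
    for esi, es in sorted(leaf.items()):
        if source_vtep is None:
            # Frame arrived on a local port: flood to every other local ES.
            if esi != ingress_es:
                out.append(esi)
        elif vlan in es["df_vlans"] and source_vtep not in es["peers"]:
            # Frame arrived from the fabric: only the DF forwards, and never
            # when the sending VTEP is attached to the same ES (assumption).
            out.append(esi)
    return out

leaf102 = {ES_04D2: {"df_vlans": {10}, "peers": {"192.168.100.103"}},
           ES_10E1: {"df_vlans": {10}, "peers": {"192.168.100.103"}}}
leaf103 = {ES_04D2: {"df_vlans": {11}, "peers": {"192.168.100.102"}},
           ES_10E1: {"df_vlans": {11}, "peers": {"192.168.100.102"}}}

# Scenario 1: Abba -> Leaf-102 (DF); Leaf-103 receives the copy from the fabric
print(egress_segments(leaf102, 10, ingress_es=ES_04D2))             # [ES_10E1]
print(egress_segments(leaf103, 10, source_vtep="192.168.100.102"))  # []
# Scenario 2: Abba -> Leaf-103 (non-DF); Leaf-102 receives the copy from the fabric
print(egress_segments(leaf103, 10, ingress_es=ES_04D2))             # [ES_10E1]
print(egress_segments(leaf102, 10, source_vtep="192.168.100.103"))  # []
# Scenario 3: Cafe -> Leaf-101; both leafs receive the copy from the fabric
print(egress_segments(leaf102, 10, source_vtep="192.168.100.101"))  # [ES_04D2, ES_10E1]
print(egress_segments(leaf103, 10, source_vtep="192.168.100.101"))  # []
```

In every scenario each multihomed access switch receives exactly one copy of the frame, which is the purpose of combining the DF role with the split-horizon check.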

Complete configurations of Leaf-102 and Leaf-103.

Leaf-102# sh run

!Command: show running-config
!No configuration change since last restart
!Time: Tue Jul  2 09:57:44 2019

version 9.2(3) Bios:version
hostname Leaf-102
vdc Leaf-102 id 1
  limit-resource vlan minimum 16 maximum 4094
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 511
  limit-resource u4route-mem minimum 248 maximum 248
  limit-resource u6route-mem minimum 96 maximum 96
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8

cfs ipv4 mcast-address 239.102.103.10
cfs ipv4 distribute
nv overlay evpn
feature ospf
feature bgp
feature pim
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
feature lacp
feature nv overlay

username admin password 5 $5$D5ihLQiw$JdrHtd6/cvkXGFI8NRyi7yIeUMO1Vg8HKJHz3wVCpA5  role network-admin
ip domain-lookup
ip host Leaf-103 10.0.0.103
spanning-tree mode mst
copp profile strict
evpn esi multihoming
  ethernet-segment delay-restore time 180
  vlan-consistency-check
snmp-server user admin network-admin auth md5 0x9bcc18427d4176f2aec8419a200a8bbf priv 0x9bcc18427d4176f2aec8419a200a8bbf localizedkey
rmon event 1 description FATAL(1) owner PMON@FATAL
rmon event 2 description CRITICAL(2) owner PMON@CRITICAL
rmon event 3 description ERROR(3) owner PMON@ERROR
rmon event 4 description WARNING(4) owner PMON@WARNING
rmon event 5 description INFORMATION(5) owner PMON@INFO

fabric forwarding anycast-gateway-mac 0001.0001.0001
ip pim rp-address 192.168.238.1 group-list 238.0.0.0/24 bidir
ip pim ssm range 232.0.0.0/8
vlan 1,10-11,20,77
vlan 10
  name L2VNI-for-VLAN10
  vn-segment 10000
vlan 11
  vn-segment 10001
vlan 20
  name L2VNI-for-VLAN20
  vn-segment 20000
vlan 77
  name TENANT77
  vn-segment 10077

spanning-tree domain enable
spanning-tree mst 0-1 priority 8192
spanning-tree vlan 10 priority 8192
spanning-tree mst configuration
  name VXLAN-Fabric
  instance 1 vlan 10-11
vrf context TENANT77
  vni 10077
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
vrf context management
hardware access-list tcam region racl 512
hardware access-list tcam region vpc-convergence 256
hardware access-list tcam region arp-ether 256 double-wide


interface Vlan1

interface Vlan10
  no shutdown
  vrf member TENANT77
  ip address 172.16.10.1/24
  fabric forwarding mode anycast-gateway

interface Vlan11
  no shutdown
  vrf member TENANT77
  ip address 172.16.11.1/24
  fabric forwarding mode anycast-gateway

interface Vlan77
  no shutdown
  vrf member TENANT77
  ip forward

interface port-channel234
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  ethernet-segment 1234
    system-mac 0102.0103.0234

interface port-channel235
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  ethernet-segment 4321
    system-mac 0102.0103.0234

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback100
  member vni 10000
    suppress-arp
    mcast-group 238.0.0.10
  member vni 10001
    suppress-arp
    mcast-group 238.0.0.10
  member vni 10077 associate-vrf
  member vni 20000
    suppress-arp
    mcast-group 238.0.0.10

interface Ethernet1/1
  no switchport
  evpn multihoming core-tracking
  mac-address 1eaf.0102.1e11
  medium p2p
  ip address 10.102.11.102/24
  ip ospf network point-to-point
  ip router ospf UNDERLAY-NET area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/2
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  channel-group 234 mode active

interface Ethernet1/3
  switchport access vlan 10

interface Ethernet1/4
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  channel-group 235 mode active


interface mgmt0
  vrf member management
  ip address 10.0.0.102/24

interface loopback0
  description ** RID/Underlay **
  ip address 192.168.0.102/32
  ip router ospf UNDERLAY-NET area 0.0.0.0
  ip pim sparse-mode

interface loopback77
  description ** BGP peering **
  ip address 192.168.77.102/32
  ip router ospf UNDERLAY-NET area 0.0.0.0

interface loopback100
  description ** VTEP/Overlay **
  ip address 192.168.100.102/32
  ip router ospf UNDERLAY-NET area 0.0.0.0
  ip pim sparse-mode
line console
line vty
boot nxos bootflash:/nxos.9.2.3.bin
router ospf UNDERLAY-NET
  router-id 192.168.0.102
router bgp 65000
  router-id 192.168.77.102
  address-family ipv4 unicast
  address-family l2vpn evpn
    maximum-paths 2
    maximum-paths ibgp 2
  neighbor 192.168.77.11
    remote-as 65000
    description ** Spine-11 BGP-RR **
    update-source loopback77
    address-family l2vpn evpn
      send-community extended
  vrf TENANT77
    address-family ipv4 unicast
      advertise l2vpn evpn
evpn
  vni 10000 l2
    rd auto
    route-target import auto
    route-target export auto
  vni 10001 l2
    rd auto
    route-target import auto
    route-target export auto
  vni 20000 l2
    rd auto
    route-target import auto
    route-target export auto

no logging console




Leaf-103# sh run

!Command: show running-config
!No configuration change since last restart
!Time: Tue Jul  2 10:00:30 2019

version 9.2(3) Bios:version
hostname Leaf-103
vdc Leaf-103 id 1
  limit-resource vlan minimum 16 maximum 4094
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 511
  limit-resource u4route-mem minimum 248 maximum 248
  limit-resource u6route-mem minimum 96 maximum 96
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8

cfs ipv4 mcast-address 239.102.103.10
cfs ipv4 distribute
nv overlay evpn
feature ospf
feature bgp
feature pim
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
feature lacp
feature nv overlay

username admin password 5 $5$sj/fUoxA$nbqlg/We.ipAnPnY/oRfctxwO7JvleM73s7CGR6I6F8  role network-admin
ip domain-lookup
spanning-tree mode mst
copp profile strict
evpn esi multihoming
  ethernet-segment delay-restore time 180
  vlan-consistency-check
snmp-server user admin network-admin auth md5 0x423cb9002003f0f3c3acb917bba00bf8 priv 0x423cb9002003f0f3c3acb917bba00bf8 localizedkey
rmon event 1 description FATAL(1) owner PMON@FATAL
rmon event 2 description CRITICAL(2) owner PMON@CRITICAL
rmon event 3 description ERROR(3) owner PMON@ERROR
rmon event 4 description WARNING(4) owner PMON@WARNING
rmon event 5 description INFORMATION(5) owner PMON@INFO

fabric forwarding anycast-gateway-mac 0001.0001.0001
ip pim rp-address 192.168.238.1 group-list 238.0.0.0/24 bidir
ip pim ssm range 232.0.0.0/8
vlan 1,10-11,20,77,111
vlan 10
  name L2VNI-for-VLAN10
  vn-segment 10000
vlan 11
  vn-segment 10001
vlan 20
  name L2VNI-for-VLAN20
  vn-segment 20000
vlan 77
  name TENANT77
  vn-segment 10077

spanning-tree domain enable
spanning-tree mst 0-1 priority 8192
spanning-tree vlan 10 priority 8192
spanning-tree mst configuration
  name VXLAN-Fabric
  instance 1 vlan 10-11
vrf context TENANT77
  vni 10077
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
vrf context management
hardware access-list tcam region racl 512
hardware access-list tcam region vpc-convergence 256
hardware access-list tcam region arp-ether 256 double-wide


interface Vlan1

interface Vlan10
  no shutdown
  vrf member TENANT77
  ip address 172.16.10.1/24
  fabric forwarding mode anycast-gateway

interface Vlan11
  no shutdown
  vrf member TENANT77
  ip address 172.16.11.1/24
  fabric forwarding mode anycast-gateway

interface Vlan77
  no shutdown
  vrf member TENANT77
  ip forward

interface port-channel234
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  ethernet-segment 1234
    system-mac 0102.0103.0234

interface port-channel235
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  ethernet-segment 4321
    system-mac 0102.0103.0234

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback100
  member vni 10000
    suppress-arp
    mcast-group 238.0.0.10
  member vni 10001
    suppress-arp
    mcast-group 238.0.0.10
  member vni 10077 associate-vrf
  member vni 20000
    suppress-arp
    mcast-group 238.0.0.10

interface Ethernet1/1
  no switchport
  evpn multihoming core-tracking
  mac-address 1eaf.0103.1e11
  medium p2p
  ip address 10.103.11.103/24
  ip ospf network point-to-point
  ip router ospf UNDERLAY-NET area 0.0.0.0
  ip pim sparse-mode
  no shutdown

interface Ethernet1/2
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  channel-group 234 mode active

interface Ethernet1/3
  switchport access vlan 10

interface Ethernet1/4
  switchport mode trunk
  switchport trunk allowed vlan 10-11
  channel-group 235 mode active


interface mgmt0
  vrf member management
  ip address 10.0.0.103/24

interface loopback0
  description ** RID/Underlay **
  ip address 192.168.0.103/32
  ip router ospf UNDERLAY-NET area 0.0.0.0
  ip pim sparse-mode

interface loopback77
  description ** BGP peering **
  ip address 192.168.77.103/32
  ip router ospf UNDERLAY-NET area 0.0.0.0

interface loopback100
  description ** VTEP/Overlay **
  ip address 192.168.100.103/32
  ip router ospf UNDERLAY-NET area 0.0.0.0
  ip pim sparse-mode
line console
line vty
boot nxos bootflash:/nxos.9.2.3.bin
router ospf UNDERLAY-NET
  router-id 192.168.0.103
router bgp 65000
  router-id 192.168.77.103
  address-family ipv4 unicast
  address-family l2vpn evpn
    maximum-paths 2
    maximum-paths ibgp 2
  neighbor 192.168.77.11
    remote-as 65000
    description ** Spine-11 BGP-RR **
    update-source loopback77
    address-family l2vpn evpn
      send-community extended
  vrf TENANT77
    address-family ipv4 unicast
      advertise l2vpn evpn
evpn
  vni 10000 l2
    rd auto
    route-target import auto
    route-target export auto
  vni 10001 l2
    rd auto
    route-target import auto
    route-target export auto
  vni 20000 l2
    rd auto
    route-target import auto
    route-target export auto

Author: Toni Pasanen CCIE#28158
Published: 19.6.2019
Updated: 
-------------------------------------------------
References:


RFC 7432: BGP MPLS-Based Ethernet VPN


7 comments:

  1. HI Toni,
    would you please post the full configuration of this ES part?
    also are you carrying on on this Vxlan Series?

    Cheers
    Michael

    Replies
    1. Hi Michael,
      I will add configurations later, I'll try to be fast. And yes, I will continue writing about VXLAN.

    2. Configurations of Leaf-102 and Leaf-103 can now be found at the end of the document.

  2. Toni, is the MAC address listed in "system-mac 0102.0103.0234" command line specific to each pair of participating ESI switches? Meaning if we add an additional pair of switches connected to Spine 11, say Leaf-106 & Leaf 107, and they are in their own ESI group, could we use the same MAC address or because it would require different segment IDs (assume separate Top of Rack switches with it different networks), would we need to create a different system-mac address

    Replies
    1. Even though the system-mac address is only part of the address advertised as BGP EVPN Route-Type 4 (Ethernet Segment Route) it might be a good idea to define unique system mac for each Ethernet Segment (ES). Remote VTEPs can differentiate ES based on ES Id carried in update even though we use the same sys-mac. However, LACP messages exchanged between traditional Ethernet switch (external switch) carries the system-mac information as Actor System-Id. Now if the traditional switch has more LACP peers, it might cause some troubles if it sees the same "Actor Sys-Ids" from peers belonging to different port-channel.

  3. Hi Toni

    Thanks for your sharing. Very good article.

    No idea why Cisco doesn't support multihoming on the most of N9K models:
    EVPN Multihoming is supported on the Cisco Nexus 9300 platform switches only and it is not supported on the Cisco Nexus 9200, 9300-EX/-FX/-FXP/-FX2 and 9500 platform switches.

    And 9300 here only for Merchant Silicon product..
    N9396PX(config)# evpn esi ?
    multihoming Multihoming feature

    N9364C(config)# evpn ?
    multisite Multisite
    storm-control Storm-control

    BR,
    Qifan

    Replies
    1. I think that the vPC Fabric-Peering is Cisco's preferred solution over ESI multihoming.
