Saturday, 8 June 2019

EVPN ESI Multihoming - Part II: Fast Convergence and Load Balancing




This chapter introduces the BGP EVPN Route Type 1, the Ethernet Auto-Discovery (Ethernet A-D) route. The first section explains the Ethernet A-D per Ethernet Segment (ES) route, which is mainly used for Fast Convergence. The second section discusses the Ethernet A-D per EVI/ES route, which in turn is used for Load Balancing (also called Aliasing/Backup Path).



Figure 1-1: Ethernet A-D per Ethernet Segment (ES) route.



Ethernet A-D per ES route - Fast Convergence in the all-Active mode

Leaf-102 and Leaf-103 in figure 1-1 belong to the same redundancy group and share the Ethernet Segment (ES) identified by ES Identifier (ESI) 0301.0201.0302.3400.04d2 (the last three octets, hex 0004d2, equal decimal 1234) via interface E1/2, which is assigned to Port-Channel 234. Both interfaces are in the forwarding state (all-Active mode). ASW-104 is connected to them via Port-Channel 234. In a failure event, where either Leaf-102 or Leaf-103 loses its connection to the ES, the failure has to be signaled to the remote switch Leaf-101. For this purpose, the EVPN ESI Multihoming solution uses Ethernet A-D per ES route BGP Updates.
As soon as Leaf-102 and Leaf-103 join the ES, they each generate a BGP EVPN Route-Type 1 (Ethernet A-D) route, which they advertise to Spine-11. Figure 1-1 shows some of the BGP Path Attributes (BGP PAs) carried with the NLRI advertisements.

First, RT:65000:10000 and RT:65000:10001 are the same RT values that are used with the MAC/IP NLRIs (BGP EVPN Route-Type 2) for VNI 10000 (VLAN 10) and VNI 10001 (VLAN 11). These BGP PAs are carried within the Update message because both VLANs are active on Po234. Because both VNI 10000 and VNI 10001 are also used on Leaf-101, it imports these NLRIs into its BGP table.

Second, EVPN ESI Multihoming uses either Single-Active mode (only one of the links connected to the ES is active at a time) or all-Active mode (all ES links are active at the same time). Leaf-102 and Leaf-103 use all-Active mode, which they signal to remote peers by setting the Single-Active bit to zero in the ESI MPLS Label Extended Community.
Third, the EVPN NLRI Ethernet A-D route carries the Ethernet Segment Identifier (ESI), which is formed from the shared System-MAC and the ES value defined under the Port-Channel 234 configuration (the configuration can be found in the previous chapter). Note that the Ethernet Tag Id must be set to the maximum value 4294967295 (hex ffff.ffff) and the MPLS label must be set to zero (RFC 7432, section 8.2.1).
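
The Single-Active bit can be illustrated with a minimal Python sketch. It assumes the 8-octet layout of the ESI Label Extended Community from RFC 7432, section 7.5 (type 0x06, sub-type 0x01, one Flags octet, two reserved octets, three label octets). The function name and the sample value are hypothetical and are not taken from the lab.

```python
import struct

SINGLE_ACTIVE_BIT = 0x01  # low-order bit of the Flags octet (RFC 7432, section 7.5)

def decode_esi_label_extcomm(value: bytes) -> dict:
    """Decode an 8-octet ESI Label Extended Community (hypothetical helper)."""
    if len(value) != 8:
        raise ValueError("an extended community is exactly 8 octets")
    ec_type, sub_type, flags = struct.unpack("!BBB", value[:3])
    if (ec_type, sub_type) != (0x06, 0x01):
        raise ValueError("not an ESI Label Extended Community")
    label_field = int.from_bytes(value[5:8], "big")  # 3-octet ESI Label field
    mode = "single-active" if flags & SINGLE_ACTIVE_BIT else "all-active"
    return {"mode": mode, "esi_label_field": label_field}

# Assumed sample value: Flags = 0x00, so the bit is zero (all-Active)
print(decode_esi_label_extcomm(bytes([0x06, 0x01, 0x00, 0, 0, 0, 0, 0])))
```

With the Flags octet set to 0x01 instead, the same function would report single-active mode.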


Spine-11 is a BGP Route-Reflector and it forwards the BGP Update messages to Leaf-101. Example 1-1 illustrates the Leaf-101 BGP table. The highlighted entries (those with Ethernet Tag Id 0xffffffff) are the BGP EVPN Ethernet A-D per ES routes sent by Leaf-102 and Leaf-103. When these routes are imported from the BGP Adj-RIB-In into the Loc-RIB, Leaf-101 changes the RD to its own RD 192.168.77.101:65534. Also, notice that the Ethernet A-D per ES route is not an L2VNI-specific update, which is why it is shown as L2VNI 0.

Leaf-101# sh bgp l2vpn evpn
<snipped>
   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 192.168.77.101:32777    (L2VNI 10000)
*|i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.101:65534    (L2VNI 0)
*|i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.102:40
*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.102:32777
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.103:40
*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.103                   100          0 i

Route Distinguisher: 192.168.77.103:32777
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.103                   100          0 i
Example 1-1: Leaf-101 BGP table.
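
The two kinds of Route Type 1 entries in the table above differ only in the Ethernet Tag Id. The following sketch, which assumes the route-key format shown in example 1-1, separates them. The regular expression and function name are illustrative only.

```python
import re
from typing import Tuple

MAX_ET = 0xFFFFFFFF  # reserved Ethernet Tag Id carried by Ethernet A-D per ES routes

# Matches keys such as [1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
KEY_RE = re.compile(r"\[1\]:\[([0-9a-f.]+)\]:\[(0x[0-9a-f]+)\]/\d+")

def classify_ad_route(line: str) -> Tuple[str, str]:
    """Return (esi, kind) for a Route Type 1 line of the BGP table."""
    match = KEY_RE.search(line)
    if match is None:
        raise ValueError(f"not an Ethernet A-D route key: {line!r}")
    esi, eth_tag = match.group(1), int(match.group(2), 16)
    kind = "per-ES" if eth_tag == MAX_ET else "per-EVI"
    return esi, kind

for line in ("*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152",
             "*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152"):
    print(classify_ad_route(line))
```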

Now the host Abba (IP: 172.16.10.104/MAC: 1000.0010.abba) joins the network (figure 1-2). Both Leaf-102 and Leaf-103 learn the MAC address from incoming traffic. They both install the information into the MAC VRF, export it into the BGP process, and advertise it to the BGP EVPN peer Spine-11. The EVPN NLRI MAC Advertisement route (Route-Type 2) includes the ESI value and the Ethernet Tag Id along with the RD and MAC/IP information. ESI type 3 indicates that this is a MAC-based ESI value constructed from the System-MAC and the Local Discriminator value. The Ethernet Tag Id for the VLAN-based Service Interface (the EVPN Instance is a single VLAN) is set to zero (RFC 7432, section 6.1).
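
As a small worked example, the type 3 ESI used in this chapter can be split into its fields. The sketch assumes the RFC 7432 section 5 layout (one type octet, six System-MAC octets, three Local Discriminator octets). The function name is hypothetical.

```python
def decode_type3_esi(esi: str) -> dict:
    """Split a 10-octet type 3 ESI such as '0301.0201.0302.3400.04d2'."""
    raw = bytes.fromhex(esi.replace(".", ""))
    if len(raw) != 10:
        raise ValueError("an ESI is exactly 10 octets")
    if raw[0] != 0x03:
        raise ValueError("only ESI type 3 (MAC-based) is handled here")
    mac_hex = raw[1:7].hex()
    return {
        "type": raw[0],
        "system_mac": ".".join(mac_hex[i:i + 4] for i in (0, 4, 8)),
        "local_discriminator": int.from_bytes(raw[7:10], "big"),
    }

print(decode_type3_esi("0301.0201.0302.3400.04d2"))
# {'type': 3, 'system_mac': '0102.0103.0234', 'local_discriminator': 1234}
```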


Figure 1-2: BGP EVPN Route-Type 2 MAC/IP advertisement.


Examples 1-2 and 1-3 show that Leaf-101 has received and imported both updates into its BGP table.

Leaf-101# sh bgp l2vpn evpn rd 192.168.77.102:32777
<snipped>
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272, version 105
Paths: (1 available, best #1)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW
Multipath: eBGP iBGP

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported to 2 destination(s)
  AS-Path: NONE, path sourced internal to AS
    192.168.100.102 (metric 81) from 192.168.77.11 (192.168.77.111)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65000:10000 RT:65000:10077 ENCAP:8 Router MAC:5000.0003.0007
      Originator: 192.168.77.102 Cluster list: 192.168.77.111
      ESI: 0301.0201.0302.3400.04d2

  Path-id 1 not advertised to any peer
Example 1-2: Leaf-101 BGP table – MAC Advertisement originated by Leaf-102.


Leaf-101# sh bgp l2vpn evpn rd 192.168.77.103:32777
<snipped>
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272, version 21
Paths: (1 available, best #1)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW
Multipath: eBGP iBGP

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported to 2 destination(s)
  AS-Path: NONE, path sourced internal to AS
    192.168.100.103 (metric 81) from 192.168.77.11 (192.168.77.111)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65000:10000 RT:65000:10077 ENCAP:8 Router MAC:5000.0004.0007
      Originator: 192.168.77.103 Cluster list: 192.168.77.111
      ESI: 0301.0201.0302.3400.04d2

  Path-id 1 not advertised to any peer
Example 1-3: Leaf-101 BGP table – MAC Advertisement originated by Leaf-103.

At this phase, the remote switch Leaf-101 knows that Leaf-102 and Leaf-103 belong to the same redundancy group because they both have advertised the same ESI value 0301.0201.0302.3400.04d2 in their Ethernet A-D routes. Leaf-101 also knows that host Abba is reachable via Leaf-102 and Leaf-103 based on the MAC advertisement route, which, in addition to the MAC/IP address information, includes the same ESI value as the one received from Leaf-102 and Leaf-103 in the Ethernet A-D route advertisements. Note that the ESI value is not carried within the MAC-only advertisement route, which carries only MAC information, as can be seen from capture 1-1 below, taken from Leaf-102.


Ethernet II, Src: 1e:af:01:02:1e:11, Dst: c0:8e:00:11:1e:12
Internet Protocol Version 4, Src: 192.168.77.102, Dst: 192.168.77.11
Transmission Control Protocol, Src Port: 56613, Dst Port: 179, Seq: 166, Ack: 180, Len: 112
Border Gateway Protocol - UPDATE Message
    Marker: ffffffffffffffffffffffffffffffff
    Length: 112
    Type: UPDATE Message (2)
    Withdrawn Routes Length: 0
    Total Path Attribute Length: 89
   
        Path Attribute - MP_REACH_NLRI
            Type Code: MP_REACH_NLRI (14)
            Length: 44
            Address family identifier (AFI): Layer-2 VPN (25)
            Subsequent address family identifier (SAFI): EVPN (70)
            Next hop network address (4 bytes)
            Number of Subnetwork points of attachment (SNPA): 0
            Network layer reachability information (35 bytes)
             EVPN NLRI: MAC Advertisement Route
               Route Type: MAC Advertisement Route (2)
               Length: 33
               Route Distinguisher: 192.168.77.102:32777
               ESI: 00 00 00 00 00 00 00 00 00
               Ethernet Tag ID: 0
               MAC Address Length: 48
               MAC Address: 10:00:00:10:ab:ba
               IP Address Length: 0
               IP Address: NOT INCLUDED
               MPLS Label Stack 1: 625, (BOGUS: Bottom of Stack NOT set!)
Capture 1-1: MAC only Advertisement originated by Leaf-102.
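
The label value 625 in capture 1-1 deserves a note. A plausible reading, offered here as an assumption about the dissector rather than a confirmed fact, is that the 3-octet label field carries VNI 10000 while the capture tool decodes it as an MPLS label, taking only the top 20 bits. The arithmetic can be checked with a short sketch:

```python
def label_field_views(field: bytes) -> dict:
    """Interpret a 3-octet EVPN label field as a VNI and as an MPLS label."""
    if len(field) != 3:
        raise ValueError("the label field is 3 octets")
    value = int.from_bytes(field, "big")
    return {
        "vni": value,                        # all 24 bits (VXLAN encapsulation)
        "mpls_label": value >> 4,            # top 20 bits only
        "bottom_of_stack": bool(value & 1),  # lowest bit of the field
    }

print(label_field_views((10000).to_bytes(3, "big")))
# {'vni': 10000, 'mpls_label': 625, 'bottom_of_stack': False}
```

Under this reading, the unset lowest bit would also account for the "Bottom of Stack NOT set" remark in the capture.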

Now we generate data flow from Abba to Cafe by using ping.

Abba#ping 172.16.10.101
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.10.101, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 48/53/64 ms
Example 1-4: ping from Abba to Cafe.

Example 1-5 shows that Leaf-101 has learned the MAC address of Abba from Leaf-102. This is because the port-channel hashing algorithm on ASW-104 has chosen interface E1/1 (to Leaf-102) for the data flow between MAC addresses 1000.0010.abba and 1000.0010.cafe.

Leaf-101# show l2route evpn mac evi 10

Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv (AD):Auto-Delete (D):Del Pending
(S):Stale (C):Clear, (Ps):Peer Sync (O):Re-Originated (Nho):NH-Override
(Pf):Permanently-Frozen, (Orp): Orphan

Topology    Mac Address    Prod   Flags     Seq No     Next-Hops
----------- -------------- ------ --------- ---------- ----------------
10          1000.0010.abba BGP    SplRcv    12         192.168.100.102
10          1000.0010.cafe Local  L,        0          Eth1/3
Example 1-5: Leaf-101 L2RIB.

Now we generate another dataflow between host Abba and Cafe by using Telnet.

Abba#telnet 172.16.10.101
Trying 172.16.10.101 ... Open
Password required, but none set
[Connection to 172.16.10.101 closed by foreign host]
Example 1-6: Telnet from Abba to Cafe.

Now the Leaf-101 MAC address table points to Leaf-103. This is because the port-channel hashing algorithm on ASW-104 has now chosen interface E1/2 for this data flow.

Leaf-101# show l2route evpn mac evi 10

Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv (AD):Auto-Delete (D):Del Pending
(S):Stale (C):Clear, (Ps):Peer Sync (O):Re-Originated (Nho):NH-Override
(Pf):Permanently-Frozen, (Orp): Orphan

Topology    Mac Address    Prod   Flags     Seq No     Next-Hops
----------- -------------- ------ --------- ---------- ----------------
10          1000.0010.abba BGP    SplRcv    12         192.168.100.103
10          1000.0010.cafe Local  L,        0          Eth1/3
Example 1-7: Leaf-101 L2RIB.
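
Why two flows between the same hosts can take different member links is sketched below. The real hash used by ASW-104 is not documented here, so CRC32 over the flow fields is only a stand-in, and the flow tuples and function name are assumptions made for illustration.

```python
import zlib

def pick_member_link(flow: tuple, links: list) -> str:
    """Map a flow to one port-channel member link (illustrative hash only)."""
    key = "|".join(str(field) for field in flow).encode()
    return links[zlib.crc32(key) % len(links)]

links = ["E1/1 (to Leaf-102)", "E1/2 (to Leaf-103)"]
ping = ("172.16.10.104", "172.16.10.101", "icmp")
telnet = ("172.16.10.104", "172.16.10.101", "tcp", 23)

# The same flow always maps to the same link; different flows may differ.
print(pick_member_link(ping, links))
print(pick_member_link(telnet, links))
```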

Example 1-8 below shows that, from the perspective of Leaf-101, the location of MAC address 1000.0010.abba has changed 18 times.

Leaf-101# show bgp l2vpn evpn 1000.0010.abba
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.168.77.101:32777    (L2VNI 10000)
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216, version 87
Paths: (1 available, best #1)
Flags: (0x000212) (high32 00000000) on xmit-list, is in l2rib/evpn, is not in HW
Multipath: eBGP iBGP

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop, in rib
             Imported from 192.168.77.102:32777:[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
  AS-Path: NONE, path sourced internal to AS
    192.168.100.102 (metric 81) from 192.168.77.11 (192.168.77.111)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 ENCAP:8 MAC Mobility Sequence:00:18
      Originator: 192.168.77.102 Cluster list: 192.168.77.111

  Path-id 1 not advertised to any peer
Example 1-8: Leaf-101 BGP table – MAC Mobility.

The example above illustrates that the latest Ethernet frame from the host defines its location.


Fast Convergence

In a failure event, where Leaf-103 loses its connection to the ES via Po234, it sends a BGP Update message in which it withdraws all routes related to the ES. When Leaf-101 receives this message, it removes the routes listed in the withdrawal and updates the next-hop addresses.

Example 1-9 illustrates the Leaf-101 BGP table before Leaf-103 generates the withdrawal message caused by the link failure. The highlighted entries (those with next hop 192.168.100.103) will be removed by Leaf-101 when it receives the withdrawal message.

Leaf-101# sh bgp l2vpn evpn
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 152, Local Router ID is 192.168.77.101
<snipped>

   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 192.168.77.101:32777    (L2VNI 10000)
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i
*|i                   192.168.100.103                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*|i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.101:65534    (L2VNI 0)
*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.102                   100          0 i
*|i                   192.168.100.103                   100          0 i

Route Distinguisher: 192.168.77.102:40
*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.102:32777
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.102:32778
*>i[2]:[0]:[0]:[48]:[1000.0111.beef]:[32]:[172.16.11.104]/272
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.103:40
*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.103                   100          0 i

Route Distinguisher: 192.168.77.103:32777
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.103                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.103                   100          0 i

Route Distinguisher: 192.168.77.103:32778
*>i[2]:[0]:[0]:[48]:[1000.0111.beef]:[32]:[172.16.11.104]/272
                      192.168.100.103                   100          0 i
Example 1-9: BGP table on Leaf-101 before Leaf-103 withdrawn message.

Capture 1-2 shows the Unreachable NLRIs withdrawn by Leaf-103 after a link failure (Po234).

Ethernet II, Src: c0:8e:00:11:1e:11 (c0:8e:00:11:1e:11), Dst: 1e:af:01:01:1e:11 (1e:af:01:01:1e:11)
Internet Protocol Version 4, Src: 192.168.77.11, Dst: 192.168.77.101
Transmission Control Protocol, Src Port: 179, Dst Port: 56294, Seq: 56, Ack: 20, Len: 111
Border Gateway Protocol - UPDATE Message
    Marker: ffffffffffffffffffffffffffffffff
    Length: 111
    Type: UPDATE Message (2)
    Withdrawn Routes Length: 0
    Total Path Attribute Length: 88
    Path attributes
        Path Attribute - MP_UNREACH_NLRI
            Type Code: MP_UNREACH_NLRI (15)
            Length: 84
            Address family identifier (AFI): Layer-2 VPN (25)
            Subsequent address family identifier (SAFI): EVPN (70)
            Withdrawn routes (81 bytes)
                EVPN NLRI: Ethernet AD Route
                    Route Type: Ethernet AD Route (1)
                    Length: 25
                    Route Distinguisher: 192.168.77.103:40
                    ESI: 01:02:01:03:02:34, Discriminator: 00 04
                    Ethernet Tag ID: 4294967295
                    MPLS Label Stack 1: 0 (withdrawn)
                EVPN NLRI: Ethernet AD Route
                    Route Type: Ethernet AD Route (1)
                    Length: 25
                    Route Distinguisher: 192.168.77.103:32777
                    ESI: 01:02:01:03:02:34, Discriminator: 00 04
                    Ethernet Tag ID: 0
                    MPLS Label Stack 1: 0 (withdrawn)
                EVPN NLRI: Ethernet AD Route
                    Route Type: Ethernet AD Route (1)
                    Length: 25
                    Route Distinguisher: 192.168.77.103:32778
                    ESI: 01:02:01:03:02:34, Discriminator: 00 04
                    Ethernet Tag ID: 0
                    MPLS Label Stack 1: 0 (withdrawn)
Capture 1-2: BGP Update message (withdrawn) sent by Leaf-103.

After receiving the BGP withdrawal message from Leaf-103, Leaf-101 removes all the withdrawn routes and updates the next-hop addresses.

Leaf-101# sh bgp l2vpn evpn
BGP routing table information for VRF default, address family L2VPN EVPN
<snipped>

   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 192.168.77.101:32777    (L2VNI 10000)
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.101:65534    (L2VNI 0)
*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.102:40
*>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.102:32777
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
                      192.168.100.102                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.102                   100          0 i

Route Distinguisher: 192.168.77.102:32778
*>i[2]:[0]:[0]:[48]:[1000.0111.beef]:[32]:[172.16.11.104]/272
                      192.168.100.102                   100          0 i
Example 1-10: BGP table on Leaf-101 after Leaf-103 withdrawn message.

Example 1-11 shows the MAC address table of Leaf-101 before the change and example 1-12 after it. Even though there has not been any MAC move event, the MAC address table is updated. Leaf-101 knows that host Abba is still reachable via Leaf-102 because the MAC advertisement for host Abba carries the same ESI as the Ethernet A-D routes previously received from Leaf-102.

Leaf-101# sh system internal l2fwder mac | i abba
*    10    1000.0010.abba    static   -  F  F  nve-peer1 192.168.100.103
Example 1-11: Leaf-101 MAC address table before withdrawn.

Leaf-101# sh system internal l2fwder mac | i abba
*    10    1000.0010.abba    static   -  F  F  nve-peer1 192.168.100.102
Example 1-12: Leaf-101 MAC address table after withdrawn.
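
The idea behind Fast Convergence can be sketched as a small data structure on the remote leaf. This is a simplified model under stated assumptions, not the NX-OS implementation: the class and method names are hypothetical, and only the ESI-to-VTEP and MAC-to-ESI relationships are tracked.

```python
from collections import defaultdict

class RemoteEsiState:
    """Tracks which VTEPs advertise an ES and which MACs sit behind it."""

    def __init__(self):
        self.es_peers = defaultdict(set)  # ESI -> VTEPs with an A-D per ES route
        self.mac_to_esi = {}              # MAC -> ESI from the MAC advertisement

    def ad_per_es_update(self, esi, vtep):
        self.es_peers[esi].add(vtep)

    def ad_per_es_withdraw(self, esi, vtep):
        # One withdrawal re-points every MAC on the ES; no per-MAC update needed.
        self.es_peers[esi].discard(vtep)

    def mac_update(self, mac, esi):
        self.mac_to_esi[mac] = esi

    def next_hops(self, mac):
        return sorted(self.es_peers[self.mac_to_esi[mac]])

ESI = "0301.0201.0302.3400.04d2"
state = RemoteEsiState()
state.ad_per_es_update(ESI, "192.168.100.102")
state.ad_per_es_update(ESI, "192.168.100.103")
state.mac_update("1000.0010.abba", ESI)
print(state.next_hops("1000.0010.abba"))  # both VTEPs
state.ad_per_es_withdraw(ESI, "192.168.100.103")
print(state.next_hops("1000.0010.abba"))  # only 192.168.100.102 remains
```

Because the MAC entry points at the ESI rather than at a single VTEP, the number of MAC addresses behind the ES does not affect how many messages are needed to converge.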


Load Balancing (Aliasing)

Figure 1-3 illustrates the situation where Leaf-102 and Leaf-103 have already sent the BGP EVPN Ethernet A-D per ES route, in which they describe the local redundancy mode (all-Active) used with the Ethernet Segment to the remote peer Leaf-101. This way Leaf-101 knows that both Leaf-102 and Leaf-103 are able to forward data to clients behind the Ethernet Segment (ES). However, Leaf-101 does not know which VNIs or clients are reachable through a particular ES. For the VNI information, Leaf-102 and Leaf-103 originate BGP EVPN Ethernet A-D per EVI/ES routes, in which they announce that RD 192.168.77.102:32777 and RD 192.168.77.103:32777 (used with the MAC advertisement routes for VNI 10000) can be found behind ES 0301.0201.0302.3400.04d2. Leaf-101 imports these routes based on RT 65000:10000 (used with VNI 10000).

Based on these two Ethernet A-D routes (ES + EVI/ES), Leaf-101 knows that MAC/IP routes advertised with RD 192.168.77.102:32777 or RD 192.168.77.103:32777 are reachable via both Leaf-102 and Leaf-103. Next, host Abba joins the network and sends a GARP message. The message reaches ASW-104, whose port-channel hashing algorithm selects interface E1/1. This means that local MAC learning is done only by Leaf-102. It installs the route into the MAC VRF and exports it into the BGP process, from where it is sent as a BGP EVPN MAC advertisement route update. Leaf-101 learns the MAC address of host Abba only from Leaf-102, but based on the Ethernet A-D per ES and per EVI/ES routes it still knows that MAC 1000.0010.abba is reachable through both Leaf-102 and Leaf-103. The partial output in example 1-13 shows that the MAC/IP address of host Abba is reachable through Leaf-102 and Leaf-103 and that data towards Abba can be load balanced.

Route Distinguisher: 192.168.77.101:32777    (L2VNI 10000)
*>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
                      192.168.100.102                   100          0 i
*|i                   192.168.100.103                   100          0 i
*>i[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
                      192.168.100.103                   100          0 i
*|i[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.104]/272
                      192.168.100.103                   100          0 i
*>i                   192.168.100.102                   100          0 i
Example 1-13: Partial BGP table of Leaf-101.

When the all-Active mode is used, this method is called Aliasing. When the Single-Active mode is used, it is called Backup Path.
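
The Aliasing decision can be sketched as a set intersection: a VTEP qualifies as a next hop for a MAC on an ES only if it has advertised both an Ethernet A-D per ES route and an Ethernet A-D per EVI/ES route for that ES. The tuple layouts and function name below are assumptions made for the sketch, not a vendor data model.

```python
def aliased_next_hops(esi, evi_rt, ad_per_es, ad_per_evi):
    """VTEPs usable for a MAC on `esi`: those with both A-D route kinds."""
    per_es = {vtep for vtep, route_esi in ad_per_es if route_esi == esi}
    per_evi = {vtep for vtep, route_esi, rt in ad_per_evi
               if route_esi == esi and rt == evi_rt}
    return sorted(per_es & per_evi)

ESI = "0301.0201.0302.3400.04d2"
ad_per_es = [("192.168.100.102", ESI), ("192.168.100.103", ESI)]
ad_per_evi = [("192.168.100.102", ESI, "65000:10000"),
              ("192.168.100.103", ESI, "65000:10000")]

# The MAC route itself may come from Leaf-102 only, yet both VTEPs qualify.
print(aliased_next_hops(ESI, "65000:10000", ad_per_es, ad_per_evi))
# ['192.168.100.102', '192.168.100.103']
```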


Figure 1-3: BGP EVPN ESI Multihoming Load Balancing (Aliasing).


Summary

This chapter describes the Fast Convergence mechanism used in the BGP EVPN ESI Multihoming solution, which relies on the Ethernet A-D per ES route. In addition, it introduces the Load Balancing method, which relies on the information received via the Ethernet A-D per EVI/ES and Ethernet A-D per ES routes together with the MAC Advertisement route.



Author: Toni Pasanen CCIE#28158
Published: 8.6.2019
Updated: 
References:
RFC 7432: BGP MPLS-Based Ethernet VPN

12 comments:

  1. HI Toni,
    so far we have know about BGP EVPN type 1 (A-D) ,2(mac-only/mac-ip) ,4(ES peer), 5(external prefix) in detail.
    would you please elaborate type 3

    Best Regards
    Michael

    1. Route-type 3 is related to Ingress Replication, which is used for BUM traffic forwarding in VXLAN networks where Multicast is not enabled in the Underlay network. Let’s say that we have four VTEP switches A, B, C, and D, which all host L2VNI 10000 (vlan 10). If switch A receives BUM traffic from one of its connected hosts (vlan 10), it forwards the traffic to B, C, and D. The forwarding decision is based on either a statically defined peer list or peers learned via BGP EVPN route type 3 advertisements.
      Interface NVE1>ingress-replication protocol static>peer-ip B, peer-ip C, peer-ip D
      Or
      Interface NVE1>ingress-replication protocol bgp
      Cheers -Toni

    2. Hi Toni,
      outstanding elaboration!
      Whenever someone asked me about differences for these NLRIs, I am stuck.
      bringing real scenarios helps to memorize.

      All the best for you and your efforts helps many people. you should write a book since many people do not know your blog.

      Cheers
      Michael

    3. I would need a lot of time to write a book and I am not sure where can I find it. Maybe some day.

  2. Hi Toni,
    is there a typo in Example 1-1: Leaf-101 BGP table?
    the two highlighted routes from Leaf102 and leaf103 have RD Route Distinguisher: 192.168.77.102:40
    while I think they should be Route Distinguisher: 192.168.77.101:65534.

    Best Regards
    Michael

    1. Hi Michael,
      The output is taken straight from Leaf-101 without any modifications. Leaf-102 and Leaf-103 generate the RD and I have not figured out where the last part, 40, comes from. ESI multihoming is not enabled on Leaf-101 and that is probably why the last part of the RD changes radically when type-1 routes are imported from Adj-RIB-In to Loc-RIB.

      Cheers - Toni

    2. Hi Toni,
      for this case I checked online for days and here is my understanding, please correct me if I am wrong.
      from your example 1-1,leaf101 receive two BGP update (NLRI type 1)from both leaf102 and leaf103:
      take this section for demonstration
      Route Distinguisher: 192.168.77.103:40
      *>i[1]:[0301.0201.0302.3400.04d2]:[0xffffffff]/152
      192.168.100.103 100 0 i

      Route Distinguisher: 192.168.77.103:32777
      *>i[1]:[0301.0201.0302.3400.04d2]:[0x0]/152
      192.168.100.103 100 0 i

      the first one has ethernet tag for FFFF:FFF which means this update is ES only, which in my mind,is for telling Leaf101 that leaf102 and leaf103 are an ES and this helps for fast convergence. RD here is partial significant 192.168.77.103:32777 192.168.77.103 (is for identity) 40(just need to be unique)
      the second has ethernet tag for 0 and this message tells Leaf101 which VNI is reachable behind the ES(as explained from your POST for alising),RD here is complete significant 192.168.77.103:32777 192.168.77.103(tells leaf101 the identity):32777(is a mapping value of VNI,from this value Leaf101 knows what is the VNI behind ES)
      from http://bgphelp.com/2017/04/03/evpn-type-1-ethernet-auto-discovery-explained/
      I can see juniper use value 0 rather than 40 for fast covergence. I do not know whether this is a fixed mechanism for calculating this value or this is generated randomly.

      Michael

    3. Hi Michael,
      You are right, ”Ethernet Tag Id” must be set to MAX-ET (FFFFFFFF) on ES route, while in VNI specific route it can be either vlan Id or 0. ”Assigned number” subfield value 32777 identifies the VNI - Cisco base value: 32767 + Vlan Id 10, which is mapped to VNI10000. If the local vlan-to-vni mapping differs in local and remote leaf, the ”Assigned number” sub-field might change during the import process from Adj-RIB-In to Loc-RIB. If vlan 20 is mapped to vni 10000 in Leaf-101, then the value would be 32787 in Loc-RIB even though the original, received value was 32777.

  3. Are you sure with the mac mobility output? It doesnt make any sense to me :-D. Mac mobility is used when host is seen behind different ES, not because of all active setup. This is one of the prime feature of evpn, no mac flip-flapping in all active mode. After all this is written in RFC:
    "A remote PE that receives a MAC/IP Advertisement
    route with a non-reserved ESI SHOULD consider the advertised MAC
    address to be reachable via all PEs that have advertised reachability
    to that MAC address's EVI/ES via the combination of an Ethernet A-D
    per EVI route for that EVI/ES (and Ethernet tag, if applicable) AND
    an Ethernet A-D per ES route for that ES."

    "It is possible for a given host or end-station (as defined by its MAC
    address) to move from one Ethernet segment to another; this is
    referred to as 'MAC Mobility' or 'MAC move'"
    Thanks!

  4. You are right, the MAC move counter indicates a MAC move, even though it happens within the ES. However, there is no BGP Update concerning this event. So can we call this a "semi MAC move" :-)

  5. I know I'm commenting on a 3 year old post, but just in case it helps anyone else...

    I hit the same issue as the above commenter where MAC addresses on the same ES would flap between multihomed VTEPs whenever LACP hashed the frame to a different VTEP. In my case I was seeing BGP updates each time too; my BGP table version number quickly climbed into the thousands.

    On further investigation I found that the ESI extended community was being included in type-2 mac+ip evpn routes, but NOT in type-2 mac-only evpn routes. So "sh l2route evpn mac all" was showing constant duplicates and mac mobility sequence increases, but "sh l2route evpn mac-ip all" was fine and showing peer-synced routes.

    Since nxos can infer mac routes from mac+ip routes, I worked around the problem with "int nve1 -> suppress mac-route". Ie. Only ever send type2 mac+ip routes, which contain the ESI community.

    Also: absolutely fantastic blog Toni! I've just purchased two of your books.

  6. Hi, I was doing testing with port-active scenario and i found that convergence time is 37seconds if one of the port is down, any remedy for this ?
