Thursday 9 May 2019

VXLAN Underlay Routing - Part V: Multi-AS eBGP


eBGP as an Underlay Network Routing Protocol: Multi-AS eBGP

This post introduces the Multi-AS eBGP solution in a VXLAN fabric. In this solution, a single AS number is assigned to all spine switches, while each leaf switch (or pair of leaf switches) has a unique BGP AS number. This solution requires neither the “allowas-in” command on the leaf switches nor the “disable-peer-as-check” command on the spine switches, both of which are required in the Two-AS solution. The “retain route-target all” command and a BGP L2VPN EVPN address-family peer-specific route-map with the option “set ip next-hop unchanged” are still needed on the spine switch. This post also explains the requirements and processes for the L2 EVPN VNI-specific route import policy when automated derivation of Route-Targets is used. The same IP/MAC addressing scheme is used here as in the previous post, “VXLAN Underlay Routing - Part IV: Two-AS eBGP”, but Leaf-102 now belongs to BGP AS 65001.


Figure 1-1: The MAC/IP addressing scheme and eBGP peering model.


Underlay Network Control Plane: IPv4 eBGP peering

Spine-11 belongs to BGP AS 65099 and has IPv4 eBGP peerings with the external neighbors Leaf-101 in AS 65000 and Leaf-102 in AS 65001. Both leaf switches advertise NLRIs for their Loopback 100 (used for overlay BGP peering) and Loopback 50 (used for NVE interfaces) to Spine-11. Spine-11 advertises the NLRI for its Loopback 100. In addition, Spine-11 forwards the NLRIs received from Leaf-101 to Leaf-102 and vice versa. The basic BGP configuration is shown in examples 1-1 to 1-3.


Figure 1-2: VXLAN Fabric Underlay Network eBGP IPv4 peering.

router bgp 65000
  router-id 192.168.0.101
  address-family ipv4 unicast
    network 192.168.50.101/32
    network 192.168.100.101/32
  neighbor 10.101.11.11
    remote-as 65099
    description ** BGP Underlay to Spine-11 **
    address-family ipv4 unicast
Example 1-1: Leaf-101 basic IPv4 BGP peering configuration.

router bgp 65001
  router-id 192.168.0.102
  address-family ipv4 unicast
    network 192.168.50.102/32
    network 192.168.100.102/32
  neighbor 10.102.11.11
    remote-as 65099
    description ** BGP Underlay to Spine-11 **
    address-family ipv4 unicast
Example 1-2: Leaf-102 basic IPv4 BGP peering configuration.

router bgp 65099
  router-id 192.168.0.11
  address-family ipv4 unicast
    network 192.168.100.11/32
  neighbor 10.101.11.101
    remote-as 65000
    description ** BGP Underlay to Leaf-101 **
    address-family ipv4 unicast
  neighbor 10.102.11.102
    remote-as 65001
    description ** BGP Underlay to Leaf-102 **
    address-family ipv4 unicast
Example 1-3: Spine-11 basic IPv4 BGP peering configuration.

Example 1-4 shows that Spine-11 has received two routes from each of its two IPv4 BGP peers.

Spine-11# sh ip bgp summary
BGP summary information for VRF default, address family IPv4 Unicast
BGP router identifier 192.168.0.11, local AS number 65099
BGP table version is 96, IPv4 Unicast config peers 2, capable peers 2
6 network entries and 6 paths using 1392 bytes of memory
BGP attribute entries [3/480], BGP AS path entries [2/12]
BGP community entries [0/0], BGP clusterlist entries [0/0]

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.101.11.101   4 65000     274     264       96    0    0 00:00:19 2         
10.102.11.102   4 65001     286     273       96    0    0 00:00:27 2
Example 1-4: show ip bgp summary on Spine-11.

Example 1-5 shows that Leaf-101 has received the routes originated by Leaf-102 and installed them into its BGP table.

Leaf-101# sh ip bgp | i .102
*>e192.168.50.102/32  10.101.11.11                        0 65099 65001 i
*>e192.168.100.102/32 10.101.11.11                        0 65099 65001 i
Example 1-5: show ip bgp on Leaf-101.

Example 1-6 shows that there is IP connectivity between the Loopback IP addresses of Leaf-101 and Leaf-102.

Leaf-101# ping 192.168.100.102 source 192.168.100.101 count 2
<snipped>
64 bytes from 192.168.100.102: icmp_seq=0 ttl=253 time=7.896 ms
64 bytes from 192.168.100.102: icmp_seq=1 ttl=253 time=6.913 ms
<snipped>

Leaf-101# ping 192.168.50.102 source 192.168.50.101 count 2
<snipped>
64 bytes from 192.168.50.102: icmp_seq=0 ttl=253 time=6.922 ms
64 bytes from 192.168.50.102: icmp_seq=1 ttl=253 time=10.413 ms
<snipped>
Example 1-6: IP connectivity verification from Leaf-101 to Leaf-102.


Overlay Network Control Plane: L2VPN EVPN eBGP peering

While the Underlay Network IPv4 BGP peering provides IP connectivity between devices, the Overlay L2VPN EVPN BGP peering is used to advertise host-related MAC/IP addresses. This section explains how the import/export policy works when automated derivation of Route-Targets is used.


Figure 1-3: VXLAN Fabric Overlay Network eBGP L2VPN EVPN peering.

The basic EVPN L2VNI 10000 configuration on both leaf switches is illustrated in example 1-7.

evpn
  vni 10000 l2
    rd auto
    route-target import auto
    route-target export auto
Example 1-7: EVPN L2VNI 10000 configuration on Leaf-101 and Leaf-102.

The format of the auto RT is “AS number:L2VNI”. Therefore, Leaf-101 exports routes with RT 65000:10000 and imports routes with the same RT. Leaf-102, in turn, exports routes with RT 65001:10000 and imports routes with the same RT. This means that neither leaf switch imports routes originated by the other leaf. The solution is to use the L2VPN EVPN BGP peer-specific command “rewrite-evpn-rt-asn”. This command changes the AS number part of the RT to the local AS number on received BGP Updates. The rest of this section explains how it works.
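
To make the mismatch concrete, the auto keywords on each leaf behave as if the Route-Targets had been configured manually as sketched below. This is only an illustration of the derivation rule described above ("AS number:L2VNI"), not configuration taken from the lab devices.

```
! Leaf-101 (BGP AS 65000): what "auto" resolves to for L2VNI 10000
evpn
  vni 10000 l2
    rd auto
    route-target import 65000:10000
    route-target export 65000:10000

! Leaf-102 (BGP AS 65001): what "auto" resolves to for L2VNI 10000
evpn
  vni 10000 l2
    rd auto
    route-target import 65001:10000
    route-target export 65001:10000
```

Leaf-101 exports 65000:10000 while Leaf-102 imports only 65001:10000 (and the reverse in the other direction), so without an RT rewrite the import policies never match.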

Example 1-8 shows that Spine-11 has received a BGP Update from Leaf-101 carrying the NLRI of host Cafe (IP:172.16.10.101/MAC:1000.0010.cafe).

Spine-11# sh bgp l2vpn evpn 1000.0010.cafe
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.168.0.101:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216,
 version 293
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: external, path is valid, is best path
  AS-Path: 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.101 (192.168.0.101)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 ENCAP:8

  Path-id 1 advertised to peers:
    192.168.100.102
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 244
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: external, path is valid, is best path
  AS-Path: 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.101 (192.168.0.101)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65000:10000 RT:65000:10077 ENCAP:8 Router MAC:5e00.0000.0007

  Path-id 1 advertised to peers:
    192.168.100.102
Example 1-8: BGP table on Spine-11.

Example 1-9 shows that Spine-11 also advertises these routes to Leaf-102.

Spine-11# sh bgp l2vpn evpn neighbors 192.168.100.102 advertised-routes

Peer 192.168.100.102 routes for address family L2VPN EVPN:
BGP table version is 299, Local Router ID is 192.168.0.11
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup

   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 192.168.0.101:32777
*>e[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
                      192.168.50.101                                 0 65000 i
*>e[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272
                      192.168.50.101                                 0 65000 i

Route Distinguisher: 192.168.0.102:32777
Example 1-9: Advertised NLRIs to Leaf-102 by Spine-11.

However, Leaf-102 does not install the NLRI from the BGP Adj-RIB-In into the Loc-RIB. Example 1-10 shows that the path is marked as "received only".

Leaf-102# sh bgp l2vpn evpn 1000.0010.cafe
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.168.0.101:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216, version 0
Paths: (1 available, best #0)
Flags: no flags set

  Path type: external, path is valid, received only
  AS-Path: 65099 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.11 (192.168.0.11)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 ENCAP:8

BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/248, version 0
Paths: (1 available, best #0)
Flags: no flags set

  Path type: external, path is valid, received only
  AS-Path: 65099 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.11 (192.168.0.11)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65000:10000 RT:65000:10077 ENCAP:8 Router MAC:5e00.0000.0007

Example 1-10: show bgp l2vpn evpn 1000.0010.cafe on Leaf-102.

The import policy has to match the Route-Target value carried as an Extended Community in the BGP Update message before the NLRI is installed from the peer-specific Adj-RIB-In into the Loc-RIB. The RT of a received BGP Update can be changed with the BGP L2VPN EVPN peer-specific command “rewrite-evpn-rt-asn”, which changes the RT value of an incoming BGP Update before it is installed into the Adj-RIB-In. Example 1-11 shows the configuration.

router bgp 65001
  router-id 192.168.0.102
  address-family ipv4 unicast
    network 192.168.50.102/32
    network 192.168.100.102/32
  address-family l2vpn evpn
  neighbor 10.102.11.11
    remote-as 65099
    description ** BGP Underlay to Spine-11 **
    address-family ipv4 unicast
  neighbor 192.168.100.11
    remote-as 65099
    description ** BGP Overlay to Spine-11 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      send-community extended
      soft-reconfiguration inbound always
      rewrite-evpn-rt-asn
evpn
  vni 10000 l2
    rd auto
    route-target import auto
    route-target export auto

Example 1-11: BGP configuration on Leaf-102.
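
The original shows only the Leaf-102 side. As a sketch, the matching overlay peering on Leaf-101 would mirror Example 1-11. The addresses and AS numbers below are taken from examples 1-1 and 1-8; the rest is an assumption that Leaf-101 is configured symmetrically.

```
! Leaf-101 overlay peering, assumed to mirror Example 1-11
router bgp 65000
  router-id 192.168.0.101
  address-family l2vpn evpn
  neighbor 192.168.100.11
    remote-as 65099
    description ** BGP Overlay to Spine-11 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      send-community extended
      soft-reconfiguration inbound always
      ! rewrite the AS part of received RTs to the local AS 65000
      rewrite-evpn-rt-asn
```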

Adding the command only on the leaf uplinks towards Spine-11 (on Leaf-101 and Leaf-102) does not yet fulfill the import policy requirements. The BGP process compares the AS number part of the Route-Target with the AS number configured for the BGP L2VPN EVPN peer. The RT value is rewritten, and the NLRI installed into the Loc-RIB, only when these two values are the same. Therefore, Spine-11 also has to rewrite the RT value of BGP Updates received from Leaf-101. Figure 1-4 illustrates the situation where Spine-11 forwards the BGP Update exported by Leaf-101 without RT manipulation. Leaf-102 does not install the NLRI into its Loc-RIB because the AS number configured for its BGP L2VPN EVPN peer Spine-11 (65099) differs from the AS part of the Route-Target (65000) in the BGP Update received from Spine-11.

Figure 1-4: Route-Target rewrite process.


When the command “rewrite-evpn-rt-asn” is also added to the Spine-11 configuration towards Leaf-101 and Leaf-102, the leaf switches are able to, first, change the RT value carried in received BGP Updates and, second, install the NLRIs carried in those BGP Updates into the BGP Loc-RIB. Figure 1-5 illustrates the overall process.

Step-1:
Leaf-101 sends a BGP Update with RT 65000:10000 to Spine-11.

Step-2:
Spine-11 receives the BGP Update. It compares the AS part of the RT (65000) with the AS number configured for its peer Leaf-101 (65000).

Step-3:
Because both values are equal, Spine-11 rewrites the AS part of the RT with its own AS number (65099).

Step-4:
Spine-11 installs the NLRI into its BGP Loc-RIB, from where it is sent through the Adj-RIB-Out to Leaf-102 with RT 65099:10000.

Steps 5-7:
Leaf-102 performs the same verification and rewrite that Spine-11 performed in steps 2-4: it compares the AS part of the RT (65099) with the AS number configured for its peer Spine-11, rewrites it with its own AS number (65001), and imports the NLRI into its Loc-RIB.



Figure 1-5: Route-Target rewrite process.
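
The Spine-11 overlay configuration is not listed in the original. The sketch below combines the three spine-side commands named in this post: “retain route-target all”, a peer-specific outbound route-map with “set ip next-hop unchanged”, and “rewrite-evpn-rt-asn”. The route-map name RM-NH-UNCHANGED is a hypothetical label, the peer addresses are the leaf Loopback 100 addresses from examples 1-1 and 1-2, and the remaining peering options are assumed to match the leaf side in Example 1-11.

```
! Sketch only; assumes the EVPN control plane ("nv overlay evpn") is already enabled
route-map RM-NH-UNCHANGED permit 10
  ! keep the leaf NVE address (Loopback 50) as the next hop
  set ip next-hop unchanged
!
router bgp 65099
  router-id 192.168.0.11
  address-family l2vpn evpn
    ! Spine-11 has no local VNIs, so keep routes regardless of their RT
    retain route-target all
  neighbor 192.168.100.101
    remote-as 65000
    description ** BGP Overlay to Leaf-101 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      send-community extended
      route-map RM-NH-UNCHANGED out
      rewrite-evpn-rt-asn
  neighbor 192.168.100.102
    remote-as 65001
    description ** BGP Overlay to Leaf-102 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      send-community extended
      route-map RM-NH-UNCHANGED out
      rewrite-evpn-rt-asn
```

With the next hop left unchanged, Leaf-102 keeps seeing 192.168.50.101 as the next hop for routes originated by Leaf-101, which is what the outputs in this post show.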

Example 1-12 shows that now the NLRI originated by Leaf-101 is installed from the BGP Adj-RIB-In into Loc-RIB.

Leaf-102# sh bgp l2vpn evpn 1000.0010.cafe
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.168.0.101:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216, version 71
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: external, path is valid, received and used, is best path
             Imported to 1 destination(s)
  AS-Path: 65099 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.11 (192.168.0.11)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65001:10000 ENCAP:8

  Path-id 1 not advertised to any peer
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 70
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: external, path is valid, received and used, is best path
             Imported to 3 destination(s)
  AS-Path: 65099 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.11 (192.168.0.11)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65001:10000 RT:65001:10077 ENCAP:8 Router MAC:5e00.0000.0007

  Path-id 1 not advertised to any peer

Route Distinguisher: 192.168.0.102:32777    (L2VNI 10000)
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216, version 74
Paths: (1 available, best #1)
Flags: (0x000212) on xmit-list, is in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: external, path is valid, is best path, in rib
             Imported from 192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
  AS-Path: 65099 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.11 (192.168.0.11)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65001:10000 ENCAP:8

  Path-id 1 not advertised to any peer
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 72
Paths: (1 available, best #1)
Flags: (0x000212) on xmit-list, is in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: external, path is valid, is best path, in rib
             Imported from 192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272
  AS-Path: 65099 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.11 (192.168.0.11)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65001:10000 RT:65001:10077 ENCAP:8 Router MAC:5e00.0000.0007

  Path-id 1 not advertised to any peer

Route Distinguisher: 192.168.0.102:4    (L3VNI 10077)
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 73
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: external, path is valid, is best path
             Imported from 192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272
  AS-Path: 65099 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.11 (192.168.0.11)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65001:10000 RT:65001:10077 ENCAP:8 Router MAC:5e00.0000.0007

  Path-id 1 not advertised to any peer

Leaf-102#
Example 1-12: BGP table on Leaf-102.

Example 1-13 shows that host Cafe (172.16.10.101/1000.0010.cafe) connected to Leaf-101 is now able to ping host Abba (172.16.10.102/1000.0010.abba) connected to Leaf-102.


Cafe#ping 172.16.10.102
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.10.102, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 17/25/34 ms
Cafe#
Example 1-13: Ping from host Café to host Abba.

Author: Toni Pasanen CCIE#28158
Published: 9.5.2019
Updated: 
