Now you can also download my VXLAN book from Leanpub.com:
"Virtual Extensible LAN VXLAN - A Practical guide to VXLAN Solution Part 1" (373 pages).
eBGP as an Underlay Network Routing Protocol: Two-AS eBGP
Figure 1-1 illustrates the topology used in this chapter. Leaf-101 and Leaf-102 both belong to BGP AS 65000, while Spine-11 belongs to BGP AS 65099. The Loopback interfaces used for Overlay Network BGP peering (Loopback 100) and for NVE peering (Loopback 50) are advertised over BGP AFI IPv4 peering (Underlay Network Control Plane). Host MAC/IP address information is advertised over BGP AFI L2VPN EVPN peering (Overlay Network Control Plane). Ethernet frames between hosts Café and Abba are encapsulated with a VXLAN tunnel header, where the source and destination IP addresses used in the outer IP header are taken from the NVE1 interfaces.
Figure 1-1: High-Level operation of VXLAN Fabric
Underlay Network Control Plane eBGP
Figure 1-2 illustrates the Underlay Network addressing scheme and the peering model. BGP IPv4 peering is configured between the physical interfaces. Examples 1-1 through 1-3 show the basic BGP IPv4 peering configurations of the switches. Both Leaf-101 and Leaf-102 advertise the IP addresses of Loopback 50 and Loopback 100 to Spine-11, while Spine-11 only advertises the IP address of Loopback 100 to the leaf switches.
Figure 1-2: VXLAN Fabric Underlay Network eBGP IPv4 peering.
router bgp 65000
  router-id 192.168.0.101
  address-family ipv4 unicast
    network 192.168.50.101/32
    network 192.168.100.101/32
  neighbor 10.101.11.11
    remote-as 65099
    description ** BGP Underlay to Spine-11 **
    address-family ipv4 unicast
Example 1-1: Leaf-101 basic BGP configuration.
router bgp 65000
  router-id 192.168.0.102
  address-family ipv4 unicast
    network 192.168.50.102/32
    network 192.168.100.102/32
  neighbor 10.102.11.11
    remote-as 65099
    description ** BGP Underlay to Spine-11 **
    address-family ipv4 unicast
Example 1-2: Leaf-102 basic BGP configuration.
router bgp 65099
  router-id 192.168.0.11
  address-family ipv4 unicast
    network 192.168.100.11/32
  neighbor 10.101.11.101
    remote-as 65000
    description ** BGP Underlay to Leaf-101 **
    address-family ipv4 unicast
  neighbor 10.102.11.102
    remote-as 65000
    description ** BGP Underlay to Leaf-102 **
    address-family ipv4 unicast
Example 1-3: Spine-11 basic BGP configuration.
At this stage, the BGP peering between Leaf-101 and Spine-11 is up, as can be seen from example 1-4.
Leaf-101# sh ip bgp summ | beg Neigh
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.101.11.11    4 65099    2969    2954       11    0    0 02:27:42 2
Example 1-4: Leaf-101 BGP peering.
However, there are no entries for routes originated by Leaf-102 in the Leaf-101 BGP table (example 1-5).
Leaf-101# sh ip bgp
<snipped>
   Network             Next Hop         Metric LocPrf Weight Path
*>l192.168.50.101/32   0.0.0.0                    100  32768 i
*>e192.168.100.11/32   10.101.11.11                        0 65099 i
*>l192.168.100.101/32  0.0.0.0                    100  32768 i
*>e192.168.238.0/29    10.101.11.11                        0 65099 i
Example 1-5: Leaf-101 BGP routes.
There are two reasons why BGP Updates originated by Leaf-102 do not end up in the BGP table of Leaf-101. First, Spine-11 does not forward BGP Updates received from Leaf-102 to Leaf-101. Example 1-6 shows that Spine-11 advertises only self-originated routes to Leaf-101. This is because the AS-PATH Path Attribute of the routes learned from Leaf-102 includes AS 65000, which is also the remote AS configured for the IPv4 BGP peering with Leaf-101. Spine-11 does not advertise a route to a peer whose AS number is already in the AS path of the route. This is the default loop-prevention mechanism on the sending side.
Spine-11# sh ip bgp neighbors 10.101.11.101 advertised-routes
<snipped>
   Network             Next Hop         Metric LocPrf Weight Path
*>l192.168.100.11/32   0.0.0.0                    100  32768 i
*>l192.168.238.0/29    0.0.0.0                    100  32768 i
Example 1-6: Routes advertised to Leaf-101 by Spine-11.
Disabling the peer-AS verification that is performed before sending a BGP Update, with the command disable-peer-as-check (example 1-7), changes this default behavior.
router bgp 65099
  neighbor 10.101.11.101
    address-family ipv4 unicast
      disable-peer-as-check
Example 1-7: Disabling peer-AS verification on Spine-11.
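The sender-side check described above can be sketched in a few lines of Python. This is a simplified model, not the NX-OS implementation: the function name `should_advertise` is hypothetical, and the sketch assumes that any occurrence of the peer AS in the AS path blocks the advertisement.

```python
def should_advertise(as_path, peer_as, disable_peer_as_check=False):
    """Simplified sender-side peer-AS check (assumed model, not NX-OS code).

    A route is withheld from a peer when the peer's AS number already
    appears in the route's AS path, unless the check is disabled.
    """
    if disable_peer_as_check:
        return True
    return peer_as not in as_path


# Route learned by Spine-11 from Leaf-102 carries AS path [65000].
path_from_leaf102 = [65000]

# Default behavior towards Leaf-101 (AS 65000): the route is withheld.
print(should_advertise(path_from_leaf102, peer_as=65000))

# With disable-peer-as-check the route is advertised.
print(should_advertise(path_from_leaf102, peer_as=65000,
                       disable_peer_as_check=True))

# Routes originated by Spine-11 itself have an empty AS path in its table.
print(should_advertise([], peer_as=65000))
```

The three calls correspond to the situation in example 1-6 (only self-originated routes are sent) and to the behavior after the command in example 1-7 is added.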
As can be seen from example 1-8, Spine-11 now forwards the BGP Updates received from Leaf-102 to Leaf-101.
Spine-11# sh ip bgp neighbors 10.101.11.101 advertised-routes
<snipped>
   Network             Next Hop         Metric LocPrf Weight Path
*>e192.168.50.102/32   10.102.11.102                       0 65000 i
*>l192.168.100.11/32   0.0.0.0                    100  32768 i
*>e192.168.100.102/32  10.102.11.102                       0 65000 i
*>l192.168.238.0/29    0.0.0.0                    100  32768 i
Example 1-8: Routes advertised to Leaf-101 by Spine-11.
The second reason why the routes do not end up in the Leaf-101 BGP table is that even though Leaf-101 now receives the routes, it rejects them. The BGP process discards BGP Updates learned from an eBGP peer if they carry the receiving device's own AS number in the AS-Path list. This is the default BGP loop-prevention mechanism on the receiving side. The default behavior can be bypassed with the “allowas-in” command under the peer-specific configuration (example 1-9). The value 3 used in the example allows the local AS number to appear in the AS path up to three times.
router bgp 65000
  neighbor 10.101.11.11
    address-family ipv4 unicast
      allowas-in 3
Example 1-9: Allow-as in on Leaf-101.
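The receiver-side check can be modeled the same way. The sketch below is only an illustration under the assumption that the limit is compared against the number of times the local AS appears in the AS path. The function name `accept_update` is hypothetical.

```python
def accept_update(as_path, local_as, allowas_in=0):
    """Simplified receiver-side AS-path loop check (assumed model).

    allowas_in is the number of times the local AS may appear in the
    received AS path. Zero models the default behavior.
    """
    return as_path.count(local_as) <= allowas_in


# Leaf-101 (AS 65000) receives a Leaf-102 route via Spine-11.
received_path = [65099, 65000]

print(accept_update(received_path, local_as=65000))                # default: rejected
print(accept_update(received_path, local_as=65000, allowas_in=3))  # accepted
print(accept_update([65099], local_as=65000))                      # Spine-11 route: accepted
```

With the default setting only the routes originated by Spine-11 pass the check, which matches the BGP table shown in example 1-5.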
After this addition, Leaf-101 accepts and installs routes
originated by Leaf-102 into BGP table (example 1-10).
Leaf-101# sh ip bgp
<snipped>
   Network             Next Hop         Metric LocPrf Weight Path
*>l192.168.50.101/32   0.0.0.0                    100  32768 i
*>e192.168.50.102/32   10.101.11.11                        0 65099 65000 i
*>e192.168.100.11/32   10.101.11.11                        0 65099 i
*>l192.168.100.101/32  0.0.0.0                    100  32768 i
*>e192.168.100.102/32  10.101.11.11                        0 65099 65000 i
*>e192.168.238.0/29    10.101.11.11                        0 65099 i
Example 1-10: Leaf-101 BGP table after allowas-in.
The IP connectivity between the Leaf switches can now be
verified by pinging between the Loopback interfaces (example 1-11).
Leaf-101# ping 192.168.100.102 source 192.168.100.101 count 2
<snipped>
64 bytes from 192.168.100.102: icmp_seq=0 ttl=253 time=9.268 ms
64 bytes from 192.168.100.102: icmp_seq=1 ttl=253 time=6.586 ms
<snipped>
Leaf-101# ping 192.168.50.102 source 192.168.50.101 count 2
<snipped>
64 bytes from 192.168.50.102: icmp_seq=0 ttl=253 time=27.166 ms
64 bytes from 192.168.50.102: icmp_seq=1 ttl=253 time=17.275 ms
Example 1-11: ping from Leaf-101 to Leaf-102.
Overlay Network Control Plane eBGP
Figure 1-3 illustrates the Overlay Network addressing scheme and peering topology. BGP L2VPN EVPN peering is configured between the Loopback 100 interfaces. Examples 1-12 and 1-13 show the basic BGP L2VPN EVPN AFI peering configurations of the switches.
Figure 1-3: VXLAN Fabric Overlay Network eBGP L2VPN EVPN peering.
Both leaf switches can use the same configuration template if BGP L2VPN EVPN peering is configured between Loopback interfaces. By default, BGP sets the TTL of messages sent to an eBGP peer to one. When peering between logical interfaces instead of the directly connected physical interfaces, the default TTL value has to be manually increased by one with the “ebgp-multihop 2” command. In addition, peering between the logical loopback interfaces requires changing the update source, since by default the IP address of the outgoing physical interface is used as the source IP for BGP messages sent to the external peer. This is achieved with the “update-source loopback 100” command under the peer-specific configuration section. Finally, the same BGP loop-prevention mechanism that rejects routes carrying the local AS number also applies in the Overlay Network, so “allowas-in” is needed here as well.
router bgp 65000
  neighbor 192.168.100.11
    remote-as 65099
    description ** BGP Overlay to Spine-11 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      allowas-in 3
      send-community
      send-community extended
Example 1-12: Basic BGP L2VPN EVPN configuration on Leaf-101 and Leaf-102.
Example 1-13 illustrates the Spine-11
BGP L2VPN EVPN peering configuration with Leaf-101. The “disable-peer-as-check” command is needed in Overlay BGP L2VPN EVPN
peering just like it was needed in Underlay BGP IPv4 peering.
router bgp 65099
  neighbor 192.168.100.101
    remote-as 65000
    description ** BGP Overlay to Leaf-101 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      disable-peer-as-check
      send-community
      send-community extended
Example 1-13: Basic BGP L2VPN EVPN peering configuration on Spine-11.
Now the BGP L2VPN EVPN peering is up, though Spine-11 has not installed any routes from either Leaf-101 or Leaf-102 into its BGP table.
Spine-11# sh bgp l2vpn evpn summary
BGP summary information for VRF default, address family L2VPN EVPN
BGP router identifier 192.168.0.11, local AS number 65099
BGP table version is 4, L2VPN EVPN config peers 2, capable peers 2
0 network entries and 0 paths using 0 bytes of memory
BGP attribute entries [0/0], BGP AS path entries [0/0]
BGP community entries [0/0], BGP clusterlist entries [0/0]
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.100.101 4 65000       6       6        4    0    0 00:00:13 0
192.168.100.102 4 65000       6       6        4    0    0 00:00:03 0
Example 1-14: show bgp l2vpn evpn summary.
L2VPN EVPN NLRIs are imported/exported based on Route-Target (RT) values. On Leaf-101 and Leaf-102, there is an EVPN instance where the import/export policy has been defined (example 1-15).
Leaf-101# sh run bgp | sec evpn
<snipped>
evpn
  vni 10000 l2
    rd auto
    route-target import auto
    route-target export auto
route-target both auto evpn
Example 1-15: evpn vni 10000 Route-Target import/export policy on Leaf-101.
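The Route-Target values seen in the BGP outputs of this chapter (RT:65000:10000 and RT:65000:10077) are consistent with an automatic derivation of the form ASN:VNI. The Python sketch below illustrates that derivation under the assumption of a 2-byte AS number. The function name `auto_route_target` and the extra AS number 65001 are hypothetical and used only for illustration.

```python
def auto_route_target(asn, vni):
    """Derive an auto Route-Target as ASN:VNI (assumes a 2-byte ASN)."""
    if not 0 < asn < 2**16:
        raise ValueError("this sketch only models 2-byte AS numbers")
    return f"RT:{asn}:{vni}"


# Leaf-101 and Leaf-102 share AS 65000, so their auto RTs match
# and each leaf imports what the other exports.
leaf101_rt = auto_route_target(65000, 10000)
leaf102_rt = auto_route_target(65000, 10000)
print(leaf101_rt, leaf101_rt == leaf102_rt)

# A leaf in a different (hypothetical) AS would derive a different RT.
print(auto_route_target(65001, 10000))
```

Because both leaf switches are in the same AS in the Two-AS design, the auto-derived import and export values match without any manual Route-Target configuration.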
There is no local EVPN instance configured on Spine-11, so it has no import policy that matches the Route-Targets of the received routes. Therefore it neither installs the EVPN updates received from one eBGP peer nor forwards them to another eBGP peer. This rule applies to eBGP peering. For Spine-11 to pass the routes on in the way a route-reflector does, the command “retain route-target all” is needed under the global BGP L2VPN EVPN address-family (example 1-16). Note that this command only affects which routes are retained. It does not preserve the next-hop address carried in the update.
Spine-11(config)# router bgp 65099
Spine-11(config-router)# address-family l2vpn evpn
Spine-11(config-router-af)# retain route-target all
Example 1-16: retain route-target all command on Spine-11.
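The effect of the command can be pictured as a filter on received routes. The following Python sketch is a simplified model with hypothetical names (`keep_route`): a route is kept when one of its Route-Targets matches a local import RT, or unconditionally when all Route-Targets are retained.

```python
def keep_route(route_rts, local_import_rts, retain_all=False):
    """Simplified Route-Target filter for received EVPN routes (assumed model)."""
    if retain_all:
        return True
    return bool(set(route_rts) & set(local_import_rts))


mac_ip_route_rts = ["RT:65000:10000", "RT:65000:10077"]

# Spine-11 has no EVPN instance, so it has no import Route-Targets.
print(keep_route(mac_ip_route_rts, local_import_rts=[]))

# With "retain route-target all" the route is kept anyway.
print(keep_route(mac_ip_route_rts, local_import_rts=[], retain_all=True))

# A leaf that imports RT:65000:10000 keeps the route without the command.
print(keep_route(mac_ip_route_rts, local_import_rts=["RT:65000:10000"]))
```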
Now, when the BGP L2VPN EVPN NLRIs are resent to Spine-11 by Leaf-101…
Leaf-101# clear bgp l2vpn evpn 192.168.100.11 soft out
Example 1-17: clear bgp l2vpn evpn on Leaf-101.
… the MAC-only and MAC-IP NLRIs are received and installed into the BGP table of Spine-11. Note! Timestamps are removed (entries are updated from bottom to top).
Spine-11# sh bgp internal event-history events | i cafe
RIB: [L2VPN EVPN] Add/delete 192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/112,
flags=0x200, evi_ctx invalid, in_rib: no
RIB: [L2VPN EVPN] Add/delete
192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/144,
flags=0x200, evi_ctx invalid, in_rib: no
BRIB: [L2VPN EVPN]
(192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/112
(192.168.100.101)): returning from bgp_brib_add, reeval=0new_path: 1, change:
1, undelete: 0, history: 0, force: 0, (pflags=0x40002020) rnh_flag_change 0
BRIB: [L2VPN EVPN]
(192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/112
(192.168.100.101)): bgp_brib_add: handling nexthop, path->flags2: 0x80000
BRIB: [L2VPN EVPN] Created new path to
192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/112 via
192.168.0.101 (pflags=0x40000000, pflags2=0x0)
BRIB: [L2VPN EVPN] Installing prefix
192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/112
(192.168.100.101) via 192.168.50.101 label 10000 (0x0/0x0) into BRIB with
extcomm Extcommunity: RT:65000:10000 ENCAP:8
BRIB: [L2VPN EVPN]
(192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/144
(192.168.100.101)): returning from bgp_brib_add, reeval=0new_path: 1, change:
1, undelete: 0, history: 0, force: 0, (pflags=0x40002020) rnh_flag_c
BRIB: [L2VPN EVPN]
(192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/144
(192.168.100.101)): bgp_brib_add: handling nexthop, path->flags2: 0x80000
BRIB: [L2VPN EVPN] Created new path to 192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/144
via 192.168.0.101 (pflags=0x40000000, pflags2=0x0)
BRIB: [L2VPN EVPN] Installing prefix
192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/144
(192.168.100.101) via 192.168.50.101 label 10000 (0x0/0x0) into BRIB with
extcomm Extcommunity: RT:65000:10000 RT:65000:10077 ENC
Example 1-18: “sh bgp internal event-history events | i cafe” on Spine-11.
This can also be verified by checking the BGP table.
Spine-11# sh bgp l2vpn evpn 1000.0010.cafe
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.168.0.101:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216, version 93
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW
  Advertised path-id 1
  Path type: external, path is valid, is best path
  AS-Path: 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.101 (192.168.0.101)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 ENCAP:8
  Path-id 1 advertised to peers:
    192.168.100.102
BGP routing table entry for [2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 90
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW
  Advertised path-id 1
  Path type: external, path is valid, is best path
  AS-Path: 65000 , path sourced external to AS
    192.168.50.101 (metric 0) from 192.168.100.101 (192.168.0.101)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10077
      Extcommunity: RT:65000:10000 RT:65000:10077 ENCAP:8 Router MAC:5e00.0000.0007
  Path-id 1 advertised to peers:
    192.168.100.102
Example 1-19: sh bgp l2vpn evpn 1000.0010.cafe on Spine-11.
Example 1-20 shows that Leaf-102 has received the BGP Update from Spine-11. Closer examination of the BGP table shows that the next-hop is Spine-11, though it should be Leaf-101.
Leaf-102# sh bgp l2vpn evpn 1000.0010.cafe
! <------
COMMENT: BGP Adj-RIB-In information ----->
! <--- Comment: MAC-only entry
--->
BGP routing table information for
VRF default, address family L2VPN EVPN
Route Distinguisher:
192.168.0.101:32777
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216, version 19
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is
not in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path
Imported to 1 destination(s)
AS-Path: 65099 65000 , path sourced external to AS
192.168.100.11 (metric 0) from
192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000
Extcommunity: RT:65000:10000 ENCAP:8
Path-id 1 not advertised to any peer
! <--- Comment: MAC-IP entry --->
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 4
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is
not in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path
Imported to 3 destination(s)
AS-Path: 65099 65000 , path sourced external to AS
192.168.100.11 (metric 0) from
192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000 10077
Extcommunity: RT:65000:10000
RT:65000:10077 ENCAP:8 Router MAC:5e00.0000.0007
Path-id 1 not advertised to any peer
! <------
COMMENT: BGP Loc-RIB information (from Adj-RIB-In)----->
Route Distinguisher:
192.168.0.102:32777 (L2VNI 10000)
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216, version 20
Paths: (1 available, best #1)
Flags: (0x000212) on xmit-list, is
in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path, in rib
Imported from
192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[0]:[0.0.0.0]/216
AS-Path: 65099 65000 , path sourced external to AS
192.168.100.11 (metric 0) from
192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000
Extcommunity: RT:65000:10000 ENCAP:8
Path-id 1 not advertised to any peer
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 5
Paths: (1 available, best #1)
Flags: (0x000212) on xmit-list, is
in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path, in rib
Imported from
192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272
AS-Path: 65099 65000 , path sourced external to AS
192.168.100.11 (metric 0) from
192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000 10077
Extcommunity: RT:65000:10000
RT:65000:10077 ENCAP:8 Router MAC:5e00.0000.0007
Path-id 1 not advertised to any peer
Route Distinguisher:
192.168.0.102:3 (L3VNI 10077)
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272, version 6
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is
not in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path
Imported from
192.168.0.101:32777:[2]:[0]:[0]:[48]:[1000.0010.cafe]:[32]:[172.16.10.101]/272
AS-Path: 65099 65000 , path sourced external to AS
192.168.100.11 (metric 0) from
192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000 10077
Extcommunity: RT:65000:10000
RT:65000:10077 ENCAP:8 Router MAC:5e00.0000.0007
Path-id 1 not advertised to any peer
Example
1-20: show bgp l2vpn evpn
1000.0010.cafe on Leaf-102.
In addition, the “show nve peers detail” command shows that the NVE peering is between Leaf-102 and Spine-11, while it should be between Leaf-102 and Leaf-101 (192.168.50.101). The reason is that Spine-11 changes the next-hop to its own IP address when it forwards the BGP Update originated by Leaf-101 to Leaf-102, and the NVE peer information is taken from the next-hop field of the L2VPN EVPN BGP Update.
Leaf-102# sh nve peers detail
Details of nve Peers:
----------------------------------------
Peer-Ip: 192.168.100.11
    NVE Interface       : nve1
    Peer State          : Up
    Peer Uptime         : 00:31:36
    Router-Mac          : 5e00.0000.0007
    Peer First VNI      : 10000
    Time since Create   : 00:31:36
    Configured VNIs     : 10000,10077
    Provision State     : peer-add-complete
    Learnt CP VNIs      : 10000,10077
    vni assignment mode : SYMMETRIC
    Peer Location       : N/A
Example 1-21: show nve peers detail on Leaf-102.
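The dependency between the BGP next-hop and the NVE peer can be illustrated with a small Python model. It is only a sketch with hypothetical names (`readvertise`, `nve_peers`): an eBGP speaker rewrites the next-hop to its own address unless told otherwise, and the receiving leaf builds its NVE peer list from the next-hops of the EVPN routes it has learned.

```python
def readvertise(route, speaker_ip, next_hop_unchanged=False):
    """Return the route as an eBGP speaker would pass it on (assumed model)."""
    forwarded = dict(route)
    if not next_hop_unchanged:
        forwarded["next_hop"] = speaker_ip  # default eBGP behavior
    return forwarded


def nve_peers(evpn_routes):
    """NVE peers are taken from the next-hop of the learned EVPN routes."""
    return sorted({route["next_hop"] for route in evpn_routes})


# MAC route for host Cafe as originated by Leaf-101 (NVE source 192.168.50.101).
cafe = {"mac": "1000.0010.cafe", "next_hop": "192.168.50.101"}
spine_ip = "192.168.100.11"

# Default behavior: Leaf-102 ends up with Spine-11 as its NVE peer.
print(nve_peers([readvertise(cafe, spine_ip)]))

# Next-hop left unchanged: Leaf-102 peers with Leaf-101 as intended.
print(nve_peers([readvertise(cafe, spine_ip, next_hop_unchanged=True)]))
```

The first result corresponds to the Peer-Ip 192.168.100.11 seen in example 1-21.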
This means that there is no L2/L3
connectivity between host Café and host Abba as can be seen from example 1-22.
Cafe#ping 172.16.10.102
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.10.102, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Example 1-22: ping from host Café to host Abba.
Capture 1-1 is taken from the link between Leaf-101 and Spine-11 while host Café tries to ping host Abba. First, since the hosts are in the same subnet 172.16.10.0/24, host Cafe has to resolve the MAC address of host Abba. It sends an ARP request (L2 broadcast).
Ethernet II, Src: 10:00:00:10:ca:fe, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)
  Hardware type: Ethernet (1)
  Protocol type: IPv4 (0x0800)
  Hardware size: 6
  Protocol size: 4
  Opcode: request (1)
  Sender MAC address: 10:00:00:10:ca:fe
  Sender IP address: 172.16.10.101
  Target MAC address: 00:00:00:00:00:00
  Target IP address: 172.16.10.102
Capture 1-1: ARP request from host Cafe.
ARP suppression is implemented in VNI 10000 on both Leaf switches. Since Leaf-101 knows the MAC and IP addresses of host Abba (learned via BGP), it answers the ARP request on behalf of host Abba by sending an ARP Reply as a unicast straight to host Café.
Ethernet II, Src: 10:00:00:10:ab:ba, Dst: 10:00:00:10:ca:fe
Address Resolution Protocol (reply)
  Hardware type: Ethernet (1)
  Protocol type: IPv4 (0x0800)
  Hardware size: 6
  Protocol size: 4
  Opcode: reply (2)
  Sender MAC address: 10:00:00:10:ab:ba
  Sender IP address: 172.16.10.102
  Target MAC address: 10:00:00:10:ca:fe
  Target IP address: 172.16.10.101
Capture 1-2: ARP reply from Leaf-101.
Now host Cafe has resolved the MAC address of host Abba, and it sends an ICMP request towards host Abba. Leaf-101 receives the ICMP request and makes a forwarding decision based on the L2 RIB, where the next-hop incorrectly points to Spine-11 (example 1-23).
Leaf-101# sh l2route mac all
Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv (AD):Auto-Delete (D):Del Pending
(S):Stale (C):Clear, (Ps):Peer Sync (O):Re-Originated (Nho):NH-Override
(Pf):Permanently-Frozen
Topology    Mac Address    Prod   Flags         Seq No     Next-Hops
----------- -------------- ------ ------------- ---------- ----------------
10          1000.0010.abba BGP    SplRcv        0          192.168.100.11
10          1000.0010.cafe Local  L,            0          Eth1/3
77          5e00.0002.0007 VXLAN  Rmac          0          192.168.100.11
Example 1-23: L2 RIB on Leaf-101.
Leaf-101 encapsulates the frame and sets the outer destination IP to 192.168.100.11 (Capture 1-3). When Spine-11 receives the packet, it has no way to forward it and drops the packet.
Ethernet II, Src: 1e:af:01:01:1e:11, Dst: c0:8e:00:11:1e:11
Internet Protocol Version 4, Src: 192.168.50.101, Dst: 192.168.100.11
User Datagram Protocol, Src Port: 54810, Dst Port: 4789
Virtual eXtensible Local Area Network
  Flags: 0x0800, VXLAN Network ID (VNI)
  Group Policy ID: 0
  VXLAN Network Identifier (VNI): 10000
  Reserved: 0
Ethernet II, Src: Private_10:ca:fe (10:00:00:10:ca:fe), Dst: Private_10:ab:ba (10:00:00:10:ab:ba)
Internet Protocol Version 4, Src: 172.16.10.101, Dst: 172.16.10.102
Internet Control Message Protocol
Capture 1-3: Frame forwarded by Leaf-101.
Figure 1-4: ICMP process.
To fix this, Spine-11 has to send L2VPN EVPN BGP Updates without modifying the Next-Hop Path Attribute. First, a route-map is created that prevents the next-hop modification. The route-map is then applied in the outbound direction to both leaf neighbors under the L2VPN EVPN address-family (example 1-24).
route-map DO-NOT-MODIFY-NH permit 10
  set ip next-hop unchanged
!
router bgp 65099
  router-id 192.168.0.11
  address-family ipv4 unicast
    network 192.168.100.11/32
    network 192.168.238.0/29
  address-family l2vpn evpn
    nexthop route-map DO-NOT-MODIFY-NH
    retain route-target all
  neighbor 10.101.11.101
    remote-as 65000
    description ** BGP Underlay to Leaf-101 **
    address-family ipv4 unicast
      disable-peer-as-check
  neighbor 10.102.11.102
    remote-as 65000
    description ** BGP Underlay to Leaf-102 **
    address-family ipv4 unicast
      disable-peer-as-check
  neighbor 192.168.100.101
    remote-as 65000
    description ** BGP Overlay to Leaf-101 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      disable-peer-as-check
      send-community
      send-community extended
      route-map DO-NOT-MODIFY-NH out
  neighbor 192.168.100.102
    remote-as 65000
    description ** BGP Overlay to Leaf-102 **
    update-source loopback100
    ebgp-multihop 2
    address-family l2vpn evpn
      disable-peer-as-check
      send-community
      send-community extended
      route-map DO-NOT-MODIFY-NH out
Example 1-24: Spine-11 BGP final configuration.
Now, Leaf-101 learns MAC/IP routes with correct next-hop information.
Leaf-101# sh bgp l2vpn evpn
1000.0010.abba
BGP routing table information for
VRF default, address family L2VPN EVPN
Route Distinguisher:
192.168.0.101:32777 (L2VNI 10000)
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216, version 395
Paths: (1 available, best #1)
Flags: (0x000212) on xmit-list, is
in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path, in rib
Imported from
192.168.0.102:32777:[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216
AS-Path: 65099 65000 , path sourced external to AS
192.168.50.102 (metric 0) from 192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000
Extcommunity: RT:65000:10000 ENCAP:8
Path-id 1 not advertised to any peer
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.102]/272, version 369
Paths: (1 available, best #1)
Flags: (0x000212) on xmit-list, is
in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path, in rib
Imported from
192.168.0.102:32777:[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.102]/272
AS-Path: 65099 65000 , path sourced external to AS
192.168.50.102 (metric 0) from 192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000 10077
Extcommunity: RT:65000:10000
RT:65000:10077 ENCAP:8 Router MAC:5e00.0002.0007
Path-id 1 not advertised to any peer
Route Distinguisher:
192.168.0.102:32777
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.abba]:[0]:[0.0.0.0]/216, version 394
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is
not in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path
Imported to 1 destination(s)
AS-Path: 65099 65000 , path sourced external to AS
192.168.50.102 (metric 0) from 192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000
Extcommunity: RT:65000:10000 ENCAP:8
Path-id 1 not advertised to any peer
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.102]/272, version 367
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is
not in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path
Imported to 3 destination(s)
AS-Path: 65099 65000 , path sourced external to AS
192.168.50.102 (metric 0) from 192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000 10077
Extcommunity: RT:65000:10000
RT:65000:10077 ENCAP:8 Router MAC:5e00.0002.0007
Path-id 1 not advertised to any peer
Route Distinguisher:
192.168.0.101:3 (L3VNI 10077)
BGP routing table entry for
[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.102]/272, version 370
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is
not in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: external, path is
valid, is best path
Imported from
192.168.0.102:32777:[2]:[0]:[0]:[48]:[1000.0010.abba]:[32]:[172.16.10.102]/272
AS-Path: 65099 65000 , path sourced external to AS
192.168.50.102 (metric 0) from 192.168.100.11 (192.168.0.11)
Origin IGP, MED not set, localpref 100,
weight 0
Received label 10000 10077
Extcommunity: RT:65000:10000
RT:65000:10077 ENCAP:8 Router MAC:5e00.0002.0007
Path-id 1 not advertised to any peer
Example
1-25: BGP table on Leaf-101
concerning host Abba.
The NVE peer information (example 1-26), as well as the L2 routing information in the L2 RIB (example 1-27), is now correct.
Leaf-102# sh nve peer detail
Details of nve Peers:
----------------------------------------
Peer-Ip: 192.168.50.101
    NVE Interface       : nve1
    Peer State          : Up
    Peer Uptime         : 01:03:58
    Router-Mac          : 5e00.0000.0007
    Peer First VNI      : 10000
    Time since Create   : 01:03:58
    Configured VNIs     : 10000,10077
    Provision State     : peer-add-complete
    Learnt CP VNIs      : 10000,10077
    vni assignment mode : SYMMETRIC
    Peer Location       : N/A
Example 1-26: NVE peer information on Leaf-102.
Leaf-101# sh l2route mac all
Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv (AD):Auto-Delete (D):Del Pending
(S):Stale (C):Clear, (Ps):Peer Sync (O):Re-Originated (Nho):NH-Override
(Pf):Permanently-Frozen
Topology    Mac Address    Prod   Flags         Seq No     Next-Hops
----------- -------------- ------ ------------- ---------- ----------------
10          1000.0010.abba BGP    SplRcv        0          192.168.50.102
10          1000.0010.cafe Local  L,            0          Eth1/3
77          5e00.0002.0007 VXLAN  Rmac          0          192.168.50.102
Example 1-27: L2 RIB on Leaf-101.
Host Cafe is now able to ping host Abba.
Cafe#ping 172.16.10.102
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.16.10.102, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 19/25/33 ms
Example 1-28: ping from host Cafe to host Abba.
As a final verification, capture 1-4 shows that ICMP packets are sent inside VXLAN encapsulation to 192.168.50.102, the NVE interface address of Leaf-102.
Ethernet II, Src: 1e:af:01:01:1e:11, Dst: c0:8e:00:11:1e:11
IPv4, Src: 192.168.50.101, Dst: 192.168.50.102
User Datagram Protocol, Src Port: 59959, Dst Port: 4789
Virtual eXtensible Local Area Network
  Flags: 0x0800, VXLAN Network ID (VNI)
  Group Policy ID: 0
  VXLAN Network Identifier (VNI): 10000
  Reserved: 0
Ethernet II, Src: 10:00:00:10:ca:fe, Dst: 10:00:00:10:ab:ba
IPv4, Src: 172.16.10.101, Dst: 172.16.10.102
Internet Control Message Protocol
Capture 1-4: VXLAN encapsulated ICMP packets.
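The VXLAN header fields shown in the capture can be reproduced with a short Python sketch. It follows the 8-byte header layout of RFC 7348 (8 flag bits, 24 reserved bits, a 24-bit VNI, and 8 reserved bits). The function names are hypothetical, and the sketch builds only the VXLAN header, not the outer Ethernet, IP, or UDP headers.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: a valid VNI is present


def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte. Word 2: VNI in the top three bytes.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return word2 >> 8


header = pack_vxlan_header(10000)
print(header.hex())        # the first two bytes are the 0x0800 flags field
print(unpack_vni(header))  # VNI 10000, as in the capture
```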
Author: Toni Pasanen
CCIE#28158
Published: 3.5.2019
Updated:
hi Toni,
please allow me to ask one question here:
on spine switches command "retain route-target" is used as Spine11 does not have any vrf configured so L2VPN EVPN NLRIs installed into BGP table or RIB.
while when using BGP or ospf as underlay network protocol, we do not use this command and in these cases Spine 11 does not have EVPN instance as well. why do spine11 in these case forward EVPN NLRIs from leaf101 to Leaf102 without command "retain route-target"?
Regards
Michael
I just changed the sentence. Does it now make more sense?
it makes perfect sentences now!
Thanks Toni.
Michael