Introduction
VXLAN is a MAC-over-IP/UDP tunneling mechanism that allows Layer 2 segments to be stretched over a Layer 3 network (Underlay/Transport). In this article, I show one possible design of the Underlay network, along with basic configurations and monitoring commands. At the end of the article, you can find a mind map as a memory aid.
Our example network consists of four Cisco Nexus 9000 switches. The edge switches Leaf-101 and Leaf-102 work as VTEP (VXLAN Tunnel Endpoint) devices. A VTEP encapsulates Ethernet frames received from directly connected hosts with a VXLAN header and removes the VXLAN header from packets received from another VTEP switch. Spine-11 and Spine-12 are the core switches. They are not aware of the hosts/VMs behind the VTEP Leaf switches; Spine switches only route packets between the VTEP switches.
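To make the encapsulation step more concrete, here is a minimal Python sketch of what a VTEP adds and removes. It only builds the 8-byte VXLAN header described in RFC 7348 (flags, reserved bits, 24-bit VNI); the outer Ethernet, IP and UDP headers are left out. The function names, the VNI 10000 and the sample frame are my own assumptions for illustration, not something taken from the switches.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned destination UDP port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field carries a valid value


def vxlan_encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the 8-byte VXLAN header to an Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes) -> tuple[int, bytes]:
    """Strip the VXLAN header and return (vni, inner_frame)."""
    if len(payload) < 8:
        raise ValueError("payload is shorter than a VXLAN header")
    word1, word2 = struct.unpack("!II", payload[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word2 >> 8, payload[8:]


# Hypothetical inner frame: broadcast destination MAC, made-up source MAC, ARP EtherType.
frame = bytes.fromhex("ffffffffffff" "0050568a0001" "0806") + b"payload"
packet = vxlan_encapsulate(10000, frame)
vni, inner = vxlan_decapsulate(packet)
print(vni, inner == frame)
```

The ingress VTEP would run the first function before handing the result to UDP/IP, and the egress VTEP would run the second one after stripping the outer headers.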
Figure-1: Example topology
Routing protocols:
Routing protocols can be divided into three main groups: 1) hop-count based (RIP), 2) link-state (IS-IS, OSPF), and 3) vector-based protocols (EIGRP: distance-vector, BGP: path-vector). Link-state protocols calculate the best loop-free path through the network with the SPF algorithm. Their metric can reflect the link speed (OSPF derives its default interface cost from bandwidth), and they support load sharing over equal-cost paths (ECMP). With a link-state protocol, every router in the routing area holds the same information about the network topology, while EIGRP and RIP believe what the neighbor router tells them (routing by rumor). BGP is often used in Underlay networks, but unlike link-state protocols, its route selection is based on path attributes such as AS-path length; it does not consider link speed when selecting the best path. For these reasons, I have chosen OSPF for Underlay routing (and I know it better than IS-IS).
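The SPF and ECMP behavior can be illustrated with a small Python sketch that runs Dijkstra's algorithm on the four-switch example topology and keeps every equal-cost first hop. The link cost of 40 and the loopback cost of 1 are assumptions that I picked so the totals line up with the metrics 41 and 81 seen in the route table of this article; they are not read from the switches.

```python
import heapq

LINK_COST = 40      # assumed OSPF cost of each inter-switch link
LOOPBACK_COST = 1   # assumed OSPF cost of a loopback interface

TOPOLOGY = {
    "Leaf-101": {"Spine-11": LINK_COST, "Spine-12": LINK_COST},
    "Leaf-102": {"Spine-11": LINK_COST, "Spine-12": LINK_COST},
    "Spine-11": {"Leaf-101": LINK_COST, "Leaf-102": LINK_COST},
    "Spine-12": {"Leaf-101": LINK_COST, "Leaf-102": LINK_COST},
}


def spf(graph, root):
    """Return {node: (cost, first_hops)}, keeping all equal-cost first hops."""
    best = {root: (0, set())}
    queue = [(0, root)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best[node][0]:
            continue  # stale queue entry
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            # A neighbor of the root is its own first hop; otherwise inherit.
            first_hops = {neighbor} if node == root else best[node][1]
            known = best.get(neighbor)
            if known is None or new_cost < known[0]:
                best[neighbor] = (new_cost, set(first_hops))
                heapq.heappush(queue, (new_cost, neighbor))
            elif new_cost == known[0]:
                known[1].update(first_hops)  # equal-cost path: add to ECMP set
    return best


for node, (cost, hops) in sorted(spf(TOPOLOGY, "Leaf-102").items()):
    if node != "Leaf-102":
        print(node, cost + LOOPBACK_COST, sorted(hops))
```

From Leaf-102, the sketch reaches Leaf-101 through both Spine switches at the same cost, which is the ECMP case, while each Spine is reached over a single link.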
IP addressing
Inter-switch link:
All links between switches are Point-to-Point (P2P) links. It is common practice to use a /30 or /31 network mask on P2P links. Instead of using a dedicated subnet for each inter-switch link, I am going to use an unnumbered IP addressing scheme where the link addresses are borrowed from the Loopback 0 interface.
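As a rough illustration of what the unnumbered scheme saves, the following Python sketch carves a numbered plan for the four inter-switch links of the example topology and counts the subnets and addresses it would consume. The 10.0.0.0/24 block is a hypothetical range chosen only for this comparison.

```python
import ipaddress

# The four leaf-to-spine links of the example topology.
P2P_LINKS = [
    ("Leaf-101", "Spine-11"), ("Leaf-101", "Spine-12"),
    ("Leaf-102", "Spine-11"), ("Leaf-102", "Spine-12"),
]


def numbered_plan(block: str, prefix_len: int):
    """Assign one subnet of the given prefix length to each P2P link."""
    subnets = ipaddress.ip_network(block).subnets(new_prefix=prefix_len)
    return {link: next(subnets) for link in P2P_LINKS}


for prefix_len in (30, 31):
    plan = numbered_plan("10.0.0.0/24", prefix_len)  # hypothetical link block
    used = sum(net.num_addresses for net in plan.values())
    print(f"/{prefix_len}: {len(plan)} subnets, {used} addresses")

# With ip unnumbered, no link subnets are needed at all.
print("unnumbered: 0 subnets, 0 addresses (links reuse Loopback 0)")
```

The numbers grow with every added Leaf or Spine switch in a numbered design, while the unnumbered design only ever needs one Loopback 0 address per switch.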
Loopback 0:
As already mentioned, the inter-switch links borrow the Loopback 0 IP address. Loopback 0 is also used for underlay routing and as the OSPF router ID (RID).
Loopback 100:
Loopback 100 is used as the VTEP address. We could use the Loopback 0 address as both the RID and the VTEP address, but with a dedicated VTEP IP address we can remove a Leaf switch from the VXLAN domain simply by shutting down Loopback 100. The switch leaves the VXLAN domain but stays in the Underlay network, so we can investigate possible problems in the underlay network without disturbing server traffic.
Configuration examples
Note that the “ip host” statements are optional, as is the last line, “name-lookup”, under the OSPF configuration. With those optional commands, “show ip ospf neighbors” displays the switch names instead of the RID IP addresses.
Configuration example 1: Leaf-101.
hostname Leaf-101
feature ospf
!
ip host Leaf-101 192.168.0.101
ip host Leaf-102 192.168.0.102
ip host Spine-11 192.168.0.11
ip host Spine-12 192.168.0.12
!
interface Ethernet1/1
  no switchport
  medium p2p
  ip unnumbered loopback0
  ip ospf network point-to-point
  ip router ospf UNDERLAY-NET area 0.0.0.0
  no shutdown
interface Ethernet1/2
  no switchport
  medium p2p
  ip unnumbered loopback0
  ip ospf network point-to-point
  ip router ospf UNDERLAY-NET area 0.0.0.0
  no shutdown
interface loopback0
  description ** RID/Underlay **
  ip address 192.168.0.101/32
  ip router ospf UNDERLAY-NET area 0.0.0.0
!
interface loopback100
  description ** VTEP/Overlay **
  ip address 192.168.100.101/32
  ip router ospf UNDERLAY-NET area 0.0.0.0
!
router ospf UNDERLAY-NET
  router-id 192.168.0.101
  name-lookup
Configuration example 2: Spine-11.
hostname Spine-11
feature ospf
ip host Leaf-101 192.168.0.101
ip host Spine-12 192.168.0.12
ip host Spine-11 192.168.0.11
ip host Leaf-102 192.168.0.102
!
interface Ethernet1/1
  no switchport
  medium p2p
  ip unnumbered loopback0
  ip ospf network point-to-point
  ip router ospf UNDERLAY-NET area 0.0.0.0
  no shutdown
interface Ethernet1/2
  no switchport
  medium p2p
  ip unnumbered loopback0
  ip ospf network point-to-point
  ip router ospf UNDERLAY-NET area 0.0.0.0
  no shutdown
interface loopback0
  description ** RID/Underlay **
  ip address 192.168.0.11/32
  ip router ospf UNDERLAY-NET area 0.0.0.0
!
router ospf UNDERLAY-NET
  router-id 192.168.0.11
  name-lookup
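Since the Leaf and Spine configurations differ only in the name, the addresses and the presence of Loopback 100, they can be generated from a few parameters. The Python sketch below is one hypothetical way to do that. The function name and its parameters are my own, the optional “ip host” and “name-lookup” lines are left out, and I assume that Leaf-102 follows the same addressing pattern as Leaf-101.

```python
def render_underlay(name, node_id, uplinks, vtep=False, process="UNDERLAY-NET"):
    """Render the underlay configuration of one switch as NX-OS style text."""
    rid = f"192.168.0.{node_id}"   # Loopback 0: RID and unnumbered source
    lines = [f"hostname {name}", "feature ospf", "!"]
    for port in uplinks:
        lines += [
            f"interface {port}",
            "  no switchport",
            "  medium p2p",
            "  ip unnumbered loopback0",
            "  ip ospf network point-to-point",
            f"  ip router ospf {process} area 0.0.0.0",
            "  no shutdown",
        ]
    lines += [
        "interface loopback0",
        "  description ** RID/Underlay **",
        f"  ip address {rid}/32",
        f"  ip router ospf {process} area 0.0.0.0",
        "!",
    ]
    if vtep:
        # Only Leaf (VTEP) switches get the dedicated Loopback 100.
        lines += [
            "interface loopback100",
            "  description ** VTEP/Overlay **",
            f"  ip address 192.168.100.{node_id}/32",
            f"  ip router ospf {process} area 0.0.0.0",
            "!",
        ]
    lines += [f"router ospf {process}", f"  router-id {rid}"]
    return "\n".join(lines)


print(render_underlay("Leaf-102", 102, ["Ethernet1/1", "Ethernet1/2"], vtep=True))
```

Calling the function with vtep=False and node_id=12 would produce the matching Spine-12 configuration without Loopback 100.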
Monitoring
Show command example 1: Leaf-101 – show ip ospf neighbors.
Leaf-101# sh ip ospf neighbors
 OSPF Process ID UNDERLAY-NET VRF default
 Total number of neighbors: 2
 Neighbor ID     Pri State            Up Time  Address         Interface
 Spine-11          1 FULL/ -          00:04:34 192.168.0.11    Eth1/1
 Spine-12          1 FULL/ -          00:03:24 192.168.0.12    Eth1/2
Show command example 2: Spine-11 – show ip ospf neighbors.
There are two equal-cost paths between the Leaf switches, and OSPF will use both of them. Note that ECMP load sharing is based on the 5-tuple (source/destination IP address, transport protocol, and source/destination port of the transport protocol). In the outer headers of VXLAN packets between two VTEPs, the only value that changes is the source UDP port number, which is calculated from the inner frame. This way, traffic flows from different hosts/VMs can be told apart and sent over different physical links.
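The idea can be sketched in Python as follows. CRC32 stands in for the hash function, which in real hardware is platform-specific, and the source port is mapped into the dynamic range 49152-65535 that RFC 7348 recommends. The MAC addresses and the Leaf-102 VTEP address 192.168.100.102 are assumptions for illustration only.

```python
import zlib

VXLAN_UDP_PORT = 4789           # destination port is the same for every flow
UPLINKS = ["Eth1/1", "Eth1/2"]  # the two equal-cost links towards the Spines


def vxlan_source_port(inner_headers: bytes) -> int:
    """Derive the outer UDP source port from a hash of the inner frame headers."""
    return 49152 + zlib.crc32(inner_headers) % 16384


def pick_uplink(src_ip, dst_ip, protocol, src_port, dst_port):
    """Pick an equal-cost link from a hash of the outer 5-tuple."""
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    return UPLINKS[zlib.crc32(key) % len(UPLINKS)]


# Hypothetical inner flows (source MAC, destination MAC) between the same two VTEPs.
flows = [
    ("0050.568a.0001", "0050.568a.0101"),
    ("0050.568a.0002", "0050.568a.0102"),
    ("0050.568a.0003", "0050.568a.0103"),
    ("0050.568a.0004", "0050.568a.0104"),
]

for src_mac, dst_mac in flows:
    sport = vxlan_source_port(f"{src_mac}{dst_mac}".encode())
    link = pick_uplink("192.168.100.101", "192.168.100.102", "udp",
                       sport, VXLAN_UDP_PORT)
    print(src_mac, "->", dst_mac, "sport", sport, "via", link)
```

Every packet of one inner flow gets the same source port and therefore stays on the same link, while different flows can land on different links even though the outer IP addresses are identical.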
Show command example 3: Leaf-102 – show ip route ospf.
Leaf-102# sh ip route ospf
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.0.11/32, ubest/mbest: 1/0
    *via 192.168.0.11, Eth1/1, [110/41], 00:05:39, ospf-UNDERLAY-NET, intra
192.168.0.12/32, ubest/mbest: 1/0
    *via 192.168.0.12, Eth1/2, [110/41], 00:05:16, ospf-UNDERLAY-NET, intra
192.168.0.101/32, ubest/mbest: 2/0
    *via 192.168.0.11, Eth1/1, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
    *via 192.168.0.12, Eth1/2, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
192.168.100.101/32, ubest/mbest: 2/0
    *via 192.168.0.11, Eth1/1, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
    *via 192.168.0.12, Eth1/2, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
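If you collect this output with a script, a quick way to verify ECMP is to count the next hops per prefix. The Python sketch below parses a shortened copy of the route output above. The regular expressions are my own and assume each "*via" entry sits on a single line, so treat it as a starting point rather than a complete parser.

```python
import re

# Shortened copy of the "show ip route ospf" output from Leaf-102.
SAMPLE = """\
192.168.0.11/32, ubest/mbest: 1/0
    *via 192.168.0.11, Eth1/1, [110/41], 00:05:39, ospf-UNDERLAY-NET, intra
192.168.0.101/32, ubest/mbest: 2/0
    *via 192.168.0.11, Eth1/1, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
    *via 192.168.0.12, Eth1/2, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
192.168.100.101/32, ubest/mbest: 2/0
    *via 192.168.0.11, Eth1/1, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
    *via 192.168.0.12, Eth1/2, [110/81], 00:05:16, ospf-UNDERLAY-NET, intra
"""

PREFIX_RE = re.compile(r"^(\d+\.\d+\.\d+\.\d+/\d+), ubest/mbest: (\d+)/(\d+)")
VIA_RE = re.compile(r"^\s*\*via (\S+), (\S+), \[(\d+)/(\d+)\]")


def parse_routes(text):
    """Return {prefix: [(next_hop, interface, metric), ...]}."""
    routes = {}
    current = None
    for line in text.splitlines():
        match = PREFIX_RE.match(line)
        if match:
            current = match.group(1)
            routes[current] = []
            continue
        match = VIA_RE.match(line)
        if match and current is not None:
            routes[current].append(
                (match.group(1), match.group(2), int(match.group(4)))
            )
    return routes


for prefix, hops in parse_routes(SAMPLE).items():
    interfaces = ", ".join(interface for _, interface, _ in hops)
    print(f"{prefix}: {len(hops)} next hop(s) via {interfaces}")
```

A remote Leaf loopback that shows only one next hop would be a hint that one of the Spine paths is missing from the underlay.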
VXLAN Unicast Routing Mind Map
Figure-2: Mind Map
Edited: February 9.3.2018 | Toni Pasanen CCIE#28158
Next part: VXLAN Part III. The Underlay network – Multidestination Traffic: Anycast-RP with PIM