Friday, 19 January 2024

BGP EVPN Part III: BGP EVPN Local Learning Fundamentals

Multi-Protocol BGP (MP-BGP) is a BGP-4 extension that enables BGP speakers to encode Network Layer Reachability Information (NLRI) of various address types, such as IPv4/IPv6, VPNv4, and MAC addresses, into BGP Update messages. MP-BGP features an MP_REACH_NLRI Path Attribute (PA), which uses an Address Family Identifier (AFI) to describe the service category. The Subsequent Address Family Identifier (SAFI), in turn, defines the solution used for providing the service. For example, L2VPN (AFI 25) is the primary category for Layer-2 VPN services, and the Ethernet Virtual Private Network (EVPN: SAFI 70) provides the service. Another L2VPN service is Virtual Private LAN Service (VPLS: SAFI 65). The main differences between these two L2VPN services are that only EVPN supports active/active multihoming, has a control-plane-based MAC address learning mechanism, and operates over an IP-routed infrastructure.
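
To make the AFI/SAFI pairing concrete, here is a minimal Python sketch of how the fixed fields at the start of the MP_REACH_NLRI attribute value could be packed for L2VPN EVPN. The function name and the next-hop address are illustrative assumptions, and the sketch leaves out the attribute flags, type, and length as well as the NLRI itself.

```python
import socket
import struct

AFI_L2VPN = 25   # Layer-2 VPN service category
SAFI_EVPN = 70   # EVPN solution
SAFI_VPLS = 65   # VPLS solution

def mp_reach_header(afi: int, safi: int, next_hop: str) -> bytes:
    """Pack AFI (2 bytes), SAFI (1), next-hop length (1), next hop, reserved (1)."""
    nh = socket.inet_aton(next_hop)  # IPv4 next hop only in this sketch
    return struct.pack("!HBB", afi, safi, len(nh)) + nh + b"\x00"

# Illustrative next-hop address (assumption, not taken from a capture).
header = mp_reach_header(AFI_L2VPN, SAFI_EVPN, "192.168.100.101")
print(header.hex())
```

The first three bytes (0x0019, 0x46) are what tell the receiving BGP speaker that the NLRI that follows belongs to the L2VPN EVPN address family.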

EVPN utilizes various Route Types (EVPN RT) to describe the Network Layer Reachability Information (NLRI) associated with Unicast and BUM (Broadcast, Unknown unicast, and Multicast) traffic, as well as ESI Multihoming. The following sections explain how EVPN RT 2 (MAC/IP Advertisement Route) is employed to distribute MAC and IP address information of Tenant Systems, which makes it possible to extend a VLAN over a routed infrastructure.

A Tenant System (TS) refers to a host, virtual machine, or an intra-tenant forwarding component attached to one or more Tenant-specific Virtual Networks. Examples of TS forwarding components include firewalls, load balancers, switches, and routers.

TS’s MAC Address Local Processing - Basics

Switch Leaf-101, in Figure 1-2, serves as our VXLAN Tunnel Endpoint (VTEP) device, supporting the BGP L2VPN EVPN address family. By configuring an EVPN instance (EVI) on the VTEP, we deploy an instance-specific MAC-VRF, a Virtual Routing and Forwarding table for MAC addresses (L2RIB). Each EVI is identified by a Layer-2 Virtual Network Identifier (L2VNI). In the VXLAN header on the data plane, a remote VTEP uses the L2VNI to identify the EVPN instance to the local VTEP. Besides the L2VNI, each EVI has a unique Route Target (RT), an extended community path attribute used for import/export policies, and a Route Distinguisher (RD), which allows tenants to use overlapping MAC and IP addresses.

In Figure 1-2, we have an EVPN Instance (MAC-VRF) to which we have assigned L2VNI 10000. The RT associated with the EVI is AS-specific, with the auto-derived service-Id (L2VNI) as the Local Administrator (AS:L2VNI = 65000:10000). The attached RD, in turn, is auto-derived from the BGP Router-Id and the VLAN-Id added to the base number 32767 (192.168.77.101:32777).
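
The auto-derivation arithmetic above can be sketched in a few lines of Python. This is only an illustration of the two formulas described in this post; the function names are made up, and the RD administrator address is passed in as a plain parameter (here the BGP Router-Id 192.168.77.101 that appears in the NLRI description later in this post).

```python
RD_BASE = 32767  # base number that the VLAN-Id is added to

def auto_rd(admin_ip: str, vlan_id: int) -> str:
    """Auto-derived RD: <administrator IP>:<32767 + VLAN-Id>."""
    return f"{admin_ip}:{RD_BASE + vlan_id}"

def auto_rt(asn: int, l2vni: int) -> str:
    """Auto-derived RT: <AS number>:<L2VNI>."""
    return f"{asn}:{l2vni}"

rd = auto_rd("192.168.77.101", vlan_id=10)
rt = auto_rt(65000, l2vni=10000)
print(rd, rt)
```

For VLAN 10 the local administrator part of the RD becomes 32767 + 10 = 32777, and the RT becomes 65000:10000.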

Tenant System (TS-A1) in Figure 1-2 is connected to VLAN 10 via an Attachment Circuit (AC), Ethernet 1/2 (an AC can be either a physical or logical interface). Leaf-101, serving as a VTEP device, must employ local MAC address learning from the data plane. For instance, when TS-A1 goes live, it may validate the uniqueness of its IP address using Gratuitous ARP (GARP) to check if its IP address has already been assigned to another TS. Leaf-101 then records the source MAC address from the ingress GARP message's Ethernet header into the Bridge Table of VLAN 10. The Next-Hop for the MAC address is AC E1/2. In addition to saving the information into the Bridge Table, Leaf-101 initiates the MAC entry aging timer, which defaults to 1800 seconds (30 minutes).
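
The data-plane learning step can be modeled roughly as follows. This is a simplified Python sketch, not how a switch implements it: the class and method names are hypothetical, the MAC address of TS-A1 is a made-up documentation value, and time is passed in explicitly so that the aging behavior is easy to follow.

```python
from dataclasses import dataclass

MAC_AGING_SECONDS = 1800  # default aging time mentioned above (30 minutes)

@dataclass
class BridgeEntry:
    port: str         # Attachment Circuit the MAC was learned from
    last_seen: float  # timestamp of the most recent frame from this MAC

class BridgeTable:
    def __init__(self, aging=MAC_AGING_SECONDS):
        self.aging = aging
        self.entries = {}  # (vlan, mac) -> BridgeEntry

    def learn(self, vlan, src_mac, port, now):
        """Record the source MAC of an ingress frame and restart its aging timer."""
        self.entries[(vlan, src_mac.lower())] = BridgeEntry(port, now)

    def expire(self, now):
        """Remove entries that have not been refreshed within the aging time."""
        aged = [k for k, e in self.entries.items() if now - e.last_seen >= self.aging]
        for key in aged:
            del self.entries[key]
        return aged

table = BridgeTable()
# GARP from TS-A1 arrives on Ethernet 1/2 in VLAN 10 (hypothetical MAC address).
table.learn(10, "00:00:5E:00:53:11", "Eth1/2", now=0.0)
```

Every new frame from the same source MAC calls learn() again, which refreshes the timestamp and keeps the entry from aging out.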

Once the MAC address information is saved in the Bridge Table, the Layer-2 Forwarder (L2FWDR) component replicates this information into the EVI’s L2RIB as a Local route. The VLAN-to-EVI mapping is done under the Layer 2 VLAN configuration. We are using the VLAN-Based Service Interface solution with a one-to-one mapping between VLAN and EVI, so there is only one VLAN per EVI.

In addition to MAC address information, the VTEP device, Leaf-101, learns IP address details from the ingress GARP message sent by TS-A1 and stores this information in the tenant-specific ARP table. It is important to note that the tenant must have the VRF Context (IP-VRF) configured. Furthermore, a Layer 3 VLAN interface is necessary for the VLAN (Broadcast Domain). In the absence of a Layer 3 VLAN interface, there will be no ARP table. The Host Mobility Manager (HMM) detects the ARP table update event and subsequently replicates the ARP entry to the tenant-specific local host database. Following this, the HMM updates the L2RIB by binding the IP address to the MAC address and adding information about the L3VNI assigned to the IP-VRF to the TS-A1 MAC address entry.

HMM also installs the IP address of TS-A1 (192.168.11.11/32) into the IP-VRF as a tenant-specific local IP route. It's important to note that host routes are utilized for inter-VN routing and are not advertised as an IP Prefix Route (EVPN RT 5).
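
The hand-off between the Bridge Table, L2FWDR, HMM, L2RIB, and IP-VRF can be summarized with a small Python model. It is a sketch of the sequence described above and nothing more: the function names, the MAC address, and the interface name Vlan10 are hypothetical, and L3VNI 10077 is an assumption inferred from the IP-VRF Route Target 65000:10077 used in this post.

```python
VLAN_TO_L2VNI = {10: 10000}  # VLAN-based service: one VLAN per EVI

l2rib = {}   # (l2vni, mac) -> route attributes
ip_vrf = {}  # prefix -> route attributes

def l2fwdr_add_local(vlan, mac, port):
    """L2FWDR: copy a Bridge Table entry into the EVI's L2RIB as a Local route."""
    l2vni = VLAN_TO_L2VNI[vlan]
    l2rib[(l2vni, mac)] = {"type": "local", "next_hop": port, "ip": None, "l3vni": None}

def hmm_on_arp_update(vlan, mac, ip, l3vni, svi):
    """HMM: bind the IP to the MAC in L2RIB and install a /32 host route in the IP-VRF."""
    entry = l2rib[(VLAN_TO_L2VNI[vlan], mac)]
    entry["ip"] = ip
    entry["l3vni"] = l3vni
    ip_vrf[f"{ip}/32"] = {"type": "local host", "interface": svi}

ts_a1_mac = "00:00:5e:00:53:11"  # hypothetical MAC address of TS-A1
l2fwdr_add_local(10, ts_a1_mac, "Eth1/2")
hmm_on_arp_update(10, ts_a1_mac, "192.168.11.11", l3vni=10077, svi="Vlan10")
print(l2rib[(10000, ts_a1_mac)])
print(ip_vrf)
```

After both steps, the L2RIB entry carries the MAC, the IP, and both VNIs, which is the information BGP needs for the MAC/IP route.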

Next, Leaf-101 constructs a routing entry in the BGP Loc-RIB describing the NLRI related to TS-A1. Two BGP Route Target (RT) path attributes are assigned to the BGP routing table entry: RT 65000:10000 (for the MAC-VRF) for MAC addresses and RT 65000:10077 (for the IP-VRF) for IP addresses. By utilizing these RTs, remote VTEP devices can import the received EVPN NLRI into their BGP tables.
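
The import decision on a remote VTEP boils down to a set intersection between the RTs carried by the route and the import RTs configured per VRF. The following Python sketch shows the idea; the VRF names and the third, non-matching VRF are hypothetical.

```python
def matching_vrfs(route_rts, vrf_import_rts):
    """Return the VRFs whose import RTs share at least one RT with the route."""
    return sorted(name for name, rts in vrf_import_rts.items() if route_rts & rts)

# RTs attached to the TS-A1 route by Leaf-101.
route_rts = {"65000:10000", "65000:10077"}

# Import policies on a remote VTEP (VRF names are made up for illustration).
remote_import_rts = {
    "MAC-VRF-10000": {"65000:10000"},
    "IP-VRF-TENANT77": {"65000:10077"},
    "MAC-VRF-20000": {"65000:20000"},
}

print(matching_vrfs(route_rts, remote_import_rts))
```

Only the MAC-VRF and IP-VRF with matching import RTs accept the route; the unrelated MAC-VRF ignores it.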

The EVI with L2VNI 10000 is a member VNI of the interface NVE1. The primary task of the NVE1 interface is to encapsulate egress frames with VXLAN headers and remove the encapsulation from ingress packets. The BGP Extended Community Encapsulation type VXLAN (type 8) is set based on the interface NVE1 settings.
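
As a reminder of where the VNI sits on the data plane, here is a minimal Python sketch that builds and parses the 8-byte VXLAN header (flags, reserved bits, 24-bit VNI, reserved byte). The function names are illustrative, and the outer IP/UDP headers that the NVE interface also adds are left out.

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: I flag set, 24-bit VNI, reserved bits zero."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # I flag: the VNI field is valid
    return struct.pack("!II", flags << 24, vni << 8)

def vxlan_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8

hdr = vxlan_header(10000)
print(hdr.hex(), vxlan_vni(hdr))
```

The receiving VTEP reads the VNI from this header to decide which EVI the inner frame belongs to.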

The system MAC, in turn, belongs to the VTEP device Leaf-101. But why do we need a system MAC address (Router MAC) extended community? VXLAN is a “MAC in IP/UDP” encapsulation model. Since inter-VN packets, routed over an IRB service interface, do not utilize the MAC address of the Tenant System, the NVE’s system MAC address is used as the source MAC address in the inner Ethernet header. Therefore, the IRB service interface must have a Layer 2 VLAN and a Layer 3 VLAN interface without an IP address. This configuration allows the local VTEP to use the system MAC address as the source in the inner Ethernet frame. The remote VTEP, in turn, uses it as the destination MAC address in the inner Ethernet frame.

The MP_REACH_NLRI Path Attribute carries the EVPN Network Layer Reachability Information (NLRI) about TS-A1. The Next-Hop is the IP address of the interface NVE1. The EVPN Route Type is “MAC/IP Advertisement Route” (RT 2). The Route Distinguisher global administrator is the BGP Router-Id 192.168.77.101, and the local administrator is 32777 (RD: 192.168.77.101:32777). MAC and IP addresses, as well as VNIs, are taken from the L2RIB. From the BGP Loc-RIB, the NLRI information is sent through the BGP policy engine to the BGP Adj-RIB-Out and then to the BGP peers, the Spine switches, which, after processing the received BGP Update message, forward the information to other VTEP switches.
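
To show how these pieces fit together on the wire, the following Python sketch encodes a MAC/IP Advertisement Route NLRI using the field order defined in RFC 7432, with the VNIs placed in the 3-byte label fields as RFC 8365 describes for VXLAN. Several inputs are assumptions: the MAC address of TS-A1 is made up, L3VNI 10077 is inferred from the IP-VRF Route Target, and the ESI and Ethernet Tag are set to zero on the assumption of a single-homed host and a VLAN-based service. The function names are illustrative.

```python
import socket
import struct

def encode_rd_type1(ip: str, number: int) -> bytes:
    """RD type 1: 2-byte type, 4-byte IPv4 administrator, 2-byte assigned number."""
    return struct.pack("!H", 1) + socket.inet_aton(ip) + struct.pack("!H", number)

def encode_vni(vni: int) -> bytes:
    """Carry the 24-bit VNI in the 3-byte label field."""
    return vni.to_bytes(3, "big")

def encode_mac_ip_route(rd: bytes, mac: str, ip: str, l2vni: int, l3vni=None) -> bytes:
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)  # IPv4 only in this sketch
    body = (
        rd
        + bytes(10)                   # ESI: all zeros (assumed single-homed)
        + struct.pack("!I", 0)        # Ethernet Tag ID 0 (assumed VLAN-based service)
        + bytes([48]) + mac_bytes     # MAC address length in bits, then the MAC
        + bytes([32]) + ip_bytes      # IP address length in bits, then the IP
        + encode_vni(l2vni)           # label 1: L2VNI
        + (encode_vni(l3vni) if l3vni is not None else b"")  # label 2: L3VNI
    )
    return bytes([2, len(body)]) + body  # route type 2, length, route body

rd = encode_rd_type1("192.168.77.101", 32777)
nlri = encode_mac_ip_route(rd, "00:00:5e:00:53:11", "192.168.11.11", 10000, 10077)
print(nlri.hex())
```

With both VNIs present, the route body is 40 bytes long; without the IP-VRF information, the second label would simply be omitted.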

The upcoming post details the local VTEP operation and explains how the remote VTEP processes the received BGP Update message.


Figure 1-2: BGP EVPN with VXLAN – Building Blocks.
