Foreword
This article introduces the principles of Amazon Web Services Virtual Private
Cloud (AWS VPC) Control-Plane operation and Data-Plane encapsulation. It also
explains how the same kind of forwarding model can be achieved using standard
protocols. Amazon has not published the details of its VPC networking solution,
so this document relies on publicly available information and the author’s own
studies. My motivation for writing it was to point out that no matter how simple
and easy to manage Cloud Networking looks and feels, it is still as complex as
any other large-scale network.
Example Environment
Figure 1-1 illustrates an example AWS VPC environment running an imaginary
application on two Elastic Compute Cloud (EC2) instances, EC2-A and EC2-B. The
instance EC2-A will be launched on physical server Host-A, while the instance
EC2-B will later be launched on physical server Host-B. The VPC vpc-1a2b3c4d is
created in the Stockholm (eu-north-1) Region in Availability Zone (AZ)
eu-north-1c. The subnet 172.16.31.0/20 can be used in AZ eu-north-1c. The subnet
for instances is 172.31.10.0/24. Elastic Network Interface-1 (ENI1) with IP
address 172.31.10.10 will be attached to the instance EC2-A, and ENI2 with IP
address 172.31.10.20 will be attached to the instance EC2-B. For simplicity, the
same Security Group (SG) “sg-nwktimes”, which allows all data traffic between
EC2-A and EC2-B, is attached to both instances.
Inside both physical servers, there is a software router: Router-1 in Host-A and
Router-2 in Host-B. The servers use offload NICs to connect to the AZ Underlay
Network, and data traffic from instances is sent straight to the offload NIC,
bypassing the hypervisor. The AZ Backbone includes three routers: Router-3,
Router-4, and Router-5. There is also a Mapping Service that represents the
centralized Control Plane. It holds an Instance-to-Location Mapping Database
that has information about every EC2 instance running in a given VPC. Routers,
servers, and the Mapping Service use IPv6 addressing.
Figure 1-1: Overall example topology and IP addressing scheme.
Control-Plane: Instance EC2-A Registration Process by Host-A
Amazon has not published how the Mapping Service gets its Instance-to-Location
mapping information. However, there has to be some kind of registration process
in which the hosting server registers its instances with the Mapping Service.
The Mapping Service needs to know at least the following instance-specific
information: (A) the physical location (the hosting server), (B) the IP address
and MAC address, (C) the ENI to which the IP/MAC addresses are bound, and (D)
the VPC (tenant information) to which the instance belongs. Figure 1-2
illustrates the presumed mapping entry information related to instance EC2-A in
the Mapping Service database. The registration information can be sent over the
Underlay Network, or there might be a dedicated Control-Plane channel for
registration messages.
Figure 1-2: Instance EC2-A Registration Process by Host-A.
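The presumed mapping entry can be sketched in a few lines of Python. This is only an illustration of the idea, not Amazon's implementation: the class names, the field names, and the MAC address below are my own assumptions, while the VPC, ENI, and IP values come from the example environment.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MappingEntry:
    """One presumed Instance-to-Location record (field names are assumptions)."""
    vpc_id: str        # (D) tenant the instance belongs to
    eni_id: str        # (C) ENI that the IP/MAC addresses are bound to
    ip_address: str    # (B) instance IP address
    mac_address: str   # (B) instance MAC address
    host: str          # (A) physical server hosting the instance


class MappingService:
    """Toy centralized Control Plane holding the mapping database."""

    def __init__(self) -> None:
        # Keyed by (VPC, IP) so overlapping tenant address spaces stay separate.
        self.database: dict[tuple[str, str], MappingEntry] = {}

    def register(self, entry: MappingEntry) -> None:
        self.database[(entry.vpc_id, entry.ip_address)] = entry

    def locate(self, vpc_id: str, ip_address: str) -> str | None:
        entry = self.database.get((vpc_id, ip_address))
        return entry.host if entry else None


service = MappingService()
# Host-A registers EC2-A; the MAC address is a made-up placeholder.
service.register(MappingEntry("vpc-1a2b3c4d", "ENI1", "172.31.10.10",
                              "02:00:00:00:00:0a", "Host-A"))
print(service.locate("vpc-1a2b3c4d", "172.31.10.10"))
```

Keying the database by the VPC and the IP address together reflects the tenant separation described above: the same IP address could exist in another VPC without colliding.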
Control-Plane: Instance EC2-B Registration Process by Host-B
The same registration process happens when instance EC2-B is launched on
Host-B. The Mapping Service now knows the location of both EC2 instances.
Figure 1-3: Instance EC2-B Registration Process by Host-B.
Control-Plane: Registration Information Distribution Process by Mapping Service
The public AWS VPC documentation states that the Instance-to-Location (AWS
might not use this term) Mapping Service is responsible for publishing the
information only to those servers that are hosting the particular VPC. In our
example, this means that the Mapping Service will publish the reachability
information of the instance EC2-A to Host-B and the reachability information of
the instance EC2-B to Host-A. Figure 1-4 illustrates the Instance-to-Location
publishing process by the Mapping Service. There is no public documentation of
how this is done. We can, however, assume that the process is not data-driven:
servers do not request the information only when they have data for a
destination hosted by a remote server. Instead, the Mapping Service publishes it
in a push model, without a request from the servers.
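A minimal sketch of the assumed push model is shown below. The class and method names are hypothetical, and Host-C is an extra server I added only to show that a host without instances in the VPC receives nothing.

```python
from collections import defaultdict


class MappingService:
    """Toy push-model Control Plane (an assumed behaviour, not AWS's design)."""

    def __init__(self):
        self.entries = defaultdict(dict)   # vpc_id -> {instance_ip: host}
        self.host_caches = {}              # host -> {(vpc_id, instance_ip): host}

    def attach_host(self, host):
        self.host_caches.setdefault(host, {})

    def register(self, vpc_id, instance_ip, host):
        self.attach_host(host)
        self.entries[vpc_id][instance_ip] = host
        # Hosts that currently run at least one instance of this VPC.
        members = set(self.entries[vpc_id].values())
        # Push every remote mapping of the VPC to every member host, unrequested.
        for member in members:
            for ip, location in self.entries[vpc_id].items():
                if location != member:     # skip the member's own local instances
                    self.host_caches[member][(vpc_id, ip)] = location


service = MappingService()
service.attach_host("Host-C")              # hosts no instance of the VPC
service.register("vpc-1a2b3c4d", "172.31.10.10", "Host-A")
service.register("vpc-1a2b3c4d", "172.31.10.20", "Host-B")
print(service.host_caches)
```

After both registrations, Host-A knows where EC2-B is and Host-B knows where EC2-A is, although neither of them has sent a single data packet or request.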
Control-Plane: Similarities with LISP Based EID Registration Process
The registration of an EC2 instance in AWS VPC and the EID-to-RLOC registration
in a Locator/ID Separation Protocol (LISP) based network, such as Cisco’s
SD-Access/Campus Fabric, follow a similar operating model. Figure 1-5 explains
the LISP model using the same kind of example topology as our AWS VPC example.
When Edge Sw-1 in Building-A learns the Host-B MAC/IP address information (from
GARP/ARP/ingress data), it generates a LISP Map-Register message and sends it
to the Mapping Server (a component of the Mapping System). In our example, the
message carries Mapping Records that describe (A) the LISP Instance-Id (tenant
identifier) to which the host is attached, (B) the host IP address, and (C) how
long the mapping information is valid. The location information (last-hop
router) is encoded in the Locator Record attached to the Mapping Record. In
addition to the location information, the LISP Map-Register message carries
other information, but it is outside the scope of this document. In summary, the
registration of end hosts looks similar in both AWS VPC and LISP based networks.
Figure 1-5: Registration Process By Edge Sw-2.
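The record contents described above can be modelled roughly as follows. This is a simplified sketch and not the on-the-wire format of a Map-Register message. The Instance-Id, the host address, and the TTL are illustrative values I made up, and 192.168.10.2 is the address of Edge Sw-2 used in the Data-Plane example later in this article.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address


@dataclass
class LocatorRecord:
    rloc: str                      # address of the last-hop router (edge switch)


@dataclass
class MappingRecord:
    instance_id: int               # (A) LISP Instance-Id, the tenant identifier
    eid: str                       # (B) host IP address
    ttl_minutes: int               # (C) how long the mapping is valid
    locators: list = field(default_factory=list)


def build_mapping_record(instance_id, host_ip, ttl_minutes, edge_rloc):
    """Return the Mapping Record an edge switch would place in a Map-Register."""
    ip_address(host_ip)            # raises ValueError on a malformed address
    ip_address(edge_rloc)
    return MappingRecord(instance_id, host_ip, ttl_minutes,
                         [LocatorRecord(edge_rloc)])


# Instance-Id 100, the host address, and the TTL are illustrative values only.
record = build_mapping_record(100, "172.16.100.20", 1440, "192.168.10.2")
print(record)
```

Compared with the presumed AWS VPC mapping entry, the Instance-Id plays the role of the VPC identifier and the Locator Record plays the role of the hosting server.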
Control-Plane: Registration Information Distribution Process by Mapping System in LISP
While the end-point registration process in AWS VPC resembles the LISP
end-point registration process, the way location information about Intra-Subnet
hosts is made available to edge devices is different. Edge devices (ingress
Tunnel Routers, iTRs) in the LISP network use LISP Map-Request messages (figure
1-6) to ask the Map-Resolver (a component of the Mapping System) for
EID-to-RLOC information, while the Mapping Service in AWS VPC distributes the
location information to all servers that have active EC2 instances running in
that particular VPC, without a request mechanism. In this sense, the AWS VPC
solution has some similarities with BGP EVPN based networks (mostly used in
datacenters), where VTEP switches advertise the NLRI of connected hosts to BGP
EVPN Route-Reflectors (usually Spine switches), which in turn advertise them to
Route-Reflector Clients (other VTEP switches). The NLRI import policy is based
on the Route-Target (a BGP Extended Community Path-Attribute). Nonetheless,
there is no Mapping System/Service in a BGP EVPN based solution.
The Map-Request process is shown in figure 1-6. Host-B sends data to Host-A
(the ARP process is ignored for simplicity). When Edge Sw-2 receives the data,
it checks its local Mapping-Cache (MC), and because there is no EID-to-RLOC
information in the MC yet, it sends a Map-Request message to the Map-Resolver.
The Map-Resolver checks the Map-Server's EID-to-RLOC mapping database for the
mapping information. The lookup is a hit, and the Map-Resolver responds to Edge
Sw-2 with a Map-Reply message that describes the location of Host-A. Edge Sw-2
stores the information in its local Mapping-Cache and can now send data to
Host-A.
Figure 1-6: LISP Map-Request Process By Edge Sw-2.
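The pull model can be sketched like this. The classes are hypothetical and heavily simplified (the Map-Resolver and Map-Server are collapsed into one object), and the Instance-Id and the Host-A address are made-up values; 192.168.10.1 is the Edge Sw-1 address from the Data-Plane example.

```python
class MapResolver:
    """Toy Mapping System front end answering Map-Requests."""

    def __init__(self, database):
        self.database = database       # (instance_id, eid) -> rloc
        self.requests_seen = 0

    def map_request(self, instance_id, eid):
        self.requests_seen += 1
        return self.database.get((instance_id, eid))   # None models a negative reply


class EdgeSwitch:
    """Toy iTR that resolves destinations on demand."""

    def __init__(self, resolver):
        self.resolver = resolver
        self.map_cache = {}

    def resolve(self, instance_id, eid):
        key = (instance_id, eid)
        if key not in self.map_cache:                  # cache miss: ask the Mapping System
            rloc = self.resolver.map_request(instance_id, eid)
            if rloc is None:
                return None
            self.map_cache[key] = rloc                 # store the Map-Reply
        return self.map_cache[key]


# Instance-Id 100 and the Host-A address are illustrative values only.
resolver = MapResolver({(100, "172.16.100.10"): "192.168.10.1"})
edge_sw2 = EdgeSwitch(resolver)
first = edge_sw2.resolve(100, "172.16.100.10")    # triggers a Map-Request
second = edge_sw2.resolve(100, "172.16.100.10")   # answered from the Mapping-Cache
print(first, second, resolver.requests_seen)
```

Only the first packet toward Host-A costs a round trip to the Mapping System. That is the data-driven behaviour that the assumed AWS VPC push model avoids.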
In theory, the same kind of distribution mechanism, where endpoints are
registered to the Mapping Service and the mapping information is then
distributed to other edge switches without a separate request process, can be
built with LISP. LISP routes can be redistributed from the EID-to-RLOC database
into BGP by the Mapping Server and advertised to all BGP peers. This solution is
used when EID-prefixes are advertised from a LISP site to non-LISP sites via
Proxy Tunnel Routers (PxTR). Figure 1-7 illustrates how it can be used to
redistribute end-point information to Edge switches. The problem with this
approach is that the Control-Plane node (CP) advertises itself as the next hop
for the redistributed routes. This in turn means that the CP will be in the data
path, which is not a preferred model. The question is, should we do everything
that is possible? My opinion is that even though LISP is capable of doing this,
it is far too complex a solution and there is no sense in doing it. Also,
whether or not it works depends on the vendor device and the code running on it.
Some devices do the RIB lookup first and then the LISP Mapping-Cache lookup,
while other devices work the opposite way. If the EID location information is
received via both LISP and BGP, it does matter which table is checked first. If
the LISP Mapping-Cache is checked first and the lookup result is negative, it
triggers the LISP Map-Request process.
Figure 1-7: LISP to BGP redistribution By Mapping Server.
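The effect of the lookup order can be shown with a toy model. It assumes that a device falls through to the second table when the first one has no entry, which is my simplification and not a description of any particular vendor's implementation. The EID and the next-hop name are placeholders.

```python
def lookup(eid, rib, map_cache, map_cache_first):
    """Return (table that answered, next hop, whether a Map-Request was triggered)."""
    tables = [("map-cache", map_cache), ("rib", rib)]
    if not map_cache_first:
        tables.reverse()
    map_request_sent = False
    for name, table in tables:
        if eid in table:
            return name, table[eid], map_request_sent
        if name == "map-cache":
            map_request_sent = True    # a negative cache lookup triggers a Map-Request
    return None, None, map_request_sent


# The EID was learned via BGP only; the CP advertised itself as the next hop.
rib = {"172.16.100.10": "control-plane-node"}
map_cache = {}

rib_first = lookup("172.16.100.10", rib, map_cache, map_cache_first=False)
cache_first = lookup("172.16.100.10", rib, map_cache, map_cache_first=True)
print(rib_first)
print(cache_first)
```

With the RIB checked first, the redistributed BGP route is used silently. With the Mapping-Cache checked first, the same packet also triggers a Map-Request, so the two device types behave differently with identical routing information.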
Control-Plane: Conclusion
While the registration processes used in AWS VPC and LISP based networks are
similar, the method of distributing the mapping information to Edge devices
(switches/servers) is different. In AWS VPC, the endpoint location is registered
to the Mapping Service, from where it is distributed to servers. In the LISP
based model, the EID is registered to the Mapping Server and then requested by
an iTR when needed.
Data-Plane: AWS VPC - VPC tunneling
There is no detailed public information available about the AWS VPC Data-Plane
encapsulation process. The AWS VPC public documentation, however, states that
the VPC Tag is carried within the VPC header. In addition, we can assume that
information about the Elastic Network Interface (ENI) is also included in the
VPC header. The instance identifier might be carried within the VPC tunnel
header as well, but this assumption is not based on any documentation. Note that
some documentation states that the Underlay Network in AWS Availability Zones
uses IPv6 addressing.
Figure 1-8 illustrates the event where the instance EC2-A on Host-A sends an
ICMP-Request message to instance EC2-B running on Host-B. At this phase, the
instance registration, the information distribution, and the ARP processes of
the instances are done. When the Soft-Switch receives the ICMP-Request message
to a non-local destination, it does a location cache lookup to find the
forwarding information. It constructs the tunnel header by using the IP address
of Host-B (2001:db8:0:100:2) as the destination IPv6 address in the outer IPv6
header and its own IPv6 address (2001:db8:0:100:1) as the source IPv6 address.
It then adds the protocol and port information to the Transport layer header.
There is also a VPC header that is used to transport instance-specific
information. The VPC-Id/Tag identifies the tenant to which the instance belongs,
and the ENI defines the location inside the hosting server. The original ICMP
message is wrapped inside the tunnel headers, and the message is then sent to
the backbone routers. The backbone routers forward the packet based on the outer
IPv6 header information.
Figure 1-8: VPC Encapsulation Process.
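The encapsulation steps can be sketched as below. Since the real header layout is not public, every field and name here is an assumption made for illustration. The Transport layer header is left out because its protocol and port are unknown, and the host addresses are written as 2001:db8:0:100::1 and 2001:db8:0:100::2, which is my assumed well-formed notation for the addresses in the example.

```python
from dataclasses import dataclass
from ipaddress import IPv6Address


@dataclass(frozen=True)
class VpcHeader:                 # presumed fields; the real layout is not public
    vpc_id: str                  # tenant the instance belongs to
    eni_id: str                  # location inside the destination server


@dataclass(frozen=True)
class TunnelPacket:
    outer_src: IPv6Address       # address of the sending server
    outer_dst: IPv6Address       # address of the server hosting the destination
    vpc_header: VpcHeader
    inner_payload: bytes         # the original, unmodified instance packet


def encapsulate(payload, vpc_id, dst_ip, local_host_addr, location_cache):
    """Wrap an instance's packet using the cached location of the destination."""
    remote_host_addr, remote_eni = location_cache[(vpc_id, dst_ip)]
    return TunnelPacket(IPv6Address(local_host_addr),
                        IPv6Address(remote_host_addr),
                        VpcHeader(vpc_id, remote_eni),
                        payload)


# Location cache on Host-A as populated by the Mapping Service (assumed format).
location_cache = {("vpc-1a2b3c4d", "172.31.10.20"): ("2001:db8:0:100::2", "ENI2")}
packet = encapsulate(b"ICMP echo request", "vpc-1a2b3c4d", "172.31.10.20",
                     "2001:db8:0:100::1", location_cache)
print(packet)
```

The backbone routers would only ever look at outer_src and outer_dst. The VPC header matters only to the receiving server, which uses it to pick the right tenant and ENI.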
Data-Plane: LISP Fabric - VXLAN tunneling
Figure 1-9 illustrates the same process, where Host-A sends an ICMP-request
message to Host-B. Edge Sw-1 receives the message and encapsulates it based on
the information found in the local LISP Mapping-Cache. It sets the IP address of
the last-hop router Edge Sw-2 (192.168.10.2) as the destination IP address in
the outer IP header, and it uses its own IP address (192.168.10.1) as the source
address. VXLAN uses UDP in the Transport Layer, where the destination UDP port
4789 indicates that the next header is a VXLAN header. In this example, the
VXLAN header also has a Group-Based Policy (GBP) extension, which carries the
Scalable Group Tag (SGT) information in the Group Policy-Id field. The original
ICMP request is wrapped inside these tunnel headers. After the encapsulation
process, Edge Sw-1 sends the encapsulated ICMP-request message to the backbone
routers, which make their forwarding decision based on the destination IP
address in the outer IP header.
Figure 1-9: VXLAN Encapsulation Process.
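To make the VXLAN part concrete, the 8-byte VXLAN header with the Group Policy extension can be packed with Python's struct module. The bit layout follows the VXLAN Group Policy Option draft as I understand it (G and I flags in the first byte, a 16-bit Group Policy ID, and a 24-bit VNI). The function names are my own, and VNI 8190 is a made-up value, while SGT 77 comes from the example.

```python
import struct

VXLAN_UDP_PORT = 4789   # destination UDP port that announces a VXLAN header
FLAG_G = 0x80           # Group Policy ID field is valid
FLAG_I = 0x08           # VNI field is valid


def pack_vxlan_gbp(vni, group_policy_id):
    """Build the 8-byte VXLAN header carrying an SGT in the Group Policy ID."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= group_policy_id < 2**16:
        raise ValueError("Group Policy ID must fit in 16 bits")
    # flags (8) | policy flags (8) | Group Policy ID (16) | VNI (24) | reserved (8)
    return struct.pack("!BBHI", FLAG_G | FLAG_I, 0, group_policy_id, vni << 8)


def unpack_vxlan_gbp(header):
    """Decode the fields that matter for forwarding and policy."""
    flags, _, group_policy_id, vni_field = struct.unpack("!BBHI", header)
    return {"gbp": bool(flags & FLAG_G), "vni": vni_field >> 8,
            "sgt": group_policy_id}


header = pack_vxlan_gbp(vni=8190, group_policy_id=77)   # VNI 8190 is a made-up value
print(header.hex())
print(unpack_vxlan_gbp(header))
```

The point to notice is that the SGT travels inside the tunnel header itself, so the egress switch can enforce policy without any lookup toward the source.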
Endpoint Security: Security Group and Scalable Group Tag
The left-hand side in figure 1-10
represents the AWS VPC Availability Zone and the right-hand side represents
Campus Fabric.
When an EC2 instance is launched on a host, you can attach either the default
or a modified Security Group (SG) to it. In our example, there is an SG
sg-nwktimes attached to instance EC2-B, which permits inbound HTTP connections
from instance EC2-B (172.31.10.20) only. Also, there is an outbound rule that
permits HTTP connections to destinations that belong to Security Group
sg-website. These rules are stateful, meaning that return traffic is
automatically allowed for both inbound and outbound connections. This solution
supports both macro-segmentation and micro-segmentation (segmentation within a
subnet). The Security Group information is not carried within VPC encapsulated
data packets.
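What "stateful" means here can be illustrated with a toy filter. It is a deliberately simplified sketch with hypothetical names and does not describe how AWS implements Security Groups. The source address 172.31.10.20 is the EC2-B address from the example environment.

```python
class SecurityGroup:
    """Toy stateful filter: a permitted connection implicitly permits its replies."""

    def __init__(self, inbound_rules):
        self.inbound_rules = inbound_rules   # set of (source_ip, destination_port)
        self.connections = set()             # tracked (client_ip, server_port) pairs

    def allow_inbound(self, source_ip, destination_port):
        if (source_ip, destination_port) in self.inbound_rules:
            self.connections.add((source_ip, destination_port))
            return True
        return False

    def allow_reply(self, client_ip, server_port):
        # No outbound rule is consulted: state from the inbound request is enough.
        return (client_ip, server_port) in self.connections


sg = SecurityGroup({("172.31.10.20", 80)})          # inbound HTTP from EC2-B only
request_ok = sg.allow_inbound("172.31.10.20", 80)   # matches the inbound rule
reply_ok = sg.allow_reply("172.31.10.20", 80)       # allowed by connection state
stray_ok = sg.allow_reply("172.31.10.30", 80)       # no matching connection
print(request_ok, reply_ok, stray_ok)
```

The reply is accepted even though the sketch contains no outbound rule at all, which is exactly the convenience a stateful rule set gives you.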
In Campus Fabric, you can use the Scalable Group Tag (SGT) for macro- and
micro-segmentation. The Campus Fabric example shows the SGT matrix, where the
vertical SGTs represent source SGTs and the horizontal SGTs define the
destination SGTs. SGT 77 is attached to Host-A, and all traffic is allowed
between hosts that have this same SGT 77 as their identifier. In addition,
HTTP/HTTPS connections from Host-A with SGT 77 are allowed to hosts that are
identified with SGT 22. Note that the SGT information is carried within the
VXLAN header in the Group Policy-Id field. The SGT based segmentation model is
stateless by nature, so traffic has to be allowed in both directions separately.
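The SGT matrix from the example can be written as a small lookup table. The dictionary layout and the function name are my own choices, and the service names are plain labels rather than real ACL syntax.

```python
ANY = "any"

# (source SGT, destination SGT) -> permitted services, mirroring the example matrix.
sgt_matrix = {
    (77, 77): {ANY},
    (77, 22): {"http", "https"},
}


def permitted(src_sgt, dst_sgt, service):
    """Stateless check: every direction needs its own matrix entry."""
    allowed = sgt_matrix.get((src_sgt, dst_sgt), set())
    return ANY in allowed or service in allowed


print(permitted(77, 22, "http"))   # Host-A (SGT 77) to a host with SGT 22
print(permitted(22, 77, "http"))   # return direction has no entry in this sketch
print(permitted(77, 77, "ssh"))    # same SGT, all traffic allowed
```

Because nothing tracks connections, the return direction from SGT 22 to SGT 77 is denied until a matching entry is added, in contrast to the stateful Security Group behaviour.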
References
[RFC 6830] D. Farinacci et al., “The Locator/ID Separation Protocol (LISP)”,
RFC 6830, January 2013.
[RFC 6833] V. Fuller and D. Farinacci, “Locator/ID Separation Protocol (LISP)
Map-Server Interface”, RFC 6833, January 2013.
[LISP Control Plane] D. Farinacci et al., “The Locator/ID Separation Protocol
(LISP) Control Plane”, draft-ietf-lisp-rfc6833bis-25, June 16, 2019.
What Is Amazon VPC?
https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html
AWS re:Invent 2015: Another Day, Another Billion Packets (NET403)
https://www.youtube.com/watch?v=3qln2u1Vr2E
AWS re:Invent 2017: Another Day, Another Billion Flows (NET405)
https://www.youtube.com/watch?v=8gc2DgBqo9U
Toni Pasanen, “LISP Control-Plane in Campus Fabric: A Practical Guide to
Understand the Operation of Campus Fabric”, 17 February 2020,
ISBN-13: 979-8615059186.
I haven't had time to write for a while, but I have ideas for a new post series. Stay tuned :)
Hi Toni,
I just finished watching the AWS re:Invent 2017: Another Day, Another Billion Flows (NET405) video and straight away I thought 'wait a minute, I see some similarities with LISP on a high level'. Then I googled aws vpc lisp, and I ended up reading your post :-) Good stuff as always