Tuesday, 13 June 2023

NVA Part V: NVA Redundancy with Azure Internal Load Balancer - On-Prem Connec

 Introduction


In Chapter Five, we deployed an internal load balancer (ILB) in the vnet-hub. It was attached to the subnet 10.0.1.0/24, where it obtained the frontend IP (FIP) 10.0.1.6. Next, we created a backend pool and associated our NVAs with it. Finally, we bound the frontend IP 10.0.1.6 to the backend pool to complete the ILB setup.


Next, in vnet-spoke1, we created a route table called rt-spoke1. This route table contained a user-defined route (UDR) for 10.2.0.0/24 (vnet-spoke2) with the next-hop set as 10.0.1.6. We attached this route table to the subnet 10.1.0.0/24. Similarly, in vnet-spoke2, we implemented a user-defined route for 10.1.0.0/24 (vnet-spoke1). By configuring these UDRs, we ensured that the spoke-to-spoke traffic would pass through the ILB and one of the NVAs on vnet-hub. Note that in this design, the Virtual Network Gateway is not required for spoke-to-spoke traffic.


In this chapter, we add a Virtual Network Gateway (VGW) to the topology and establish an IPsec VPN connection between the on-premises network edge router and the VGW. Additionally, we deploy a new route table called "rt-gw-snet", where we add routing entries to the spoke VNets with the next-hop IP address 10.0.1.6 (the ILB's frontend IP). We also add a routing entry 10.3.0.0/16 > 10.0.1.6 to the existing route tables on vnet-spoke-1 and vnet-spoke-2 (not shown in Figure 6-1). This configuration ensures that the spoke-to-spoke and spoke-to-on-prem flows are directed through one of the Network Virtual Appliances (NVAs) via the ILB. The NVAs use the default route table, where the VGW propagates all the routes learned from VPN peers. However, we do not propagate routes from the default route table into the "rt-gw-snet" and "rt-prod-1" route tables. To enable the spoke VNets to use the VGW on the hub VNet, we allow it in the VNet peering configurations.
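
The route tables described above can be sketched as a small longest-prefix-match lookup. This is only an illustration: the dictionary layout and the lookup() helper are hypothetical, and the spoke prefixes in rt-gw-snet are assumed to be the spoke subnets 10.1.0.0/24 and 10.2.0.0/24 mentioned earlier.

```python
import ipaddress

ILB_FIP = "10.0.1.6"  # frontend IP of the ILB, used as the next hop

# Hypothetical in-memory model of the user-defined routes (prefix -> next hop).
ROUTE_TABLES = {
    "rt-gw-snet": {"10.1.0.0/24": ILB_FIP, "10.2.0.0/24": ILB_FIP},
    "rt-spoke1": {"10.2.0.0/24": ILB_FIP, "10.3.0.0/16": ILB_FIP},
}


def lookup(table_name, destination):
    """Return the next hop of the most specific matching UDR, or None."""
    dst = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in ROUTE_TABLES[table_name].items():
        network = ipaddress.ip_network(prefix)
        if dst in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


# Traffic from on-prem to vm-prod-1 (10.1.0.4) entering via the gateway subnet.
print(lookup("rt-gw-snet", "10.1.0.4"))
```

A result of None means that no UDR matches, in which case the default system routes of the subnet would apply.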


  1. The administrator of the mgmt-pc opens an SSH session to vm-prod-1. The connection initiation begins with the TCP three-way handshake. The TCP SYN message is transmitted over the VPN connection to the Virtual Network Gateway (VGW) located on the vnet-hub. Upon receiving the message, the VGW first decrypts it and performs a routing lookup. The destination IP address, 10.1.0.4, matches the highlighted routing entry in the route table rt-gw-snet.
  2. The VGW determines the location (the IP address of the hosting server) of 10.0.1.6, encapsulates the message with tunnel headers, and forwards it to the Internal Load Balancer (ILB) using the destination IP address 10.0.1.6 in the tunnel header.
  3. The Internal Load Balancer receives the TCP SYN message. As the destination IP address in the tunnel header matches one of its frontend IPs, the ILB decapsulates the packet. It then checks which backend pool (BEP) is associated with the frontend IP (FIP) 10.0.1.6 to determine to which VMs it can forward the TCP SYN message. Using a hash algorithm (in our example, the 5-tuple), the ILB selects a VM from the backend pool members, in this case, NVA2. The ILB performs a location lookup for the IP address 10.0.1.5, encapsulates the TCP SYN message with tunnel headers, and finally sends it to NVA2.
  4. The message reaches the hosting server of NVA2, which removes the encapsulation since the destination IP in the tunnel header belongs to itself. Based on the SYN flag set in the TCP header, the packet is identified as the first packet of the flow, so there is no flow entry for this connection in the Generic Flow Table (GFT). The parser component generates metadata from the L3 and L4 headers of the message, which is then processed by the Virtual Filtering Platform (VFP) layers associated with NVA2. Following the VFP processing, the TCP SYN message is passed to NVA2, and the GFT is updated with flow information and associated actions (Allow and Encapsulation instructions). In addition, the VFP process creates a corresponding GFT entry for the return packets (reversed source and destination IPs and ports). Please refer to the first chapter for more detailed information on VFP processes.
  5. We do not have any pre-routing or post-routing policies configured on either NVA. As a result, NVA2 simply routes the traffic out of the eth0 interface based on its routing table. The ingress TCP SYN message has already been processed by the VFP layers, and the GFT has been updated accordingly. Consequently, the egress packet can be forwarded based on the GFT without the need for additional processing by the VFP layers.
  6. Subsequently, the encapsulated TCP SYN message is transmitted over VNet peering to vm-prod-1, located on vnet-spoke-1. Upon reaching the hosting server of vm-prod-1, the packet is processed in a similar manner as we observed with NVA. The encapsulation is removed, and the packet undergoes the same VFP processing steps as before.
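
The backend selection in step 3 can be illustrated with a short sketch. Azure does not publish the exact hash function of the load balancer, so SHA-256 below is only a stand-in, and the select_backend() helper and the sample addresses are hypothetical. The point is that every packet of a flow with the same 5-tuple maps to the same NVA.

```python
import hashlib

BACKEND_POOL = ["NVA1", "NVA2"]  # members of the ILB backend pool


def select_backend(src_ip, src_port, dst_ip, dst_port, protocol, pool=BACKEND_POOL):
    """Pick a backend pool member from a hash of the flow's 5-tuple.

    SHA-256 is a stand-in for the undisclosed hash used by the platform.
    """
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


# Hypothetical SSH flow from an on-prem host in 10.3.0.0/16 to vm-prod-1.
chosen = select_backend("10.3.0.4", 50000, "10.1.0.4", 22, "tcp")
print(chosen)
```

Because the selection depends only on the 5-tuple, a new connection from the same host with a different source port may land on the other NVA.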


Figure 6-1: ILB Example Topology.

Tuesday, 6 June 2023

NVA Part IV: NVA Redundancy with Azure Internal Load Balancer

Introduction

To achieve active/active redundancy for a Network Virtual Appliance (NVA) in a Hub-and-Spoke VNet design, we can utilize an Internal Load Balancer (ILB) to enable Spoke-to-Spoke traffic.

Figure 5-1 illustrates our example topology, which consists of a vnet-hub and spoke VNets. The ILB is associated with the subnet 10.0.1.0/24, where we allocate a Frontend IP address (FIP) using dynamic or static methods. Unlike a public load balancer's inbound rules, we can choose the High-Availability (HA) ports option to load balance all TCP and UDP flows. The backend pool and health probe configurations remain the same as those used with a Public Load Balancer (PLB).

From the NVA perspective, the configuration is straightforward. We enable IP forwarding in the Linux kernel and on the virtual NIC, but we do not configure pre-routing (destination NAT) policies. We can use post-routing policies (source NAT) if we want to hide real IP addresses or if symmetric traffic paths are required. To route egress traffic from spoke sites to the NVAs via the ILB, we create subnet-specific route tables in the spoke VNets. The "rt-spoke1" route table has the specific entry "10.2.0.0/24 > 10.0.1.6 (ILB)" instead of a default route because vm-prod-1 has a public IP address used for external access. If we were to set the default route, as we have in the subnet 10.2.0.0/24 in "vnet-spoke2", the external connection would fail.

Figure 5-1: ILB Example Topology.

Saturday, 20 May 2023

NVA Part III: NVA Redundancy – Connection from the Internet

This chapter is the first part of a series on Azure's highly available Network Virtual Appliance (NVA) solutions. It explains how we can use load balancers to achieve active/active NVA redundancy for connections initiated from the Internet.

In Figure 4-1, Virtual Machine (VM) vm-prod-1 uses the load balancer's Frontend IP address 20.240.9.27 to publish an application (SSH connection) to the Internet. Vm-prod-1 is located behind an active/active NVA FW cluster. Vm-prod-1 and NVAs have vNICs attached to the subnet 10.0.2.0/24.

Both NVAs have identical Pre- and Post-routing policies. If the ingress packet's destination IP address is 20.240.9.27 (load balancer's Frontend IP) and the transport layer protocol is TCP, the policy changes the destination IP address to 10.0.2.6 (vm-prod-1). Additionally, before routing the packet through the Ethernet 1 interface, the Post-routing policy replaces the original source IP with the IP address of the egress interface Eth1.
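
The two policies can be sketched as plain functions that rewrite a packet's addresses. This is a simplified model, not the NVA's actual configuration: the Packet type, the function names, the remote host address 203.0.113.10, and the Eth1 address 10.0.2.4 are all hypothetical.

```python
from dataclasses import dataclass, replace

FRONTEND_IP = "20.240.9.27"  # load balancer's Frontend IP
VM_PROD_1 = "10.0.2.6"       # real address of vm-prod-1


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    protocol: str


def pre_routing(pkt):
    """Destination NAT: Frontend IP + TCP is rewritten to vm-prod-1."""
    if pkt.dst_ip == FRONTEND_IP and pkt.protocol == "tcp":
        return replace(pkt, dst_ip=VM_PROD_1)
    return pkt


def post_routing(pkt, egress_ip):
    """Source NAT: replace the source with the egress interface address."""
    return replace(pkt, src_ip=egress_ip)


# Hypothetical remote host and hypothetical Eth1 address of the NVA.
ingress = Packet("203.0.113.10", FRONTEND_IP, "tcp")
egress = post_routing(pre_routing(ingress), egress_ip="10.0.2.4")
print(egress)
```

Since vm-prod-1 sees the NVA's Eth1 address as the source, its reply returns to the same NVA that handled the request, which keeps the path symmetric.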

The second vNICs of the NVAs are connected to the subnet 10.0.1.0/24. We have associated these vNICs with the load balancer's backend pool. The Inbound rule binds the Frontend IP address to the Backend pool and defines the load-sharing policies. In our example, the packets of SSH connections from the remote host to the Frontend IP are distributed between NVA1 and NVA2. The Inbound rule also specifies which Health Probe is used to monitor the backend pool members.

Note! Using a single VNet design eliminates the need to define static routes in the subnet-specific route table and the VM's Linux kernel. This solution is suitable for small-scale implementations. However, the Hub-and-Spoke VNet topology offers simplified network management, enhanced security, scalability, performance, and hybrid connectivity. I will explain how to achieve NVA redundancy in the Hub-and-Spoke VNet topology in upcoming chapters.



Figure 4-1: Example Diagram. 

Tuesday, 11 April 2023

NVA Part II - Internet Access with a single NVA

Introduction

In the previous chapter, you learned how to route east-west traffic through the Network Virtual Appliance (NVA) using subnet-specific route tables with User Defined Routes (UDR). This chapter introduces how to route north-south traffic between the Internet and your Azure Virtual Network through the NVA.

Figure 3-1 depicts our VNet setup, which includes DMZ and Web Tier zones. The NVA, vm-nva-fw, is connected to subnet snet-north (10.0.2.0/24) in the DMZ via a vNIC with Direct IP (DIP) 10.0.2.4. We've also assigned a public IP address, 51.12.90.63, to this vNIC. The second vNIC is connected to subnet snet-west (10.0.0.0/24) in the Web Tier, with DIP 10.0.0.5. We have enabled IP Forwarding in both vNICs and Linux kernel. We are using Network Security Groups (NSGs) for filtering north-south traffic.

Our web server, vm-west, has a vNIC with DIP 10.0.0.4 that is connected to the subnet snet-west in the Web Tier. We have associated the route table to the subnet with the UDR, which forwards traffic to destination IP 141.192.166.81 (remote host) to NVA. To publish the web server to the internet, we've used the public IP of NVA. 

On the NVA, we have configured a Destination NAT rule that rewrites the destination IP address to 10.0.0.4 for packets with the source IP address 141.192.166.81 and protocol ICMP. To simulate an HTTP connection, we're using ICMP requests from a remote host.


Figure 3-1: Example Diagram and.

Monday, 3 April 2023

Routing in Azure Subnets

Introduction

Subnets, aka Virtual Local Area Networks (VLANs) in traditional networking, are Layer-2 broadcast domains that enable attached workloads to communicate without crossing a Layer-3 boundary, the subnet Gateway. Hosts sharing the same subnet resolve each other’s MAC-IP address binding using Address Resolution Protocol, which relies on Broadcast messages. That is why we often use the Failure domain definition with subnets. We can spread subnets between physical devices over Layer-2 links using VLAN tagging, defined in the IEEE 802.1Q standard. Besides, tunnel encapsulation solutions supporting a tenant/context identifier enable us to extend subnets over Layer-3 infrastructure. Virtual eXtensible LAN (VXLAN) using VXLAN Network Identifier (VNI) and Network Virtualization using Generic Routing Encapsulation (NVGRE) using Tenant Network ID (TNI) are examples of Network Virtualization Over Layer 3 (NVO) solutions. If you have to spread the subnet over an MPLS-enabled network, you can choose to implement Virtual Private LAN Service (VPLS) or Virtual Private Wire Service (VPWS), among other solutions.

In Azure, the concept of a subnet is different. You can think about it as a logical domain within a Virtual Network (VNet), where attached VMs share the same IP address space and use the same shared routing policies. Broadcast and Multicast traffic is not natively supported in Azure VNet. However, you can use a cloudSwXtch VM image from swXtch.io to build a Multicast-enabled overlay network within VNet. 

Default Routing in Virtual Network

This section demonstrates how the routing between subnets within the same Virtual Network (VNet) works by default. Figure 2-1 illustrates our example Azure VNet setup where we have deployed two subnets. The interface eth0 of vm-west and interface eth1 of vm-nva-fw are attached to subnet snet-west (10.0.0.0/24), while interface eth2 of vm-nva-fw and interface eth0 of vm-east are connected to subnet snet-east (10.0.1.0/24). All three VMs use the VNet default routing policy, which routes Intra-VNet data flows directly between the source and destination endpoint, regardless of which subnets they are connected to. Besides, the Network Security Groups (NSGs) associated with vNICs share the same default security policies, which allow inbound and outbound Intra-VNet data flows, InBound flows from the Load Balancer, and OutBound Internet connections.

Now let’s look at what happens when vm-west (DIP: 10.0.0.4) pings vm-east (DIP: 10.0.1.4), recapping the operation of VFP. Note that Accelerated Networking (AccelNet) is not enabled on either VM.

  1. The VM vm-west sends an ICMP Request message to vm-east. The packet arrives at the Virtual Filtering Platform (VFP) for processing. Since this is the first packet of the flow, the Flow Identifier and associated Actions are not in the Unified Flow Table (UFT). The Parser component extracts the 5-tuple header information (source IP, source port, destination IP, destination port, and transport protocol) as metadata from the original packet. The metadata is then processed in each VFP layer to generate a flow-based entry in the UFT.
  2. The destination IP address matches the Network Security Group's (NSG) default outbound rule, which allows Intra-VNet flows. Then the metadata is passed on to the routing process. Since we haven't yet deployed subnet-specific route tables, the result of the next-hop route lookup is 3.3.3.3, the Provider Address (PA) of Host-C.
  3. Intra-VNet connections use private IP addresses (DIP-Direct IP), and the VFP process bypasses the NAT layer. The VNet layer, responsible for encapsulation/decapsulation, constructs tunnel headers (IP/UDP/VXLAN). It creates the outer IP address with the source IP 1.1.1.1 (Host-A) and destination IP 3.3.3.3 (Host-C), resolved by the Routing layer. Besides, it adds Virtual Network Identifier (VNI) into the VXLAN header.
  4. After each layer has processed the metadata, the result is encoded to Unified Flow Table (UFT) with Flow-Id with push action (Encapsulation). 
  5. The Header Transposition engine (HT) modifies the original packet based on the UFT actions. It adds tunnel headers leaving all original header information intact. Finally, the modified packet is transmitted to the upstream switch. The subsequent packets are forwarded based on the UFT.
  6. The Azure switching infra forwards the packet based on the destination IP address on the outer IP header (tunnel header).
  7. The VFP on Host-C processes the ingress ICMP Request message in the same manner as VFP in Host-A but in reversed order starting with decapsulation in the VNet layer.
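
The slow-path and fast-path behaviour in the steps above can be sketched as follows. This is a conceptual model only, not VFP code: the Vfp class, its method names, the location map, and the VNet address space 10.0.0.0/16 are assumptions, and the ICMP flow uses port 0 as a placeholder because ICMP has no ports.

```python
import ipaddress
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


VNET_CIDR = ipaddress.ip_network("10.0.0.0/16")  # assumed VNet address space
LOCAL_PA = "1.1.1.1"                             # Provider Address of Host-A
LOCATION_MAP = {"10.0.1.4": "3.3.3.3"}           # vm-east is hosted on Host-C


class Vfp:
    def __init__(self):
        self.uft = {}            # Unified Flow Table: flow -> actions
        self.slow_path_runs = 0  # how many packets went through the layers

    def _acl_layer(self, flow):
        # Default NSG behaviour modelled here: allow Intra-VNet flows only.
        inside = all(ipaddress.ip_address(ip) in VNET_CIDR
                     for ip in (flow.src_ip, flow.dst_ip))
        return "allow" if inside else "deny"

    def _vnet_layer(self, flow):
        # Push action: outer source and destination Provider Addresses.
        return ("encap", LOCAL_PA, LOCATION_MAP[flow.dst_ip])

    def process(self, flow):
        if flow not in self.uft:  # first packet of the flow: slow path
            self.slow_path_runs += 1
            actions = [self._acl_layer(flow)]
            if actions[0] == "allow":
                actions.append(self._vnet_layer(flow))
            self.uft[flow] = tuple(actions)
        return self.uft[flow]     # subsequent packets: fast path


vfp = Vfp()
ping = FiveTuple("10.0.0.4", 0, "10.0.1.4", 0, "icmp")
print(vfp.process(ping))
print(vfp.process(ping), vfp.slow_path_runs)
```

The second call returns the cached actions without touching the layers, which is the switchover from slow path to fast path.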

Figure 2-1: Example Topology Diagram.

Wednesday, 22 March 2023

Chapter 1: Azure VM networking – Virtual Filtering Platform and Accelerated Networking

Note! This post is under technical review.

Introduction


Virtual Filtering Platform (VFP) is Microsoft’s cloud-scale software switch operating as a virtual forwarding extension within a Hyper-V basic vSwitch. The forwarding logic of the VFP uses a layered policy model based on policy rules in a Match-Action Table (MAT). VFP works on the data plane, while complex control plane operations are handed over to centralized control systems. The VFP includes several layers, including VNET, NAT, ACL, and Metering layers, each with dedicated controllers that program policy rules to the MAT using southbound APIs. The first packet of the inbound/outbound data flow is processed by VFP. The process updates match-action table entries in each layer, which then are copied into the Unified Flow Table (UFT). Subsequent packets are then switched based on the flow-based action in UFT. However, if the Virtual Machine is not using Accelerated Networking (AccelNet), all packets are still forwarded over the software switch, which requires CPU cycles. Accelerated Networking reduces the host’s CPU burden and provides a higher packet rate with a more predictable jitter by switching the packet using the hardware NIC while still relying on VFP from the traffic policy perspective.


Hyper-V Extensible Virtual Switch


Microsoft’s extensible vSwitch running on Hyper-V operates as a Network Virtualization Service Provider (NetVSP) for Virtual Machines. VMs, in turn, are Network Virtualization Service Consumers (NetVSC). When a VM starts, it requests the Hyper-V virtualization stack to connect to the vSwitch. The virtualization stack creates a virtual Network Interface (vNIC) for the VM and associates it with the vSwitch. The vNIC is presented to the VM as a physical network adapter. The communication channel between VM and vSwitch uses a synthetic data path Virtual Machine Bus (VMBus), which provides a standardized interface for VMs to access physical resources on the host machine. It helps ensure that virtual machines have consistent performance and can access resources in a secure and isolated manner.


Virtual Filtering Platform - VFP


A Virtual Filtering Platform (VFP) is Microsoft’s cloud-scale virtual switch operating as a virtual forwarding extension within a Hyper-V basic vSwitch. VFP sits in the data path between virtual ports facing the virtual machines and default vPort associated with physical NIC. VFP uses VM’s vPort-specific layers for filtering traffic to and from VM. A layer in the VFP is a Match-Action Table (MAT) containing policy rules programmed by independent, centralized controllers. The packet is processed through the VFP layers if it’s an exception packet, i.e., no Unified Flow entry (UF) in the Unified Flow Table (UFT), or if it’s the first packet of the flow (TCP SYN packet). When a Virtual Machine initiates a new connection, the first packet of the data flow is stored in the Received Queue (RxQ). The Parser component on VFP then takes the L2 (Ethernet), L3 (IP), and L4 (Protocol) header information as metadata, which is then processed through the layer policies in each VFP layer. The VFP layers involved in packet processing depend on the flow destination and the Azure services associated with the source/destination VM. 

VNet-to-Internet traffic from a VM using a Public IP


The metering layer measures traffic for billing. It is the first layer for VM’s outgoing traffic and the last layer for incoming traffic, i.e., it processes only the original ingress/egress packets ignoring tunnel headers and other header modifications (Azure does not charge you for overhead bytes caused by the tunnel encapsulation). Next, the ACL layer runs the metadata through the NSG policy statements. If the source/destination IP addresses (L3 header group) and protocol, source/destination ports (L4 header group) match one of the allowing policy rules, the traffic is permitted (action#1: Allow). After ACL layer processing, the routing process intercepts the metadata. Because the destination IP address in the L3 header group matches only with the default route (0.0.0.0/0, next-hop Internet), the metadata is handed over to Server Load Balancing/Network Address Translation (SLB/NAT) layer. In this example, a public IP is associated with VM’s vNIC, so the SLB/NAT layer translates the private source IP to the public IP (action#2: Source NAT). The VNet layer is bypassed if both source and destination IP addresses are from the public IP space. When the metadata is processed by each layer, the results are programmed into the Unified Flow Table (UFT). Each flow is identified with a unique Unified Flow Identifier (UFID) - hash value calculated from the flow-based 5-tuple (source/destination IP, Protocol, Source Port, Destination Port). The UFID is also associated with the actions Allow and Source NAT. The Header Transposition (HT) engine then takes the original packet from the RxQ and modifies its L2/L3/L4 header groups as described in the UFT. It changes the source private IP to public IP (Modify) and moves the packet to TxQ. The subsequent packets of the flow are modified by the HT engine based on the existing UFT entry without running related metadata through the VFP layers (slow-path to fast-path switchover). 

Besides the outbound flow entry, the VFP layer processes generate an inbound flow entry for the same connection but with reversed 5-tuple (source/destination addresses and protocol ports in reversed order) and actions (destination NAT instead of source NAT). These outbound and inbound flows are then paired and seen as a connection, enabling the Flow State Tracking process where inactive connections can be deleted from the UFT. For example, the Flow State Machine tracks the TCP RST flag. Let’s say that the destination endpoint sets the TCP RST flag in the L4 header. The TCP state machine notices it and removes the inbound flow together with its paired outbound flow from the UFT. Besides, the TCP state machine tracks the TCP FIN/FIN ACK flags and the TIME_WAIT state (after a TCP FIN, the connection is kept alive for a maximum of 2 x Max Segment Lifetime to wait for delayed or retransmitted packets).
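
The flow pairing and the RST-driven cleanup can be sketched like this. It is a conceptual model under stated assumptions: the FlowTable class and its methods are hypothetical, the addresses are placeholders, and the inbound key is assumed to carry the public IP as its destination because that is what a reply from the Internet would target before destination NAT.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


class FlowTable:
    """Toy Unified Flow Table that stores flows as connection pairs."""

    def __init__(self):
        self.actions = {}  # flow -> actions
        self.pair = {}     # flow -> its paired flow

    def add_connection(self, outbound, public_ip):
        # Inbound entry: reversed addresses and ports, public IP as target.
        inbound = FiveTuple(outbound.dst_ip, outbound.dst_port,
                            public_ip, outbound.src_port, outbound.protocol)
        self.actions[outbound] = ("allow", ("snat", public_ip))
        self.actions[inbound] = ("allow", ("dnat", outbound.src_ip))
        self.pair[outbound] = inbound
        self.pair[inbound] = outbound
        return inbound

    def on_tcp_flags(self, flow, flags):
        # A TCP RST on either flow removes both flows of the connection.
        if "RST" in flags and flow in self.actions:
            peer = self.pair.pop(flow)
            self.pair.pop(peer, None)
            self.actions.pop(flow, None)
            self.actions.pop(peer, None)


table = FlowTable()
outbound = FiveTuple("10.0.0.4", 50000, "198.51.100.7", 22, "tcp")
inbound = table.add_connection(outbound, public_ip="203.0.113.5")
table.on_tcp_flags(inbound, {"RST"})
print(len(table.actions))
```

FIN handling and the TIME_WAIT timer are left out to keep the sketch short.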


Intra-VNet traffic



The Metering and ACL layers on VFP process inbound/outbound flows for Intra-VNet connections in the same manner as VNet-Internet traffic. When the routing process notices that the destination Direct IP address (Customer Address space) is within the VNet CIDR range, the NAT layer is bypassed. The reason is that Intra-VNet flows use private Direct IP addresses as source and destination addresses. The Host Agent, which is responsible for VNet layer operations, then examines the destination IP address from the L3 header group. Because this is the first packet of the flow, there is no information about the destination DIP-to-physical host mapping (location information) in the cache table. The VNet layer is responsible for providing tunnel headers to Intra-VNet traffic, so the Host Agent requests the location information from the centralized control plane. After getting the reply, it creates a MAT entry where the action part defines tunnel headers (push action). After the metadata is processed, the result is programmed into the Unified Flow Table. As a result, the Header Transposition engine takes the original packet from the Received Queue, adds a tunnel header, and moves the packet to the Transmit Queue.

Figure 1-1: Azure Host-Based SDN Building Blocks.

Thursday, 23 February 2023

Azure Networking Fundamentals: Virtual WAN Part 2 - VNet Segmentation

VNets and VPN/ExpressRoute connections are associated with vHub’s Default Route Table, which allows both VNet-to-VNet and VNet-to-Remote Site IP connectivity. This chapter explains how we can isolate vnet-swe3 from vnet-swe1 and vnet-swe2 using VNet-specific vHub Route Tables (RT), still allowing VNet-to-VPN Site connection. As a first step, we create a Route Table rt-swe12 to which we associate VNets vnet-swe1 and vnet-swe2. Next, we deploy a Route Table rt-swe3 for vnet-swe3. Then we propagate routes from these RTs to Default RT but not from rt-swe12 to rt-swe3 and vice versa. Our VPN Gateway is associated with the Default RT, and the route to remote site subnet 10.11.11.0/24 is installed into the Default RT. To achieve bi-directional IP connectivity, we also propagate routes from the Default RT to rt-swe12 and rt-swe3. As the last step, we verify both Control Plane operation and Data Plane connections.
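
The association and propagation logic can be modelled with a few sets. This is a simplified sketch rather than the vWAN API: the CONNECTIONS structure, the helper names, and the connection name "vpn-site" are hypothetical, and the VNets are assumed to propagate to their own route table as well as to the Default RT.

```python
# Each connection is associated with one route table (where it looks up
# routes) and propagates its own prefixes to a set of route tables.
CONNECTIONS = {
    "vnet-swe1": {"associated": "rt-swe12", "propagates_to": {"rt-swe12", "Default"}},
    "vnet-swe2": {"associated": "rt-swe12", "propagates_to": {"rt-swe12", "Default"}},
    "vnet-swe3": {"associated": "rt-swe3", "propagates_to": {"rt-swe3", "Default"}},
    "vpn-site": {"associated": "Default",
                 "propagates_to": {"Default", "rt-swe12", "rt-swe3"}},
}


def has_route(source, destination):
    """True if the source's route table has learned the destination's routes."""
    return CONNECTIONS[source]["associated"] in CONNECTIONS[destination]["propagates_to"]


def connected(a, b):
    """Bidirectional IP connectivity needs a route in both directions."""
    return has_route(a, b) and has_route(b, a)


for a, b in [("vnet-swe1", "vnet-swe2"), ("vnet-swe1", "vnet-swe3"),
             ("vnet-swe3", "vpn-site")]:
    print(a, b, connected(a, b))
```

The model shows why vnet-swe3 stays isolated from the other two VNets: neither side's route table ever receives the other side's prefixes, while both still exchange routes with the VPN site through the Default RT.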


Figure 12-1: Virtual Network Segmentation.

Sunday, 5 February 2023

Azure Networking Fundamentals: Virtual WAN Part 1 - S2S VPN and VNet Connections

This chapter introduces the Azure Virtual WAN (vWAN) service. It offers a single deployment, management, and monitoring pane for connectivity services such as Inter-VNet, Site-to-Site VPN, and ExpressRoute. In this chapter, we are focusing on S2S VPN and VNet connections. The Site-to-Site VPN solutions in vWAN differ from the traditional model, where we create resources as individual components. In this solution, we only deploy a vWAN resource and manage everything else through its management view.

Figure 11-1 illustrates our example topology and deployment order. The first step is to implement a vWAN resource. Then we deploy a vHub. It is an Azure-managed VNet to which we assign a CIDR, just like we do with the traditional VNet. We can deploy a vHub as an empty VNet without associating any connection. A vHub deployment process launches a pair of redundant routers, which exchange reachability information with the VNet Gateway router and VGW instances using BGP. We intend to allow Inter-VNet data flows between vnet-swe1 and vnet-swe2, as well as Branch-to-VNet traffic. For Site-to-Site VPN, we deploy a VPN Gateway (VGW) into the vHub. The VGW started in the vHub creates two instances, instance0 and instance1, in active/active mode. We don’t deploy a GatewaySubnet for the VGW because Azure handles subnetting and assigns public and private IP addresses to the instances. Besides, Azure starts a vHub-specific BGP process and allocates the BGP ASN 65515 to the VGW regardless of the selected S2S routing model (static or dynamic). Note that when we connect VNets and a branch site to the vHub, the Hub Router exchanges routing information with the VNets' GWs and the VGW instance using BGP.

After the vHub and VGW deployment, we configure VPN site parameters such as the IPsec tunnel endpoint IP address, BGP ASN, and peering IP address for the branch device. Then we connect the VPN Site to the vHub and download the remote device configuration file. The file format is JSON and presents the values/parameters for Site-to-Site VPN and BGP peering but not the device-specific configuration. As a last deployment step, we connect VNets to the vHub. The VGW in the vHub is associated with a default Route Table (RT), and VNets are associated with none by default. During the connection setup, we also need to associate the VNets with the default RT. When everything is in place, we verify that each component has the necessary routing information and that the IP connectivity is ok.

Figure 11-1: vWAN Diagram.

Thursday, 2 February 2023

Azure Networking Fundamentals: VNET Peering

Comment: Here is a part of the introduction section of the eighth chapter of my Azure Networking Fundamentals book. I will also publish other chapters' introduction sections soon so you can see if the book is for you. The book is available at Leanpub and Amazon (links on the right pane).

This chapter introduces an Azure VNet Peering solution. VNet peering creates bidirectional IP connections between peered VNets. VNet peering links can be established within and across Azure regions and between VNets under different Azure subscriptions or tenants. The unencrypted data path over peer links stays within Azure's private infrastructure. Consider a software-level solution (or use VGW) if your security policy requires data path encryption. There is no bandwidth limitation in VNet Peering like in VGW, where BW is based on SKU. From the VM perspective, VNet peering gives seamless network performance (bandwidth, latency, delay, and jitter) for Inter-VNet and Intra-VNet traffic. Unlike the VGW solution, VNet peering is non-transitive: the routing information learned from one VNet peer is not advertised to another VNet peer. However, we can permit peered VNets (Spokes) to use the local VGW (Hub) and route Spoke-to-Spoke data by using a subnet-specific route table (chapter 9 explains the concept in detail). Note that by deploying a VNet peering, we create a bidirectional, always-on IP data path between VNets. However, we can prevent traffic from crossing the link if needed without deleting the peering. Azure uses Virtual Network Service Tags for VNet peering traffic policy.

Figure 8-1 shows our example topology. We create a VNet Peering between vnet-spoke-2 and vnet-nsg-rt-swedencentral. Besides the Inter-VNet connection, our solution allows vnet-spoke-2 to use vnet-nsg-rt-swedencentral as a transit VNet to other peered VNets (which we don’t have in this example). We also permit IP connection to/from vnet-spoke-2 to vnet-spoke-1 and on-prem location by authorizing vnet-spoke-2 to use vgw-nwkt as a transit gateway.

Figure 8-1: VNet Peering Example Diagram.

Sunday, 29 January 2023

Azure Networking Fundamentals: Site-to-Site VPN

Comment: Here is a part of the introduction section of the fifth chapter of my Azure Networking Fundamentals book. I will also publish other chapters' introduction sections soon so you can see if the book is for you. The book is available at Leanpub and Amazon (links on the right pane).

A Hybrid Cloud is a model where we split application-specific workloads across the public and private clouds. This chapter introduces Azure's hybrid cloud solution using Site-to-Site (S2S) Active-Standby VPN connection between Azure and on-prem DC. Azure S2S A/S VPN service includes five Azure resources. The first one, Virtual Network Gateway (VGW), also called VPN Gateway, consists of two VMs, one in active mode and the other in standby mode. These VMs are our VPN connection termination points on the Azure side, which encrypt and decrypt data traffic. The active VM has a public IP address associated with its Internet side. If the active VM fails, the standby VM takes the active role, and the public IP is associated with it. Active and standby VMs are attached to the special subnet called Gateway Subnet. The name of the gateway subnet has to be GatewaySubnet. The Local Gateway (LGW) resource represents the VPN termination point on the on-prem location. Our example LGW is located behind the NAT device. The inside local IP address of LGW is the private IP 192.168.100.18, which the NAT device translates to public IP 91.156.51.38. Because of this, we set our VGW in ResponderOnly mode. The last resource is the Connection resource. It defines the tunnel type and its termination points. In our example, we are using Site-to-Site (IPSec) tunnels, which are terminated to our VGW and LGW.


Figure 5-1: Active-Standby Site-to-Site VPN Overview.

Thursday, 26 January 2023

Azure Networking Fundamentals: Internet Access with VM-Specific Public IP

Comment: Here is a part of the introduction section of the third chapter of my Azure Networking Fundamentals book. I will also publish other chapters' introduction sections soon so you can see if the book is for you. The book is available at Leanpub and Amazon (links on the right pane).

In chapter two, we created a VM vm-Bastion and associated a Public IP address with its attached NIC vm-bastion154. The Public IP addresses associated with a VM’s NIC are called Instance Level Public IP (ILPIP). Then we added a security rule to the existing NSG vm-Bastion-nsg, which allows an inbound SSH connection from the external host. Besides, we created VMs vm-Front-1 and vm-Back-1 without public IP address association. However, these two VMs have an egress Internet connection because Azure assigns Outbound Access IP (OPIP) addresses to VMs for which we haven’t allocated an ILPIP (vm-Front-1: 20.240.48.199 and vm-Back-1: 20.240.41.145). The Azure portal does not list these IP addresses in the VM view. Note that neither user-defined nor Azure-allocated Public IP addresses are configured as NIC addresses. Instead, Azure adds them as a One-to-One entry to the NAT table (chapter 15 introduces a NAT service in detail). Figure 3-1 shows how the source IP address of vm-Bastion is changed from 10.0.1.4 to 20.91.188.31 when traffic is forwarded to the Internet. The source IP address of the Internet traffic from vm-Front-1 and vm-Back-1 will also be translated in the same way. The traffic policy varies based on the IP address assignment mechanism. The main difference is that external hosts can initiate connections only with VMs that have an ILPIP. Besides, these VMs are allowed to use TCP/UDP/ICMP, while VMs with the Azure-assigned public IP address can only use TCP or UDP but not ICMP.
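
The difference between the two assignment mechanisms can be summarized in a small sketch. The VMS dictionary and the three helper functions are hypothetical; they only encode the rules stated above, using the addresses from this example.

```python
# Public addressing per VM as described in the text; None means "not assigned".
VMS = {
    "vm-Bastion": {"ilpip": "20.91.188.31", "outbound_access_ip": None},
    "vm-Front-1": {"ilpip": None, "outbound_access_ip": "20.240.48.199"},
    "vm-Back-1": {"ilpip": None, "outbound_access_ip": "20.240.41.145"},
}


def translated_source(vm):
    """Public source IP seen on the Internet for traffic leaving the VM."""
    entry = VMS[vm]
    return entry["ilpip"] or entry["outbound_access_ip"]


def accepts_inbound(vm):
    """External hosts can initiate connections only to VMs with an ILPIP."""
    return VMS[vm]["ilpip"] is not None


def egress_allowed(vm, protocol):
    """TCP and UDP always work; ICMP requires an ILPIP."""
    if protocol in ("tcp", "udp"):
        return True
    return protocol == "icmp" and VMS[vm]["ilpip"] is not None


for name in VMS:
    print(name, translated_source(name), accepts_inbound(name),
          egress_allowed(name, "icmp"))
```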

Figure 3-1: Overview of the Azure Internet Access.



Tuesday, 24 January 2023

Azure Networking Fundamentals: Network Security Group (NSG)

Comment: Here is a part of the introduction section of the second chapter of my Azure Networking Fundamentals book. I will also publish other chapters' introduction sections soon so you can see if the book is for you. The book is available at Leanpub and Amazon (links on the right pane). 

This chapter introduces three NSG scenarios. The first example explains the NSG-NIC association. In this section, we create a VM that acts as a Bastion host*). Instead of using the Azure Bastion service, we deploy a custom-made vm-Bastion to snet-dmz and allow an SSH connection from the external network. The second example describes the NSG-Subnet association. In this section, we launch vm-Front-1 in the front-end subnet. Then we deploy an NSG that allows an SSH connection from the Bastion host IP address. The last part of the chapter introduces the Application Security Group (ASG). We can create a logical group of VMs by associating them with the same ASG. The ASG can then be used as a source or destination in NSG security rules. In our example (figure 2-1), we have two ASGs, asg-Back (associated with VMs 10.0.2.4-6) and asg-Back#2 (associated with VMs 10.0.2.7-9). The first ASG (asg-Back) is used as a destination in the security rule on the NSG nsg-Back that allows ICMP from VM vm-Front-1. The second ASG (asg-Back#2) is used as a destination in the security rule on the same NSG nsg-Back that allows ICMP from VM vm-Bastion. Examples 1-7 and 1-8 show how we can get information about Virtual Networks using Azure Az PowerShell.

*) Azure Bastion is a managed service for allowing SSH and RDP connections to VMs without a public IP address. Azure Bastion has a fixed price per hour and outbound data traffic-based charge.                            
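The ASG scenario can be sketched in Python as follows. This is a simplified model under stated assumptions: sources are identified by VM name rather than IP address, the function and variable names are hypothetical, and anything that does not match a rule is treated as denied. In a real NSG that last point would require an explicit deny rule, because the default rules allow intra-VNet traffic.

```python
import ipaddress


def ip_range(first: str, last: str) -> set:
    """Expand an inclusive IPv4 range such as 10.0.2.4-6 into a set of address strings."""
    start = int(ipaddress.ip_address(first))
    end = int(ipaddress.ip_address(last))
    return {str(ipaddress.ip_address(n)) for n in range(start, end + 1)}


# ASG membership as described in the text.
ASG_MEMBERS = {
    "asg-Back": ip_range("10.0.2.4", "10.0.2.6"),
    "asg-Back#2": ip_range("10.0.2.7", "10.0.2.9"),
}

# Allow rules on nsg-Back: (source VM, destination ASG, protocol).
NSG_BACK_ALLOW_RULES = [
    ("vm-Front-1", "asg-Back", "icmp"),
    ("vm-Bastion", "asg-Back#2", "icmp"),
]


def is_allowed(source_vm: str, dest_ip: str, protocol: str) -> bool:
    """Return True if some allow rule matches; unmatched traffic is denied (assumption)."""
    for rule_source, rule_asg, rule_protocol in NSG_BACK_ALLOW_RULES:
        if (
            source_vm == rule_source
            and protocol == rule_protocol
            and dest_ip in ASG_MEMBERS[rule_asg]
        ):
            return True
    return False
```

The point of the ASG is visible in the rule list: the rules name a group, and adding a VM to the group changes what the rule covers without touching the rule itself.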


Figure 2-1: Network Security Group (NSG) – Example Scenarios.

Wednesday, 11 January 2023

Azure Host-Based Networking: vNIC Interface Architecture - Synthetic Interface and Virtual Function

Before moving to the Virtual Filtering Platform (VFP) and Accelerated Networking (AccelNet) section, let’s look at the guest OS vNIC interface architecture. When we create a VM, Azure automatically attaches a virtual NIC (vNIC) to it. Each vNIC has a synthetic interface, a VMBus device that uses the netvsc driver. If AccelNet is disabled on a VM, all traffic flows pass over the synthetic interface to the software switch. Azure host servers have a Mellanox/NVIDIA Single Root I/O Virtualization (SR-IOV) hardware NIC, which offers virtual instances, called Virtual Functions (VF), to virtual machines. When we enable AccelNet on a VM, the mlx driver is installed on the vNIC. The mlx driver version depends on the SR-IOV NIC type. The mlx driver on a vNIC initializes a new interface that connects the vNIC to an embedded switch on the SR-IOV hardware NIC. This VF interface is then associated with the netvsc interface. Both interfaces use the same MAC address, but the IP address is only associated with the synthetic interface. When AccelNet is enabled, the vNIC forwards the VM’s data flows over the VF interface via the synthetic interface. This architecture allows In-Service Software Updates (ISSU) for SR-IOV NIC drivers.

Note! Exception traffic, a data flow with no flow entries in the UFT/GFT, is forwarded through VFP in order to create flow-action entries in the UFT/GFT.

Figure 1-1: Azure Host-Based SDN Building Blocks.

Sunday, 8 January 2023

Azure Host-Based Networking: VFP and AccelNet Introduction

Software-Defined Networking (SDN) is an architecture where the network’s control plane is decoupled from the data plane to centralized controllers. These intelligent, programmable controllers manage network components as a single system, having a global view of the whole network. Microsoft’s Azure uses a host-based SDN solution, where network virtualization and most of its services (Firewalls, Load balancers, Gateways) run as software on the host. The physical switching infrastructure, in turn, offers a resilient, high-speed underlay transport network between hosts.

Figure 1-1 shows an overview of Azure’s SDN architecture. Virtual Filtering Platform (VFP) is Microsoft’s cloud-scale software switch operating as a virtual forwarding extension within a Hyper-V basic vSwitch. The forwarding logic of the VFP uses a layered policy model based on policy rules in a Match-Action Table (MAT). VFP works on the data plane, while complex control plane operations are handed over to centralized control systems. VFP layers, such as VNET, NAT, ACL, and Metering, have dedicated controllers that program policy rules to the MAT using southbound APIs.

The switching processes of a software switch are CPU intensive. To reduce the burden on CPU cycles, VFP offloads data forwarding logic to the hardware NIC after processing the first packet of the flow and creating the flow entry in the MAT. The Header Transposition (HT) engine programs flows and their forwarding actions, like source IP address rewrite, into a Unified Flow Table (UFT), which has flow entries for all active flows of every VM running on a host. Flows and policies in the UFT are loaded into a Generic Flow Table (GFT) on the hardware NIC’s Field Programmable Gate Array (FPGA) unit, and subsequent packets take a fast path over the hardware NIC. Besides the GFT, the hardware NIC includes a Single Root I/O Virtualization (SR-IOV) NIC, which offers vNIC-specific, secure access between the VM and the hardware NIC. From the VM perspective, the SR-IOV NIC appears as a PCI device using a Virtual Function (VF) driver. The guest OS connection to VFP over VMBus uses a synthetic interface with the Network Virtual Service Client (NetVSC) driver. The NetVSC and VF interfaces are bonded and use the same MAC address. However, the IP address is attached to the NetVSC interface. A vNIC exposes only the synthetic interface to the TCP/IP stack of the guest OS. This solution makes it possible to switch flows from the fast path (VF) to the slow path (NetVSC) during a hardware NIC service operation or failure event without disturbing active connections.
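The first-packet/subsequent-packet split can be illustrated with a small flow cache. The sketch below is a conceptual Python model only, not VFP code: the class, the layer functions, and the counters are hypothetical, and a single dictionary stands in for both the UFT and the GFT. The source address rewrite reuses the 10.0.1.4 to 20.91.188.31 example, and the destination is a documentation address.

```python
from typing import Any, Callable, Dict, List, Tuple

Packet = Dict[str, Any]
Action = Callable[[Packet], Packet]
FlowKey = Tuple[Any, ...]


def acl_layer(packet: Packet) -> Action:
    """Hypothetical ACL layer: the matched rule allows the flow unchanged."""
    return lambda p: p


def nat_layer(packet: Packet) -> Action:
    """Hypothetical NAT layer: the matched rule rewrites the source IP address."""
    return lambda p: {**p, "src": "20.91.188.31"}


class FlowCache:
    def __init__(self, layers: List[Callable[[Packet], Action]]):
        self.layers = layers
        self.unified_flow_table: Dict[FlowKey, List[Action]] = {}
        self.slow_path_hits = 0
        self.fast_path_hits = 0

    @staticmethod
    def key(packet: Packet) -> FlowKey:
        # 5-tuple identifying the flow.
        return (packet["src"], packet["dst"], packet["proto"],
                packet["sport"], packet["dport"])

    def forward(self, packet: Packet) -> Packet:
        flow = self.key(packet)
        actions = self.unified_flow_table.get(flow)
        if actions is None:
            # First packet of the flow: look up every layer once and cache the actions.
            actions = [layer(packet) for layer in self.layers]
            self.unified_flow_table[flow] = actions
            self.slow_path_hits += 1
        else:
            # Subsequent packets: apply the cached actions without a rule lookup.
            self.fast_path_hits += 1
        out = dict(packet)
        for action in actions:
            out = action(out)
        return out


cache = FlowCache([acl_layer, nat_layer])
pkt = {"src": "10.0.1.4", "dst": "203.0.113.10", "proto": "tcp",
       "sport": 50000, "dport": 443}
first = cache.forward(pkt)   # layers evaluated, flow entry created
second = cache.forward(pkt)  # served from the cached flow entry
```

In the real system the cached entry lives in hardware, so the second lookup costs no host CPU cycles; the dictionary here only shows the caching logic.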

The VFP software switch and the FPGA/SR-IOV hardware NIC together form Microsoft’s host-based SDN architecture called Accelerated Networking (AccelNet). This post series introduces the solution in detail.




Figure 1-1: Azure Host-Based SDN Building Blocks.



Tuesday, 3 January 2023

Azure Host-Based SDN: Part 1 - VFP Introduction

Azure Virtual Filtering Platform (VFP) is Microsoft’s cloud-scale virtual switch operating as a virtual forwarding extension within a Hyper-V basic vSwitch. Figure 1-1 illustrates an overview of VFP building blocks and their relationships with the basic vSwitch. Let’s start the examination from the perspective of the VM vm-nwkt-1. Its vNIC vm-cafe154 has a synthetic interface eth0 using a NetVSC driver (Network Virtual Service Client). The Hyper-V vSwitch on the Parent Partition is a Network Virtual Service Provider (NetVSP) with VM-facing vPorts. The vNIC vm-cafe154 is connected to vPort4 over the logical inter-partition communication channel VMBus. VFP sits in the data path between the VM-facing vPorts and the default vPort associated with the physical NIC. VFP uses port-specific Layers for filtering traffic to and from VMs. A VFP Layer is a Match Action Table (MAT) having a set of policy Rules. Rules consist of Conditions and Actions and are divided into Groups. Each Layer is programmed by independent, centralized Controllers without cross-controller dependencies.

Let’s take a concrete example of the Layer/Group/Rule object relationship and management by examining the Network Security Group (NSG) in the ACL Layer. Each NSG has a default group for Infrastructure rules, which allow Intra-VNet traffic, outbound Internet connections, and load balancer communication (health checks, etc.). We can’t delete, add, or modify rules in this group. The second group has User Defined rules, which we can use to allow or deny traffic flows based on our security policy. An NSG Rule consists of Conditions and Actions. A Condition defines the match policy using the 5-tuple of source and destination IP addresses, protocol, and source and destination ports. A Condition is associated with an Action that is applied to matching data flows. In our example, we have an Inbound Infrastructure Rule with a Condition/Action that allows connection initiation from VMs within the VNet. The control component of the ACL Layer is the Security Controller. We use the Security Controller’s Northbound API when we create or modify an NSG with Windows PowerShell or the Azure GUI. Security Controllers, in turn, use a Southbound API to program our intent to VFP via the Host Agent.
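The Layer/Group/Rule relationship can be sketched as plain data classes. This is a minimal Python illustration, not VFP’s actual object model: the class names mirror the terms in the text, but the first-match evaluation order (User Defined group before the Infrastructure group), the final implicit deny, the rule names, and the 10.0.0.0/16 VNet range are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Condition:
    """5-tuple match: source/destination prefix, protocol, source/destination port."""
    src: str                        # CIDR prefix
    dst: str                        # CIDR prefix
    protocol: str                   # "tcp", "udp", "icmp" or "*" for any
    src_port: Optional[int] = None  # None matches any port
    dst_port: Optional[int] = None

    def matches(self, src_ip, dst_ip, protocol, src_port, dst_port) -> bool:
        return (
            ip_address(src_ip) in ip_network(self.src)
            and ip_address(dst_ip) in ip_network(self.dst)
            and self.protocol in ("*", protocol)
            and self.src_port in (None, src_port)
            and self.dst_port in (None, dst_port)
        )


@dataclass
class Rule:
    name: str
    condition: Condition
    action: str  # "allow" or "deny"


@dataclass
class Group:
    name: str
    rules: List[Rule] = field(default_factory=list)


@dataclass
class Layer:
    name: str
    groups: List[Group] = field(default_factory=list)

    def evaluate(self, src_ip, dst_ip, protocol, src_port, dst_port) -> str:
        # First matching rule wins; groups are checked in list order (assumption).
        for group in self.groups:
            for rule in group.rules:
                if rule.condition.matches(src_ip, dst_ip, protocol, src_port, dst_port):
                    return rule.action
        return "deny"


acl = Layer("ACL", [
    Group("User Defined", [
        Rule("deny-ssh-to-back",
             Condition("10.0.1.0/24", "10.0.2.0/24", "tcp", None, 22), "deny"),
    ]),
    Group("Infrastructure", [
        Rule("allow-intra-vnet",
             Condition("10.0.0.0/16", "10.0.0.0/16", "*"), "allow"),
    ]),
])
```

With this layer, `acl.evaluate("10.0.1.4", "10.0.2.4", "tcp", 50000, 22)` hits the User Defined rule, while an ICMP flow between the same hosts falls through to the Infrastructure rule.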

The next post explains how VFP handles outgoing/incoming data streams and creates Unified Flow Tables (UFT) from them using the Header Transposition solution.


Figure 1-1: Virtual Filtering Platform Overview (click to enlarge). 


Friday, 19 November 2021

AWS Networking Fundamentals book: Table of Contents

Here is the Table of Contents of my AWS Networking Fundamentals book. I have added the figures which illustrate the example scenarios in each chapter. The book is available at Leanpub.com. It is still in progress, and there will be additional chapters soon.



 

Friday, 15 October 2021

AWS Networking - Part XI: VPC NAT Gateway

Introduction


Back-End EC2 instances, like Application and Database servers, are most often launched on a Private subnet. As a recap, a Private subnet is a subnet that doesn’t have a route to the Internet Gateway (IGW) in its Route table. Besides, EC2 instances in the Private subnet don’t have an Elastic-IP address association. These two facts mean that EC2 instances on the Private subnet don’t have Internet access. However, these EC2 instances might still need occasional Internet access, for example, to get firmware upgrades from an external source. We can use a NAT Gateway (NGW) to allow IPv4 traffic from Private subnets to the Internet. When we launch an NGW, we also need to allocate an Elastic-IP address (EIP) and associate it with the NGW. This association works the same way as the EIP-to-EC2 association. It creates a static NAT entry in the IGW that translates the NGW’s local subnet address to its associated EIP. The NGW, in turn, is responsible for translating the source IP address of traffic originating from the Private subnet to its own local subnet IP address. As an example, the EC2 instance NWKT-EC2-Back-End sends packets destined for the Internet to the NGW. When the NGW receives these packets, it rewrites the source IP address 10.10.1.172 with its Public subnet IP address 10.10.0.195 and forwards the packets to the Internet Gateway. The IGW translates the source IP address 10.10.0.195 to the EIP 18.132.96.95 (the EIP associated with the NGW). That means that the source IP address of the data is rewritten twice, first by the NGW and then by the IGW.
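The two-stage translation can be traced with a short Python model. It is a sketch under assumptions, not AWS’s implementation: the class names are hypothetical, the NGW is modeled with simple source-port tracking so that replies can be mapped back (the port numbering is invented for the example), and the remote host uses a documentation address. Only the three IP addresses from the text are real.

```python
class NatGateway:
    """Rewrites the source of private-subnet traffic to the NGW's own subnet address."""

    def __init__(self, own_ip: str):
        self.own_ip = own_ip
        self.sessions = {}     # translated source port -> (original IP, original port)
        self.next_port = 1024  # hypothetical starting port

    def outbound(self, packet: dict) -> dict:
        port = self.next_port
        self.next_port += 1
        self.sessions[port] = (packet["src"], packet["sport"])
        return {**packet, "src": self.own_ip, "sport": port}

    def inbound(self, packet: dict) -> dict:
        orig_ip, orig_port = self.sessions[packet["dport"]]
        return {**packet, "dst": orig_ip, "dport": orig_port}


class InternetGateway:
    """Holds static one-to-one NAT entries between private addresses and EIPs."""

    def __init__(self, static_nat: dict):
        self.private_to_public = dict(static_nat)
        self.public_to_private = {pub: priv for priv, pub in static_nat.items()}

    def outbound(self, packet: dict) -> dict:
        return {**packet, "src": self.private_to_public[packet["src"]]}

    def inbound(self, packet: dict) -> dict:
        return {**packet, "dst": self.public_to_private[packet["dst"]]}


ngw = NatGateway("10.10.0.195")
igw = InternetGateway({"10.10.0.195": "18.132.96.95"})

packet = {"src": "10.10.1.172", "sport": 40000, "dst": "203.0.113.10", "dport": 443}
after_ngw = ngw.outbound(packet)     # first rewrite: 10.10.1.172 -> 10.10.0.195
after_igw = igw.outbound(after_ngw)  # second rewrite: 10.10.0.195 -> 18.132.96.95
```

The return path applies the same two steps in reverse order: the IGW maps the EIP back to 10.10.0.195, and the NGW maps its session back to 10.10.1.172.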

Figure 4-1 illustrates our example NAT GW design and its configuration steps. As a pre-task, we launch an EC2 instance on the Private subnet 10.10.1.0/24 (1). We also modify the existing Security Group (SG) to allow Inbound/Outbound ICMP traffic within the VPC CIDR 10.10.0.0/16 (2). We also allow SSH session initiation from 10.10.0.218/24. I’m using the same SG for both EC2 instances to keep things simple. Besides, both EC2 instances use the same Key Pair. Chapter 3 shows how to launch an EC2 instance and how to modify SGs, which is why we go straight to the NGW configuration.

When we have done the pre-tasks, we launch an NGW on the Public subnet (3). Then we allocate an EIP and associate it with the NGW (4). Next, we add a default route towards the NGW to the Private subnet Route Table (5).
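The effect of step (5) can be illustrated with a longest-prefix-match lookup. The Python sketch below is a simplified model: the function name and the target labels `local` and `nat-gateway` are hypothetical placeholders, and the table contains only the VPC local route plus the new default route.

```python
import ipaddress
from typing import List, Optional, Tuple

# Private subnet Route Table after step (5): VPC local route plus default route to the NGW.
PRIVATE_SUBNET_ROUTE_TABLE = [
    ("10.10.0.0/16", "local"),
    ("0.0.0.0/0", "nat-gateway"),
]


def next_hop(route_table: List[Tuple[str, str]], destination: str) -> Optional[str]:
    """Return the target of the most specific route that covers the destination."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), target)
        for prefix, target in route_table
        if dest in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None  # no matching route
    # The longest prefix is the most specific match.
    return max(candidates, key=lambda entry: entry[0].prefixlen)[1]
```

Traffic to 10.10.0.218 still matches the more specific local route, while any destination outside the VPC CIDR now resolves to the NGW instead of having no route at all.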

The last three steps are related to connectivity testing. First, we verify Intra-VPC IP connectivity using ICMP (6). Then we test the Internet connectivity (7). As the last step, we confirm that no route exists back to NWKT-EC2-Back-End from the IGW. We are using the AWS Reachability Analyzer for that (8).

Note! Our example doesn’t follow good design principles. AWS Availability Zones (AZ) are restricted failure domains, which means that a failure in one AZ doesn’t affect the operation of other AZs. Now, if our NGW in AZ eu-west-2c fails, Internet traffic from the Private subnet in eu-west-2a fails. The proper design is to launch an NGW in each AZ where unidirectional egress Internet access is needed.


Figure 4-1: Example Topology.

Monday, 11 October 2021

AWS Networking - Part X: VPC Internet Gateway Service - Part Two

 

Associate SG and Elastic-IP with EC2


In the previous section, we created an Internet Gateway (IGW) for our VPC. We also added a static route towards the IGW into the Route Table of Subnet 10.10.0.0/24. In this section, we first create a Security Group (SG). The SG allows an SSH connection to the EC2 instance and ICMP from the EC2 instance. Then we launch an EC2 instance and attach the previously configured SG to it. As the last step, we allocate an Elastic IP address (EIP) from the AWS IPv4 address pool and associate it with the EC2 instance. When we are done with all the previous steps, we test the connection. First, we open an SSH connection from MyPC to the EC2 instance. Then, we ping MyPC from the EC2 instance. We also use the AWS Reachability Analyzer to validate the path from the IGW to the EC2 instance. The last section introduces AWS billing related to this chapter.


Figure 3-20: EC2 Instance, Elastic IP, and Security Group.

 

Sunday, 10 October 2021

AWS Networking - Part X: VPC Internet Gateway Service - Part One


Introduction


This chapter explains what components/services and configurations we need to allow Internet traffic to and from an EC2 instance. VPCs themselves are closed entities. If we need an Internet connection, we need to use an AWS Internet Gateway (IGW) service. The IGW runs on a Blackfoot Edge Device in the AWS domain. It performs Data-Plane VPC encapsulation and decapsulation, as well as IP address translation. We also need public, Internet-routable IP addresses. In our example, we allocate an AWS Elastic-IP (EIP) address. Then we associate it with an EC2 instance. By doing so, we don’t add the EIP to the EC2 instance itself. Instead, we create a static one-to-one NAT entry in the IGW associated with the VPC. By default, the subnet Route Table includes only a local route for the VPC’s CIDR range. That is why we need to add a routing entry, default or more specific, towards the IGW to the Subnet RT. Note that a subnet within an AWS VPC is not a Broadcast domain (a VPC doesn’t even support Broadcasts). Rather, we can think of it as a logical place for EC2 instances having uniform connection requirements, like reachability from the Internet. As a next step, we define the security policy. Each Subnet has a Network Access Control List (NACL), which is a stateless Data-Plane filter. Stateless means that to allow a bi-directional traffic flow, we have to permit flow-specific Request/Reply data separately. For simplicity, we are going to use the Subnet Default NACL. The Security Group (SG), in turn, is a stateful, EC2 instance-specific Data-Plane filter. Stateful means that once a flow is permitted in one direction, the filter automatically permits its return traffic. Our example security policy is based on the SG. We will allow an SSH connection from the external host 91.152.204.245 to the EC2 instance NWKT-EC2-Front-End. In addition, we allow all ICMP traffic from the EC2 instance to the same external host. As the last part, this chapter introduces the Reachability Analyzer service, which we can use for troubleshooting connections.
Figure 3-1 illustrates what we are going to build in this chapter.


Figure 3-1: Setting Up an Internet Connection for Public Subnet of AWS VPC.
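The difference between the stateless NACL and the stateful SG described in the introduction can be sketched in Python. This is a deliberately simplified model with hypothetical class names: a flow is reduced to the tuple (remote IP, protocol, local port), reply traffic is represented by the same tuple in the opposite direction, and rule priorities and the permissive Default NACL are ignored.

```python
class StatelessFilter:
    """NACL-like filter: each direction is checked against its own rule set."""

    def __init__(self, inbound, outbound):
        self.inbound = set(inbound)
        self.outbound = set(outbound)

    def permits(self, direction: str, flow: tuple) -> bool:
        rules = self.inbound if direction == "in" else self.outbound
        return flow in rules


class StatefulFilter:
    """SG-like filter: a permitted flow is tracked and its return traffic is allowed."""

    def __init__(self, inbound, outbound):
        self.inbound = set(inbound)
        self.outbound = set(outbound)
        self.tracked = set()

    def permits(self, direction: str, flow: tuple) -> bool:
        if flow in self.tracked:
            return True  # return traffic of an already permitted flow
        rules = self.inbound if direction == "in" else self.outbound
        if flow in rules:
            self.tracked.add(flow)
            return True
        return False


# SSH from the external host in the text to the EC2 instance.
ssh_flow = ("91.152.204.245", "tcp", 22)

nacl_like = StatelessFilter(inbound=[ssh_flow], outbound=[])
sg_like = StatefulFilter(inbound=[ssh_flow], outbound=[])
```

With only an inbound rule configured, the stateless filter drops the SSH replies, while the stateful filter lets them through once the request has been seen. That is why a custom NACL needs rules in both directions and the SG does not.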