Tuesday, 7 July 2020

BGP EVPN Underlay Network with OSPF

Introduction


The foundation of a modern Datacenter fabric is an Underlay Network, and it is crucial to understand the operation of the Control-Plane protocol used in it. The focus of this chapter is OSPF. The first section introduces the network topology and the AS numbering scheme used throughout this book. The second section explains how OSPF speakers connected to the same segment become fully adjacent. The third section discusses how OSPF speakers exchange Link State information and build a Link-State Database (LSDB), which is the information source for calculating a Shortest Path Tree (SPT) towards each destination using Dijkstra's algorithm. The focus of the fourth section is the OSPF LSA flooding process. It starts by explaining how the local OSPF speaker sends Link State Advertisements wrapped inside a Link-State Update message to its adjacent router and how a receiving OSPF speaker a) installs the information into its LSDB, b) acknowledges the packet, and c) floods it out of its OSPF interfaces. The fifth section discusses LSA and SPF timers. The chapter ends with the OSPF-related configuration of every device.

Infrastructure AS Numbering and IP Addressing Scheme


Figure 1-1 illustrates the AS numbering and IP addressing scheme used throughout this book. Each Leaf switch has a dedicated private BGP AS number, while Spine switches in the same cluster share the same AS number. Inter-Switch links use unnumbered IP addressing that borrows the address of interface Loopback 0, which is also used as the OSPF Router-ID. Loopback 0 is not advertised by any device. The OSPF network type of the Inter-Switch links is point-to-point, so there is no DR/BDR election process. Leaf switches also have interface Loopback 30, which is used as the VTEP (VXLAN Tunnel End Point) address. Loopback 30 IP addresses are advertised by Leaf switches. All Loopback interfaces are in OSPF passive interface mode. At this stage, all switches belong to OSPF Area 0.0.0.0.


Figure 1-1: AS Numbering and IP Addressing Scheme.

OSPF Neighbor Process


The OSPF neighbor process is explained by using three switches: L-101, S-11, and L-102. At the starting point, the interface towards Spine-11 on Leaf-101 is down, while Spine-11 and Leaf-102 are fully adjacent and their Link State Databases are synchronized.

Phase 1.      The interface E1/1 is brought UP on Leaf-101 (1a). Leaf-101 receives three valid OSPF Hello messages from Spine-11 before Leaf-101 itself sends its first OSPF Hello message to Spine-11 (1b-d). These messages don’t have Leaf-101 listed as an “Active Neighbor”. This means that Leaf-101 can’t be sure whether Spine-11 knows of its existence, which is why the state of the OSPF Finite State Machine (FSM) is INIT. When Spine-11 receives the OSPF Hello message sent by Leaf-101 (1e), it can use the OSPF RID of Leaf-101 in the “Active Neighbor” field of its next OSPF Hello message (1f). When Leaf-101 sees its own OSPF RID in the received message, it knows that Spine-11 has heard its OSPF Hello message, and the OSPF FSM state can be set to EXSTART. The OSPF process on Leaf-101 is shown in debug 1-1. Captures 1-1 to 1-3 show the actual packet exchange between Leaf-101 and Spine-11.

Figure 1-2: OSPF Neighbor Process: Init-OneWay-TwoWay.

        Scheduling hello for Ethernet1/1
          Hello timer start succeeded
      Created new neighbor 192.168.0.11
  Nbr 192.168.0.11 FSM start: old state DOWN, event HELLORCVD
  Nbr 192.168.0.11 FSM state changed from DOWN to INIT, event HELLORCVD
Nbr 192.168.0.11: DOWN --> INIT, event HELLORCVD
  Nbr 192.168.0.11 FSM start: old state INIT, event TWOWAYRCVD
  Nbr 192.168.0.11 FSM state changed from INIT to EXSTART, event ADJOK
Nbr 192.168.0.11: INIT --> EXSTART, event TWOWAYRCVD
  Nbr 192.168.0.11 FSM start: old state EXSTART, event HELLORCVD
Nbr 192.168.0.11: EXSTART --> EXSTART, event HELLORCVD
  Nbr 192.168.0.11 FSM start: old state EXSTART, event TWOWAYRCVD
Nbr 192.168.0.11: EXSTART --> EXSTART, event TWOWAYRCVD
  Nbr 192.168.0.11 FSM start: old state EXSTART, event HELLORCVD
Nbr 192.168.0.11: transitioning to OneWay - did not find ourselves
  Nbr 192.168.0.11 FSM start: old state EXSTART, event ONEWAYRCVD
  Nbr 192.168.0.11 FSM state changed from EXSTART to INIT, event ONEWAYRCVD
Nbr 192.168.0.11: EXSTART --> INIT, event ONEWAYRCVD
  Nbr 192.168.0.11 FSM start: old state INIT, event HELLORCVD
Nbr 192.168.0.11: INIT --> INIT, event HELLORCVD
  Nbr 192.168.0.11: transitioning to OneWay - did not find ourselves
  Nbr 192.168.0.11 FSM start: old state INIT, event ONEWAYRCVD
Nbr 192.168.0.11: INIT --> INIT, event ONEWAYRCVD
  Nbr 192.168.0.11 FSM start: old state INIT, event HELLORCVD
Nbr 192.168.0.11: INIT --> INIT, event HELLORCVD
  Nbr 192.168.0.11: transitioning to OneWay - did not find ourselves
  Nbr 192.168.0.11 FSM start: old state INIT, event ONEWAYRCVD
Nbr 192.168.0.11: INIT --> INIT, event ONEWAYRCVD
  Nbr 192.168.0.11 FSM start: old state INIT, event HELLORCVD
Nbr 192.168.0.11: INIT --> INIT, event HELLORCVD
  Nbr 192.168.0.11 FSM start: old state INIT, event TWOWAYRCVD
  Nbr 192.168.0.11 FSM state changed from INIT to EXSTART, event ADJOK
Nbr 192.168.0.11: INIT --> EXSTART, event TWOWAYRCVD
Debug 1-1: Debug OSPF Adjacency detail on Leaf-101.
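The state changes visible in debug 1-1 can be summarized with a small sketch. It is a simplified model that covers only the three events seen in the debug output and assumes a point-to-point link, where an adjacency is always attempted. The function names `next_state` and `process_hello` are illustrative and not part of any OSPF implementation.

```python
def next_state(state: str, event: str) -> str:
    """Apply one neighbor-FSM event (only the events seen in debug 1-1)."""
    if event == "HELLORCVD":
        return "INIT" if state == "DOWN" else state
    if event == "TWOWAYRCVD":
        # On a point-to-point link an adjacency is always attempted,
        # so INIT moves straight on to EXSTART.
        return "EXSTART" if state == "INIT" else state
    if event == "ONEWAYRCVD":
        # The neighbor no longer lists us: fall back to INIT.
        return state if state == "DOWN" else "INIT"
    return state


def process_hello(state: str, own_rid: str, active_neighbors: list[str]) -> str:
    """Run the FSM for one received Hello and return the new state."""
    state = next_state(state, "HELLORCVD")
    seen = own_rid in active_neighbors
    return next_state(state, "TWOWAYRCVD" if seen else "ONEWAYRCVD")


if __name__ == "__main__":
    state = "DOWN"
    # Hello without us, Hello listing us, Hello without us, Hello listing us
    for neighbors in ([], ["192.168.0.101"], [], ["192.168.0.101"]):
        state = process_hello(state, "192.168.0.101", neighbors)
        print(state)
```

A Hello that does not list the local RID always results in INIT, which is the fallback from EXSTART to INIT that the debug reports as "did not find ourselves".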
Capture 1-1 shows the first OSPF Hello message sent by Spine-11. The source IP address is the OSPF RID, while the destination IP address is 224.0.0.5 (AllSPFRouters). OSPF Hello messages are confined to the connected network segment (TTL 1). Note that the packet is automatically marked with DSCP CS6, so in case of link congestion these packets are prioritized over unclassified traffic. However, this has no effect if no QoS policy is defined on the links (which is usually the case in a DC). OSPF uses neither TCP nor UDP as a transport protocol; it runs directly over IP (protocol 89). The OSPF Hello message validation process verifies that all highlighted fields in capture 1-1 match the receiver's OSPF settings, while the source OSPF RID must differ from the OSPF RID used by the receiver. If any of these rules is not met, the OSPF adjacency is not formed.
Internet Protocol Version 4, Src: 192.168.0.11, Dst: 224.0.0.5
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0xc0 (DSCP: CS6, ECN: Not-ECT)
    Total Length: 64
    Identification: 0xa1b4 (41396)
    Flags: 0x0000
    ...0 0000 0000 0000 = Fragment offset: 0
    Time to live: 1
    Protocol: OSPF IGP (89)
    Header checksum: 0x7638 [validation disabled]
    [Header checksum status: Unverified]
    Source: 192.168.0.11
    Destination: 224.0.0.5
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: Hello Packet (1)
        Packet Length: 44
        Source OSPF Router: 192.168.0.11
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x3aec [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    OSPF Hello Packet
        Network Mask: 0.0.0.0
        Hello Interval [sec]: 10
        Options: 0x02, (E) External Routing
        Router Priority: 1
        Router Dead Interval [sec]: 40
        Designated Router: 0.0.0.0
        Backup Designated Router: 0.0.0.0
Capture 1-1: The First OSPF Hello Message Sent by Spine-11.
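As a quick check of the DSCP marking in capture 1-1, the following sketch splits the Differentiated Services byte 0xc0 into its DSCP and ECN parts. The helper names are illustrative only.

```python
def decode_ds_field(ds: int) -> tuple[int, int]:
    """Split the 8-bit Differentiated Services field into (DSCP, ECN)."""
    return ds >> 2, ds & 0b11


def class_selector(n: int) -> int:
    """DSCP value of Class Selector n (CS0..CS7)."""
    return n << 3


if __name__ == "__main__":
    dscp, ecn = decode_ds_field(0xC0)   # value taken from capture 1-1
    print(dscp, ecn, dscp == class_selector(6))
```

The upper six bits carry the DSCP value and the lower two bits carry ECN, so 0xc0 decodes to DSCP 48 (CS6) with ECN 0 (Not-ECT).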
Capture 1-2 shows the OSPF Hello packet sent by Leaf-101.
Internet Protocol Version 4, Src: 192.168.0.101, Dst: 224.0.0.5
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: Hello Packet (1)
        Packet Length: 44
        Source OSPF Router: 192.168.0.101
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x3a92 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    OSPF Hello Packet
        Network Mask: 0.0.0.0
        Hello Interval [sec]: 10
        Options: 0x02, (E) External Routing
        Router Priority: 1
        Router Dead Interval [sec]: 40
        Designated Router: 0.0.0.0
        Backup Designated Router: 0.0.0.0
Capture 1-2: The First OSPF Hello Message Sent by Leaf-101.
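The Hello validation described earlier can be sketched as a simple field comparison. The highlighting of the original capture is not available here, so the set of compared fields is an assumption based on the standard OSPFv2 Hello checks (area, authentication type, Hello and Dead intervals, E-bit); the network mask is left out because it is not compared on point-to-point links. `HelloParams` and `hello_mismatches` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloParams:
    router_id: str
    area_id: str
    hello_interval: int
    dead_interval: int
    auth_type: int
    e_bit: bool


# Fields that must be identical on both ends (assumed set, see lead-in).
MUST_MATCH = ("area_id", "hello_interval", "dead_interval", "auth_type", "e_bit")


def hello_mismatches(local: HelloParams, received: HelloParams) -> list[str]:
    """Return the reasons a received Hello would be rejected (empty = accepted)."""
    problems = []
    if received.router_id == local.router_id:
        problems.append("duplicate router ID")
    for field in MUST_MATCH:
        if getattr(local, field) != getattr(received, field):
            problems.append(f"{field} mismatch")
    return problems


if __name__ == "__main__":
    # Values taken from captures 1-1 and 1-2.
    spine11 = HelloParams("192.168.0.11", "0.0.0.0", 10, 40, 0, True)
    leaf101 = HelloParams("192.168.0.101", "0.0.0.0", 10, 40, 0, True)
    print(hello_mismatches(spine11, leaf101))
```

With the values from the two captures the list of mismatches is empty, so Spine-11 accepts the Hello from Leaf-101.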
Capture 1-3 shows the OSPF Hello packet sent by Spine-11 after it has received the first OSPF Hello packet from Leaf-101. At this stage, Spine-11 lists the OSPF RID of Leaf-101 in the “Active Neighbor” field.

Internet Protocol Version 4, Src: 192.168.0.11, Dst: 224.0.0.5
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: Hello Packet (1)
        Packet Length: 48
        Source OSPF Router: 192.168.0.11
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x79da [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    OSPF Hello Packet
        Network Mask: 0.0.0.0
        Hello Interval [sec]: 10
        Options: 0x02, (E) External Routing
        Router Priority: 1
        Router Dead Interval [sec]: 40
        Designated Router: 0.0.0.0
        Backup Designated Router: 0.0.0.0
        Active Neighbor: 192.168.0.101
Capture 1-3: The Fourth Hello Message Sent by Spine-11

Phase 2a.    Now the OSPF FSM state on Leaf-101 has moved from INIT to EXSTART. The purpose of EXSTART is to decide which router controls the Database synchronization process and to set a random starting sequence number. Because Leaf-101 has the higher OSPF RID, it takes control. Leaf-101 sends an empty DBD with the Init (I) bit, More (M) bit, and Master/Slave (MS) bit all set to one. The I-bit indicates that this is the first DBD packet, the M-bit indicates that more DBD packets will follow (this one is empty), and the MS-bit indicates that Leaf-101 wants to take the controller role for the rest of the adjacency process.


Figure 1-3: OSPF Neighbor Process: Exstart.
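The role selection in EXSTART comes down to a numeric comparison of the two Router-IDs. The sketch below assumes Router-IDs written in dotted-decimal form; `dbd_master` is an illustrative name.

```python
import ipaddress


def dbd_master(rid_a: str, rid_b: str) -> str:
    """Return the Router-ID that controls the DBD exchange (the higher one)."""
    return max(rid_a, rid_b, key=lambda rid: int(ipaddress.IPv4Address(rid)))


if __name__ == "__main__":
    print(dbd_master("192.168.0.101", "192.168.0.11"))
```

The comparison has to be numeric: compared as plain strings, "192.168.0.11" would sort above "192.168.0.101" and the wrong router would be picked.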

Debug 1-2 shows the adjacency process on Leaf-101. Note that ddbits 0x7 is 0000 0111 in binary, where the three rightmost bits are the I, M, and MS bits.

    Sending DBD to 192.168.0.11 on Ethernet1/1
    Sent DBD with 0 entries to 192.168.0.11 on Ethernet1/1
      mtu 1500, opts: 0x42, ddbits: 0x7, seq: 0x5aec6dfa
      Got DBD from 192.168.0.11 with 2 entries
        seqnr 0x5aec6dfa, dbdbits 0x2, mtu 1500, options 0x42
          We are MASTER, 192.168.0.11 is slave
        Nbr 192.168.0.11 FSM start: old state EXSTART, event NEGDONE
            Preparing DBD exchange for nbr 192.168.0.11, 135/5
Debug 1-2: Debug OSPF Adjacency detail on Leaf-101 - ExStart.
Capture 1-4 shows the DBD packet sent by Leaf-101.

Internet Protocol Version 4, Src: 192.168.0.101, Dst: 224.0.0.5
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: DB Description (2)
        Packet Length: 32
        Source OSPF Router: 192.168.0.101
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x2c06 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    OSPF DB Description
        Interface MTU: 1500
        Options: 0x42, O, (E) External Routing
        DB Description: 0x07, (I) Init, (M) More, (MS) Master
            .... 0... = (R) OOBResync: Not set
            .... .1.. = (I) Init: Set
            .... ..1. = (M) More: Set
            .... ...1 = (MS) Master: Yes
        DD Sequence: 1525444090
Capture 1-4: The First Database Description (DD) Sent by Leaf-101. 
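The ddbits values that appear in the debugs and captures can be decoded with a few bit masks. This is a minimal sketch; the bit positions follow the flag breakdown shown in capture 1-4, and `decode_ddbits` is an illustrative name.

```python
# Bit positions as shown in capture 1-4 (R, I, M, MS from left to right).
DD_FLAGS = {0x08: "R", 0x04: "I", 0x02: "M", 0x01: "MS"}


def decode_ddbits(value: int) -> list[str]:
    """Return the names of the flags set in a DBD flags byte."""
    return [name for bit, name in DD_FLAGS.items() if value & bit]


if __name__ == "__main__":
    for ddbits in (0x7, 0x2, 0x3, 0x1, 0x0):
        print(hex(ddbits), decode_ddbits(ddbits))
    # The decimal DD Sequence in the capture is the hex value in the debug.
    print(hex(1525444090))
```

The last line confirms that the DD Sequence 1525444090 in capture 1-4 is the same number as seq 0x5aec6dfa in debug 1-2.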

Phase 2b.   Spine-11 accepts that Leaf-101 controls the OSPF neighbor process from now on. It sends its DBD (capture 1-5) to Leaf-101, where the I-bit and MS-bit are cleared and two Type-1 LSAs (Router-LSA) are listed. The sequence number of the DBD is the same as the one Leaf-101 used in its first DBD message. Debug 1-3 shows the OSPF adjacency process on Leaf-101 when it receives the DBD from Spine-11. The FSM state changes from EXSTART to EXCHANGE, meaning that the OSPF LSDB synchronization process has now started. Leaf-101 has neither of these LSAs in its LSDB, so it adds both to the LS Request list.

Nbr 192.168.0.11 FSM state changed from EXSTART to EXCHANGE, event NEGDONE
Nbr 192.168.0.11: EXSTART --> EXCHANGE, event NEGDONE
Got DBD from 192.168.0.11 with 2 entries
seqnr 0x5aec6dfa, dbdbits 0x2, mtu 1500, options 0x42
Found 192.168.0.11(0x1)192.168.0.11 (0x80000003) (0xc1ad) (65) in DBD
Added 192.168.0.11(0x1)192.168.0.11 (0x80000003) (0xc1ad) (65)(D) to request list
Found 192.168.0.102(0x1)192.168.0.102 (0x80000004) (0xff13) (65) in DBD
Added 192.168.0.102(0x1)192.168.0.102 (0x80000004) (0xff13) (65)(D) to request list
Added 2 out of 2 LSAs to request list
Debug 1-3: Debug OSPF Adjacency Detail on Leaf-101 - ExStart.

Internet Protocol Version 4, Src: 192.168.0.11, Dst: 224.0.0.5
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: DB Description (2)
        Packet Length: 72
        Source OSPF Router: 192.168.0.11
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x6316 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    OSPF DB Description
        Interface MTU: 1500
        Options: 0x42, O, (E) External Routing
        DB Description: 0x02, (M) More
            .... 0... = (R) OOBResync: Not set
            .... .0.. = (I) Init: Not set
            .... ..1. = (M) More: Set
            .... ...0 = (MS) Master: No
        DD Sequence: 1525444090
    LSA-type 1 (Router-LSA), len 36
        .000 0000 0100 0001 = LS Age (seconds): 65
        0... .... .... .... = Do Not Age Flag: 0
        Options: 0x02, (E) External Routing
        LS Type: Router-LSA (1)
        Link State ID: 192.168.0.11
        Advertising Router: 192.168.0.11
        Sequence Number: 0x80000003
        Checksum: 0xc1ad
        Length: 36
    LSA-type 1 (Router-LSA), len 48
        .000 0000 0100 0001 = LS Age (seconds): 65
        0... .... .... .... = Do Not Age Flag: 0
        Options: 0x02, (E) External Routing
        LS Type: Router-LSA (1)
        Link State ID: 192.168.0.102
        Advertising Router: 192.168.0.102
        Sequence Number: 0x80000004
        Checksum: 0xff13
        Length: 48
Capture 1-5: Database Description (DD) Sent by Spine-11.
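The way Leaf-101 turns the received DBD into an LS Request list can be sketched as follows. This is a simplified model: it compares only the LS sequence number (treated as a signed 32-bit value) and ignores the checksum and age tie-breakers. The data layout and the name `build_request_list` are assumptions made for illustration.

```python
LsaKey = tuple[int, str, str]  # (LS type, Link State ID, Advertising Router)


def to_signed32(value: int) -> int:
    """Interpret an LS sequence number as a signed 32-bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def build_request_list(lsdb: dict[LsaKey, int],
                       dbd_headers: list[tuple[LsaKey, int]]) -> list[LsaKey]:
    """Return the LSAs that are missing locally or older than the advertised copy."""
    requests = []
    for key, seq in dbd_headers:
        have = lsdb.get(key)
        if have is None or to_signed32(seq) > to_signed32(have):
            requests.append(key)
    return requests


if __name__ == "__main__":
    # Leaf-101 knows only its own Router-LSA at this point.
    lsdb = {(1, "192.168.0.101", "192.168.0.101"): 0x80000003}
    # LSA headers listed in capture 1-5.
    dbd = [((1, "192.168.0.11", "192.168.0.11"), 0x80000003),
           ((1, "192.168.0.102", "192.168.0.102"), 0x80000004)]
    for key in build_request_list(lsdb, dbd):
        print(key)
```

Both headers from capture 1-5 end up on the request list, which matches the "Added 2 out of 2 LSAs to request list" line in debug 1-3.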



Phase 3.      L-101 sends a Link State Request to S-11 (3a), where it requests the LSAs originated by 192.168.0.11 and 192.168.0.102 (debug 1-4, capture 1-6). It also sends the Database Description of its own LSDB (debug 1-5, capture 1-7). This DBD now includes an LSA description, while the first DBD was used only for Master/Slave selection and sequence number generation. S-11 replies to the LS Request by sending an LS Update packet, in which it describes the requested LSAs in detail (debug 1-6, capture 1-8).

Figure 1-4: OSPF Neighbor Process: Exstart.

Debug 1-4 shows how L-101 builds an LS Request based on the DBD packet previously received from S-11.

Building LS Request packet to 192.168.0.11
        Add 192.168.0.11(0x1)192.168.0.11 (0x80000003) (0xc1ad) (65)(D) to LSR
        Add 192.168.0.102(0x1)192.168.0.102 (0x80000004) (0xff13) (65)(D) to LSR
Built LS Request packet for 192.168.0.11 with 2 entries
Debug 1-4: Debug OSPF Adjacency Detail on Leaf-101 – Link State Request.


Debug 1-5 shows the process of how L-101 generates its DB Description.

    Sending DBD to 192.168.0.11 on Ethernet1/1
            Add 192.168.0.101(0x1)192.168.0.101 (0x80000003) (0x5d67) (1300)(O) to DBD
          Filled DBD to 192.168.0.11 with 1 entries
    Sent DBD with 1 entries to 192.168.0.11 on Ethernet1/1
      mtu 1500, opts: 0x42, ddbits: 0x3, seq: 0x5aec6dfb
      Got DBD from 192.168.0.11 with 0 entries
       seqnr 0x5aec6dfb, dbdbits 0, mtu 1500, options 0x42
      Got DBD from 192.168.0.11 with 0 entries
        seqnr 0x5aec6dfb, dbdbits 0, mtu 1500, options 0x42
    Sending DBD to 192.168.0.11 on Ethernet1/1
          Filled DBD to 192.168.0.11 with 0 entries
    Sent DBD with 0 entries to 192.168.0.11 on Ethernet1/1
      mtu 1500, opts: 0x42, ddbits: 0x1, seq: 0x5aec6dfc
Debug 1-5: Debug OSPF Adjacency Detail on Leaf-101 – DB Description.




Debug 1-6 shows how L-101 receives the LS Request from S-11.

   Recv LSR from Nbr 192.168.0.11
      Got DBD from 192.168.0.11 with 0 entries
        seqnr 0x5aec6dfc, dbdbits 0, mtu 1500, options 0x42
      Got DBD from 192.168.0.11 with 0 entries
        seqnr 0x5aec6dfc, dbdbits 0, mtu 1500, options 0x42
        Nbr 192.168.0.11 FSM start: old state EXCHANGE, event EXCHDONE
        Nbr 192.168.0.11 FSM state changed from EXCHANGE to FULL, event EXCHDONE
Debug 1-6: Debug OSPF Adjacency Detail on Leaf-101 – Receiving LSR from S-11.

Debug 1-7 illustrates the process of how L-101 answers the LS request sent by S-11 as well as how the OSPF adjacency is now completed (Exchange to Full).

   Nbr 192.168.0.11: EXCHANGE --> FULL, event EXCHDONE
      Answering LSR from 192.168.0.11
        1 requests in LSR (1 left)
    Building reply LSU to 192.168.0.11
         Found requested LSA 192.168.0.101(1)192.168.0.101 for 192.168.0.11
          Added 192.168.0.101(0x1)192.168.0.101 (0x80000003) (0x5d67) (1300)(O)
   Built reply LSU with 1 LSAs for 192.168.0.11 84 bytes
        Nbr 192.168.0.11 FSM start: old state FULL, event HELLORCVD
        Nbr 192.168.0.11: FULL --> FULL, event HELLORCVD
        Nbr 192.168.0.11 FSM start: old state FULL, event TWOWAYRCVD
        Nbr 192.168.0.11: FULL --> FULL, event TWOWAYRCVD
Debug 1-7: Debug OSPF Adjacency detail on Leaf-101 – LSR and LSU process.

Capture 1-6 shows the LS Request sent by L-101, where it asks for a full description of the links connected to S-11 (192.168.0.11) and L-102 (192.168.0.102).

Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: LS Request (3)
        Packet Length: 48
        Source OSPF Router: 192.168.0.101
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x3938 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    Link State Request
        LS Type: Router-LSA (1)
        Link State ID: 192.168.0.11
        Advertising Router: 192.168.0.11
    Link State Request
        LS Type: Router-LSA (1)
        Link State ID: 192.168.0.102
        Advertising Router: 192.168.0.102
Capture 1-6: Link-State Request from L-101 to S-11. 
Capture 1-7 shows the Database Description message sent by L-101 to S-11. The DBD message carries only the header of the Router-LSA originated by the sending router, not the link information that the LSA describes.
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: DB Description (2)
        Packet Length: 52
        Source OSPF Router: 192.168.0.101
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0xc535 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    OSPF DB Description
    LSA-type 1 (Router-LSA), len 36
        .000 0101 0001 0100 = LS Age (seconds): 1300
        0... .... .... .... = Do Not Age Flag: 0
        Options: 0x02, (E) External Routing
        LS Type: Router-LSA (1)
        Link State ID: 192.168.0.101
        Advertising Router: 192.168.0.101
        Sequence Number: 0x80000003
        Checksum: 0x5d67
        Length: 36
Capture 1-7: DB Description from L-101 to S-11. 
Capture 1-8 illustrates the LS Update sent by S-11 as a reply to the LS Request from L-101. The LS Update carries detailed information about the links, and their types, of both S-11 (the first Router-LSA) and L-102 (the second Router-LSA).
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: LS Update (4)
        Packet Length: 112
        Source OSPF Router: 192.168.0.11
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x0c82 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    LS Update Packet
        Number of LSAs: 2
        LSA-type 1 (Router-LSA), len 36
            .000 0000 0100 0010 = LS Age (seconds): 66
            0... .... .... .... = Do Not Age Flag: 0
            Options: 0x02, (E) External Routing
            LS Type: Router-LSA (1)
            Link State ID: 192.168.0.11
            Advertising Router: 192.168.0.11
            Sequence Number: 0x80000003
            Checksum: 0xc1ad
            Length: 36
            Flags: 0x00
            Number of Links: 1
            Type: PTP      ID: 192.168.0.102   Data: 0.0.0.3         Metric: 40
                Link ID: 192.168.0.102 - Neighboring router's Router ID
                Link Data: 0.0.0.3
                Link Type: 1 - Point-to-point connection to another router
                Number of Metrics: 0 - TOS
                0 Metric: 40
        LSA-type 1 (Router-LSA), len 48
            .000 0000 0100 0010 = LS Age (seconds): 66
            0... .... .... .... = Do Not Age Flag: 0
            Options: 0x02, (E) External Routing
            LS Type: Router-LSA (1)
            Link State ID: 192.168.0.102
            Advertising Router: 192.168.0.102
            Sequence Number: 0x80000004
            Checksum: 0xff13
            Length: 48
            Flags: 0x00
            Number of Links: 2
            Type: Stub     ID: 192.168.31.102  Data: 255.255.255.255 Metric: 1
                Link ID: 192.168.31.102 - IP network/subnet number
                Link Data: 255.255.255.255
                Link Type: 3 - Connection to a stub network
                Number of Metrics: 0 - TOS
                0 Metric: 1
            Type: PTP      ID: 192.168.0.11    Data: 0.0.0.3         Metric: 40
                Link ID: 192.168.0.11 - Neighboring router's Router ID
                Link Data: 0.0.0.3
                Link Type: 1 - Point-to-point connection to another router
                Number of Metrics: 0 - TOS
                0 Metric: 40
Capture 1-8: Link-State Update from S-11 to L-101.
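The Packet Length and LSA Length values in the captures above follow directly from the fixed OSPFv2 field sizes. The sketch below recomputes them; the function names are illustrative. The final line assumes that the "84 bytes" reported in debug 1-7 include a 20-byte IPv4 header, which is an interpretation and not something the debug output states.

```python
OSPF_HEADER = 24   # common OSPFv2 packet header
LSA_HEADER = 20    # common LSA header


def hello_len(neighbors: int) -> int:
    # mask(4) + interval(2) + options(1) + priority(1) + dead(4) + DR(4) + BDR(4)
    return OSPF_HEADER + 20 + 4 * neighbors


def dbd_len(lsa_headers: int) -> int:
    # MTU(2) + options(1) + flags(1) + sequence(4), then one header per LSA
    return OSPF_HEADER + 8 + LSA_HEADER * lsa_headers


def lsr_len(requests: int) -> int:
    # each request: LS type(4) + Link State ID(4) + Advertising Router(4)
    return OSPF_HEADER + 12 * requests


def router_lsa_len(links: int) -> int:
    # flags/number of links(4), then 12 bytes per link description
    return LSA_HEADER + 4 + 12 * links


def lsu_len(lsa_lengths: list[int]) -> int:
    # number of LSAs(4), then the LSAs themselves
    return OSPF_HEADER + 4 + sum(lsa_lengths)


if __name__ == "__main__":
    print(hello_len(0), hello_len(1))               # captures 1-1/1-2 and 1-3
    print(dbd_len(0), dbd_len(2), dbd_len(1))       # captures 1-4, 1-5, 1-7
    print(lsr_len(2))                               # capture 1-6
    print(lsu_len([router_lsa_len(1), router_lsa_len(2)]))  # capture 1-8
    print(lsu_len([router_lsa_len(1)]) + 20)        # debug 1-7, IPv4 header assumed
```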

At this stage, L-101 is fully adjacent to S-11, just as L-102 is.

Figure 1-5: OSPF Neighbor Process: Full Adjacency. 


Example 1-1 shows that the OSPF LSDB of L-101 now contains the Link-State information about all links connected to both S-11 and L-102. Example 1-2 shows the OSPF LSDB of L-102 and example 1-3 shows the OSPF LSDB of S-11.

L-101# sh ip ospf database detail
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

   LS age: 1393
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.11
   Advertising Router: 192.168.0.11
   LS Seq Number: 0x80000007
   Checksum: 0x4220
   Length: 48
    Number of links: 2

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.102
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.101
     (Link Data) Router Interface address: 0.0.0.4
       Number of TOS metrics: 0
         TOS   0 Metric: 40

   LS age: 1391
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.101
   Advertising Router: 192.168.0.101
   LS Seq Number: 0x80000008
   Checksum: 0xfd14
   Length: 48
    Number of links: 2

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.101
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

   LS age: 1399
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.102
   Advertising Router: 192.168.0.102
   LS Seq Number: 0x80000009
   Checksum: 0xf518
   Length: 48
    Number of links: 2

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.102
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40
Example 1-1: Link State Database on Leaf-101.

L-102# sh ip ospf data detail
        OSPF Router with ID (192.168.0.102) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

   LS age: 1677
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.11
   Advertising Router: 192.168.0.11
   LS Seq Number: 0x80000007
   Checksum: 0x4220
   Length: 48
    Number of links: 2

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.102
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.101
     (Link Data) Router Interface address: 0.0.0.4
       Number of TOS metrics: 0
         TOS   0 Metric: 40

   LS age: 1677
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.101
   Advertising Router: 192.168.0.101
   LS Seq Number: 0x80000008
   Checksum: 0xfd14
   Length: 48
    Number of links: 2

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.101
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

   LS age: 1681
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.102
   Advertising Router: 192.168.0.102
   LS Seq Number: 0x80000009
   Checksum: 0xf518
   Length: 48
    Number of links: 2

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.102
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40
Example 1-2: Link State Database on Leaf-102.

S-11# sh ip ospf database detail
        OSPF Router with ID (192.168.0.11) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

   LS age: 1524
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.11
   Advertising Router: 192.168.0.11
   LS Seq Number: 0x80000007
   Checksum: 0x4220
   Length: 48
    Number of links: 2

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.102
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.101
     (Link Data) Router Interface address: 0.0.0.4
       Number of TOS metrics: 0
         TOS   0 Metric: 40

   LS age: 1524
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.101
   Advertising Router: 192.168.0.101
   LS Seq Number: 0x80000008
   Checksum: 0xfd14
   Length: 48
    Number of links: 2

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.101
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

   LS age: 1529
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.102
   Advertising Router: 192.168.0.102
   LS Seq Number: 0x80000009
   Checksum: 0xf518
   Length: 48
    Number of links: 2

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.102
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40
Example 1-3: Link State Database on S-11. 


Shortest-Path First (SPF)/Dijkstra Algorithm 

The Dijkstra/SPF algorithm is used for calculating a Shortest-Path Tree (SPT) topology within an OSPF area. A router starts the process by setting itself as the root of the tree. In the first stage, the router builds a Shortest-Path Tree between routers by using the Link Type-1 (point-to-point) Link Descriptions, which describe the links to neighbor routers in the Router LSA. When the Shortest-Path Tree is formed, the router calculates the distance to the subnets connected to each router by using the Link Type-3 (Stub) Link Descriptions in the Router LSA.
Routers maintain two lists related to the SPT calculation. The Candidate List (also known as the Tentative List) includes all routers that are currently being examined by the router. The Tree List (also called the Path or Known List) includes all routers that are part of the final Shortest-Path Tree. In addition, the Link State Database (LSDB) is the source from which the information for the calculation is pulled.
The next section describes the SPT calculation process from the perspective of L-101.



Figure 1-6 shows the initial situation where L-101 starts the Shortest-Path Tree calculation. L-101 inserts itself into the Candidate list with cost 0 and with the next-hop pointing to itself. S-11, S-12, and L-102 are in the Unknown-list at this phase. The Path-list is empty at the initial stage.
Figure 1-6: Shortest Path Calculation-1st Iteration round.

L-101# sh ip ospf data
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
192.168.0.11    192.168.0.11    235        0x80000006 0x441f   2
192.168.0.12    192.168.0.12    2          0x80000006 0x342d   2
192.168.0.101   192.168.0.101   224        0x80000006 0x3036   3
192.168.0.102   192.168.0.102   3          0x80000006 0x2a39   3
Example 1-4: Link State Database on L-101.


Figure 1-7 shows the first SPF iteration round. L-101 inserts itself into the Path List. L-101 examines its self-originated Router LSA, starting from the first Link Description (LD) found in the LSA. The first LD is Link Type-3 (Stub), so it is skipped for now and used only after the shortest paths have been calculated. The next entry describes the link to S-12 (Link Type-1). L-101 moves S-12 into the Candidate list with cost 40. The last LD describes the link to S-11. L-101 also moves S-11 into the Candidate List with cost 40. L-102 is still in the Unknown-list.

Figure 1-7: Shortest Path calculation-The First Iteration round.

Example 1-5 shows the detailed LSDB of L-101. The first Link Description (LD) describes the interface Loopback 30, which will later be used as the NVE (Network Virtualization Edge) interface IP address. It is excluded from the SPT calculation process because of its type, “Stub network”. Remember that in the first phase we are forming the Shortest Path Tree, and only when that is done can we calculate the best path to each destination.

 L-101# sh ip ospf database 192.168.0.101 detail
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)
 
                Router Link States (Area 0.0.0.0)

   LS age: 301
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.101
   Advertising Router: 192.168.0.101
   LS Seq Number: 0x80000005
   Checksum: 0x3235
   Length: 60
    Number of links: 3

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.101
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.12
     (Link Data) Router Interface address: 0.0.0.2
       Number of TOS metrics: 0
         TOS   0 Metric: 40

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40
Example 1-5: Detailed Link State Database on L-101.

L-101 moves both S-11 and S-12 to the Path List due to their equal metric. Both have two point-to-point links in their Router LSAs. The link back to L-101 is ignored because L-101 already has a better cost for itself (connected). The links to L-102 have an equal cost on both switches, so L-101 adds L-102 into the Candidate List with the next-hop set to both S-11 and S-12.
Figure 1-8: Shortest Path Calculation-Second Iteration round.

L-101# sh ip ospf data 192.168.0.11 detail
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

   LS age: 354
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.11
   Advertising Router: 192.168.0.11
   LS Seq Number: 0x80000007
   Checksum: 0x4220
   Length: 48
    Number of links: 2

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.102
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.101
     (Link Data) Router Interface address: 0.0.0.4
       Number of TOS metrics: 0
         TOS   0 Metric: 40
Example 1-6: Router Links of S-11 installed into LSDB of L-101. 

L-101# sh ip ospf data 192.168.0.12 detail
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

   LS age: 135
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.12
   Advertising Router: 192.168.0.12
   LS Seq Number: 0x80000007
   Checksum: 0x322e
   Length: 48
    Number of links: 2

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.102
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.101
     (Link Data) Router Interface address: 0.0.0.4
       Number of TOS metrics: 0
         TOS   0 Metric: 40
Example 1-6: Router Links of S-12 installed into LSDB of L-101.
 Third iteration round

L-101 moves L-102 to the Path List. The Router LSA of L-102 contains two point-to-point links, one to S-11 and the other to S-12. L-101 already has paths to these routers, so both links are ignored. The Stub link pointing to L-102 Loopback 30 is also ignored at this phase. Now the Shortest Path Tree is ready.

Figure 1-9: Shortest Path calculation-Third Iteration round.




L-101# sh ip ospf data 192.168.0.102 detail
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

   LS age: 1107
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.102
   Advertising Router: 192.168.0.102
   LS Seq Number: 0x80000007
   Checksum: 0x283a
   Length: 60
    Number of links: 3

     Link connected to: a Stub Network
      (Link ID) Network/Subnet Number: 192.168.31.102
      (Link Data) Network Mask: 255.255.255.255
       Number of TOS metrics: 0
         TOS   0 Metric: 1

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.12
     (Link Data) Router Interface address: 0.0.0.2
       Number of TOS metrics: 0
         TOS   0 Metric: 40

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.11
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 40
Example 1-7: Router Links of L-102 installed into LSDB of L-101. 
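
The three iteration rounds above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how NX-OS implements SPF: the function name spf and the dictionary layout are hypothetical, and the link costs (40 per link) are taken from Examples 1-5 to 1-7.

```python
import heapq

# Point-to-point links from the Router LSAs in Examples 1-5 to 1-7 (cost 40 each).
LSDB = {
    "L-101": {"S-11": 40, "S-12": 40},
    "L-102": {"S-11": 40, "S-12": 40},
    "S-11": {"L-101": 40, "L-102": 40},
    "S-12": {"L-101": 40, "L-102": 40},
}


def spf(lsdb, root):
    """Return {router: (cost, set of first-hop routers)} as seen from root."""
    path_list = {}                 # routers whose shortest path is final
    candidates = [(0, root)]       # Candidate List as a heap of (cost, router)
    best = {root: (0, set())}      # tentative cost and next hops per router
    while candidates:
        cost, node = heapq.heappop(candidates)
        if node in path_list or cost > best[node][0]:
            continue               # stale Candidate List entry
        path_list[node] = best[node]
        for neighbor, link_cost in lsdb[node].items():
            if neighbor in path_list:
                continue           # e.g. the link pointing back to the root
            new_cost = cost + link_cost
            # The first hop is the neighbor itself when node is the root.
            hops = {neighbor} if node == root else set(best[node][1])
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hops)
                heapq.heappush(candidates, (new_cost, neighbor))
            elif new_cost == best[neighbor][0]:
                best[neighbor][1].update(hops)   # equal-cost path: merge next hops
    return path_list


spt = spf(LSDB, "L-101")
# spt["L-102"] holds cost 80 with both S-11 and S-12 as next hops.
```

The equal-cost branch is what leaves L-102 with two next hops, matching the second and third iteration rounds described above.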

SPF Run – Phase II: Adding Leafs to Shortest-Path Tree

In the first phase, L-101 forms a Shortest-Path Tree (SPT) by using the Dijkstra/SPF algorithm. In the second phase, the Stub Networks, which are used as NVE interface IP addresses, are added into the SPT. Once again L-101 starts by examining its self-originated Router LSA. It has a Link Description about Stub Network 192.168.31.101/32 in its LSDB, which is moved into the RIB as directly connected. S-11 and S-12 do not have any Stub Networks in their Router LSAs. L-102, however, has the Stub Network 192.168.31.102/32. The SPT between L-101 and L-102 includes two equal-cost paths via S-11 and S-12, so the network is inserted into the RIB with two next-hop addresses. This means that traffic from L-101 to destination 192.168.31.102 gets flow-based load-balancing forwarding behavior.

Figure 1-10: Shortest Path calculation-Second Phase: Stub-Networks.
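
The second phase can be sketched as follows. The phase I result is hard-coded from the iteration rounds above, and the stub metrics (1) come from Examples 1-5 and 1-7. The function name add_stub_networks is hypothetical, and the resulting cost 81 is derived from those metrics rather than quoted from a show output.

```python
# Result of phase I on L-101: router -> (cost, next hops).
SPT = {
    "L-101": (0, set()),
    "S-11": (40, {"S-11"}),
    "S-12": (40, {"S-12"}),
    "L-102": (80, {"S-11", "S-12"}),
}

# Stub links found in the Router LSAs: router -> [(prefix, metric)].
STUBS = {
    "L-101": [("192.168.31.101/32", 1)],
    "L-102": [("192.168.31.102/32", 1)],
}


def add_stub_networks(spt, stubs, root):
    """Attach each stub network to its router in the SPT and return RIB entries."""
    rib = {}
    for router, (cost, next_hops) in spt.items():
        for prefix, metric in stubs.get(router, []):
            if router == root:
                rib[prefix] = ("connected", set())
            else:
                # Cost to the advertising router plus the advertised stub metric.
                rib[prefix] = (cost + metric, set(next_hops))
    return rib


rib = add_stub_networks(SPT, STUBS, "L-101")
# 192.168.31.102/32 is reachable with cost 80 + 1 = 81 via both S-11 and S-12.
```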

Note! You may wonder how flow-based load-balancing works in a VXLAN fabric, because the source and destination IP addresses of VXLAN-encapsulated packets between two VTEPs are always the same (depending on the traffic direction). VXLAN uses UDP as its transport-layer protocol. UDP is connectionless by nature, so the source port in the outer UDP header is not needed for matching return traffic the way it is with the reliable transport protocol TCP. The sending VTEP is therefore free to vary the outer UDP source port per flow. RFC 7348 recommends calculating it from a hash of the inner packet headers, so different flows between the same two VTEPs get different source ports and can be placed on different equal-cost paths.
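
A minimal sketch of this idea, assuming a CRC32 hash stands in for whatever hash the hardware really uses. RFC 7348 recommends deriving the outer UDP source port from the inner packet headers and keeping it in the range 49152-65535. All function names and the inner flow tuple are hypothetical, while the VTEP addresses are the Loopback 30 addresses of L-101 and L-102.

```python
import zlib

NEXT_HOPS = ["S-11", "S-12"]   # equal-cost next hops from L-101 towards L-102
VXLAN_PORT = 4789              # IANA-assigned VXLAN destination port


def vxlan_source_port(inner_flow):
    """Derive the outer UDP source port from a hash of the inner flow."""
    digest = zlib.crc32(repr(inner_flow).encode())
    return 49152 + digest % 16384          # stays within 49152-65535


def pick_next_hop(outer_five_tuple):
    """Flow-based load balancing: hash the outer 5-tuple onto a next hop."""
    digest = zlib.crc32(repr(outer_five_tuple).encode())
    return NEXT_HOPS[digest % len(NEXT_HOPS)]


def forward(inner_flow, src_vtep="192.168.31.101", dst_vtep="192.168.31.102"):
    sport = vxlan_source_port(inner_flow)
    outer = (src_vtep, dst_vtep, "udp", sport, VXLAN_PORT)
    return sport, pick_next_hop(outer)


# Hypothetical inner flow: (source IP, destination IP, protocol, sport, dport).
example_flow = ("10.0.0.1", "10.0.0.2", "tcp", 33000, 443)
sport, next_hop = forward(example_flow)
```

Because the outer source port depends only on the inner flow, every packet of one flow follows the same path, while other flows between the same two VTEPs may hash to the other Spine.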

Convergence


Whenever there is a link state change in an OSPF Area, it triggers an LSA flooding process and an SPF calculation. In figure 1-11, the link between L-202 and S-22 fails. Both S-22 and L-202 react to this event by running a full SPF calculation. They both also send an LS Update out of each OSPF interface using the AllSPFRouters multicast address 224.0.0.5 as the destination IP address. When adjacent OSPF speakers receive the LS Updates, they run SPF and flood the LSAs further out of their OSPF interfaces. This process goes on like a wave through the whole OSPF area.




Figure 1-11: Effect of Single Link Failure.

We can verify this from L-101. Example 1-8 shows the LSDB of L-101 before the link failure. The highlighted rows show the Router Link information of S-22 (Age 748 seconds) and L-202 (Age 749 seconds). The sequence number of S-22 is 0x80000007 and the link count is four. The sequence number of L-202 is 0x80000006 and the link count is three.

L-101# sh ip ospf database
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
192.168.0.1     192.168.0.1     1547       0x80000004 0xee80   3
192.168.0.2     192.168.0.2     1546       0x80000004 0xde8e   3
192.168.0.11    192.168.0.11    1540       0x80000004 0x481d   2
192.168.0.12    192.168.0.12    1547       0x80000006 0xd348   4
192.168.0.21    192.168.0.21    1553       0x80000006 0xa29e   4
192.168.0.22    192.168.0.22    748        0x80000007 0x90ad   4
192.168.0.101   192.168.0.101   1544       0x80000004 0x3434   3
192.168.0.102   192.168.0.102   1544       0x80000004 0x2e37   3
192.168.0.201   192.168.0.201   1551       0x80000004 0x1511   3
192.168.0.202   192.168.0.202   749        0x80000006 0x0b16   3
Example 1-8: OSPF LSDB of L-101 before link failure between S-22 and L-202.
 Example 1-9 illustrates that so far SPF calculation has run 125 times.
 L-101# sh ip ospf | sec Area
   Area BACKBONE(0.0.0.0)
        Area has existed for 04:54:33
        Interfaces in this area: 3 Active interfaces: 3
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 125 times
         Last SPF ran for 0.010751s
        Area ranges are
        Number of LSAs: 10, checksum sum 0x49e50
Example 1-9: SPF calculation count on L-101 before link failure between S-22 and L-202.
Example 1-10 shows how the link failure can be seen in the LSDB of L-101. The age of both Router Link entries is now 55 seconds, the sequence numbers have increased by one, and the link counts have decreased by one.
 L-101# sh ip ospf database
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
192.168.0.1     192.168.0.1     1643       0x80000004 0xee80   3
192.168.0.2     192.168.0.2     1643       0x80000004 0xde8e   3
192.168.0.11    192.168.0.11    1636       0x80000004 0x481d   2
192.168.0.12    192.168.0.12    1644       0x80000006 0xd348   4
192.168.0.21    192.168.0.21    1650       0x80000006 0xa29e   4
192.168.0.22    192.168.0.22    55         0x80000008 0x604a   3
192.168.0.101   192.168.0.101   1640       0x80000004 0x3434   3
192.168.0.102   192.168.0.102   1641       0x80000004 0x2e37   3
192.168.0.201   192.168.0.201   1647       0x80000004 0x1511   3
192.168.0.202   192.168.0.202   55         0x80000007 0x8552   2
Example 1-10: OSPF LSDB of L-101 after link failure between S-22 and L-202.
We can also see that the SPF calculation has now run 127 times.
 L-101# sh ip ospf | sec Area
   Area BACKBONE(0.0.0.0)
        Area has existed for 04:56:14
        Interfaces in this area: 3 Active interfaces: 3
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 127 times
         Last SPF ran for 0.001635s
        Area ranges are
        Number of LSAs: 10, checksum sum 0x4e829
Example 1-11: SPF calculation count on L-101 after link failure between S-22 and L-202. 
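
The comparison between Examples 1-8 and 1-10 can also be done programmatically. The sketch below is only an illustration: the data is copied by hand from the two examples for three of the routers, and the helper names to_signed and changed_lsas are hypothetical. OSPF sequence numbers are signed 32-bit integers, which is why they are converted before being compared.

```python
# Router LSAs as (sequence number, link count), copied from Examples 1-8 and 1-10.
BEFORE = {
    "192.168.0.21": (0x80000006, 4),
    "192.168.0.22": (0x80000007, 4),
    "192.168.0.202": (0x80000006, 3),
}
AFTER = {
    "192.168.0.21": (0x80000006, 4),
    "192.168.0.22": (0x80000008, 3),
    "192.168.0.202": (0x80000007, 2),
}


def to_signed(seq):
    """Interpret a 32-bit LS sequence number as a signed integer."""
    return seq - (1 << 32) if seq & 0x80000000 else seq


def changed_lsas(before, after):
    """Return {link_id: (old link count, new link count)} for re-originated LSAs."""
    changes = {}
    for link_id, (seq, links) in after.items():
        old_seq, old_links = before.get(link_id, (None, None))
        if old_seq is None or to_signed(seq) > to_signed(old_seq):
            changes[link_id] = (old_links, links)
    return changes


changes = changed_lsas(BEFORE, AFTER)
# Only the Router LSAs of S-22 and L-202 have a higher sequence number.
```
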
Capture 1-9 shows the Link State Update sent by S-12. If we compare the OSPF flooding process to the traditional Ethernet L2 BUM traffic flooding process, there is (at least) one significant difference. Technically, an OSPF speaker does not flood the actual received LS Update packet. An LS Update is sent to the non-routable multicast address 224.0.0.5 with TTL set to 1, so the packet is only targeted to OSPF speakers in the same network segment. Each receiving router builds a new LS Update that carries the LSA further. We can see from the capture below that the sending OSPF router is S-12 (192.168.0.12) while the Advertising Router is L-202 (192.168.0.202).

Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: LS Update (4)
        Packet Length: 76
        Source OSPF Router: 192.168.0.12
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0x0c27 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    LS Update Packet
        Number of LSAs: 1
        LSA-type 1 (Router-LSA), len 48
            .000 0000 0000 0100 = LS Age (seconds): 4
            0... .... .... .... = Do Not Age Flag: 0
            Options: 0x02, (E) External Routing
            LS Type: Router-LSA (1)
            Link State ID: 192.168.0.202
            Advertising Router: 192.168.0.202
            Sequence Number: 0x80000007
            Checksum: 0x8552
            Length: 48
            Flags: 0x00
            Number of Links: 2
            Type: Stub     ID: 192.168.32.202  Data: 255.255.255.255 Metric: 1
            Type: PTP      ID: 192.168.0.21    Data: 0.0.0.3         Metric: 40
Capture 1-9: LS Update sent by S-12 captured on L-101. 
When L-101 receives the LS Update from S-12, it sends an LS Acknowledge message to S-12. This is how OSPF makes the LS Update process reliable. The same LS Update process also happens between S-11 and L-101, but it is not described here.
Open Shortest Path First
    OSPF Header
        Version: 2
        Message Type: LS Acknowledge (5)
        Packet Length: 44
        Source OSPF Router: 192.168.0.101
        Area ID: 0.0.0.0 (Backbone)
        Checksum: 0xb249 [correct]
        Auth Type: Null (0)
        Auth Data (none): 0000000000000000
    LSA-type 1 (Router-LSA), len 48
        .000 0000 0000 0111 = LS Age (seconds): 7
        0... .... .... .... = Do Not Age Flag: 0
        Options: 0x02, (E) External Routing
        LS Type: Router-LSA (1)
        Link State ID: 192.168.0.202
        Advertising Router: 192.168.0.202
        Sequence Number: 0x80000007
        Checksum: 0x8552
        Length: 48
Capture 1-10: LS Acknowledge sent by L-101 to S-12.
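
The receive side of the flooding process (install, acknowledge, flood) can be sketched as below. This is a heavily simplified model that compares sequence numbers only, while RFC 2328 also considers checksum and LS age. The function name receive_lsa and the dictionary fields are hypothetical. The interface names are those L-101 uses towards S-11 (Eth1/1) and S-12 (Eth1/2) in the show outputs of this chapter.

```python
def to_signed(seq):
    """Interpret a 32-bit LS sequence number as a signed integer."""
    return seq - (1 << 32) if seq & 0x80000000 else seq


def receive_lsa(lsdb, lsa, in_interface, interfaces):
    """Return (installed, ack_interface, flood_interfaces) for a received LSA."""
    key = (lsa["type"], lsa["link_state_id"], lsa["adv_router"])
    current = lsdb.get(key)
    if current is not None:
        if to_signed(lsa["seq"]) == to_signed(current["seq"]):
            return False, in_interface, []   # duplicate: acknowledge only
        if to_signed(lsa["seq"]) < to_signed(current["seq"]):
            return False, None, []           # older copy: ignored in this sketch
    lsdb[key] = lsa                          # a) install into the LSDB
    flood = [i for i in interfaces if i != in_interface]
    return True, in_interface, flood         # b) acknowledge, c) flood onwards


# L-101 holds the old Router LSA of L-202 and receives the new one from S-12.
lsdb = {(1, "192.168.0.202", "192.168.0.202"): {"seq": 0x80000006}}
new_lsa = {"type": 1, "link_state_id": "192.168.0.202",
           "adv_router": "192.168.0.202", "seq": 0x80000007}
first = receive_lsa(lsdb, new_lsa, "Eth1/2", ["Eth1/1", "Eth1/2"])
second = receive_lsa(lsdb, new_lsa, "Eth1/2", ["Eth1/1", "Eth1/2"])
```

The first call installs the LSA, acknowledges it on Eth1/2, and floods it out of Eth1/1. The second call models a duplicate copy, which is acknowledged but not flooded again.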

The single link failure described in the previous section is not the only event that triggers the flooding process. OSPF LSAs have a fixed lifetime of 3600 seconds (one hour) that cannot be changed. However, OSPF routers refresh each self-originated LSA every 1800 seconds (half an hour). Imagine that we have a Datacenter with 1000 switches, each with eight uplinks. If we run single-area OSPF in the Underlay, there is almost continuous LSA flooding. The periodic refresh does not trigger an SPF calculation, but if there is a link failure, each of those 1000 switches will run an SPF calculation. This is one of the major problems with OSPF in a large-scale Datacenter.
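
A rough back-of-the-envelope calculation shows why the flooding is almost continuous. It assumes that each switch originates exactly one Router LSA and that the refresh times are spread evenly over the 1800-second interval. Both are simplifying assumptions, not measurements.

```python
LS_REFRESH_TIME = 1800   # seconds between refreshes of a self-originated LSA
SWITCHES = 1000          # size of the imagined single-area Datacenter
LSAS_PER_SWITCH = 1      # assumption: one Router LSA per switch

total_lsas = SWITCHES * LSAS_PER_SWITCH
refreshes_per_second = total_lsas / LS_REFRESH_TIME
mean_gap_seconds = LS_REFRESH_TIME / total_lsas

# On average a refreshed LSA is flooded through the area every 1.8 seconds.
print(f"{refreshes_per_second:.2f} refreshes/s, one every {mean_gap_seconds} s")
```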

Flood reduction with multiple OSPF Areas

Figure 1-12 shows the OSPF Area design where links between Spine and Leaf switches within Pods belong to regular, non-backbone OSPF areas (Pod 1 in area 0.0.0.1 and Pod 2 in area 0.0.0.2) and links between Spine and Super-Spine switches belong to backbone Area 0.0.0.0. Spine switches in both Pods now become OSPF Area Border Routers (ABR). An ABR does not forward any Router LSA (Type-1) or Network LSA (Type-2) between areas. Note that Type-2 is irrelevant in our DC since we are only using P2P links where there is no DR/BDR election. Instead, an ABR sends Summary LSAs (Type-3) between areas, describing the intra-area networks. In that sense, OSPF routers also advertise routing information (use me as a next-hop for the network x.x.x.x/y), even though OSPF routers inside an area advertise only Link-State information without any kind of routing information.

ABR S-21, as an example, originates Summary LSAs (Type-3) about the networks located in area 0.0.0.2 and floods them into Area 0.0.0.0. A Summary LSA describes a network and S-21's own cost to it without including any link-state information. In this sense, when using multiple OSPF areas, OSPF as a Link-State protocol behaves like a Distance-Vector protocol between areas: routers within one area have no visibility into the structure of another area.

In this area design, the link failure between S-22 and L-202 only has a local effect. The LS Updates caused by the topology change stay within OSPF Area 0.0.0.2 and only switches within Pod-2 run the SPF calculation. Neither the Super-Spines in area 0.0.0.0 nor the Spines in OSPF Area 0.0.0.1 run the SPF algorithm; they just update the cost information in their LSDB and RIB based on the new Summary LSA received from ABR S-22. This way we can reduce flooding and split the network into smaller failure domains.




Figure 1-12: OSPF Area design.

If we now take a look at the OSPF LSDB of SS-1, we can see that all Loopback 30 addresses are now seen as Summary Network LSAs. But why is Link ID 192.168.32.202 still advertised by S-22 even though its link to the next-hop switch L-202 is down?

SS-1# sh ip ospf data
        OSPF Router with ID (192.168.0.1) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
192.168.0.1     192.168.0.1     854        0x80000006 0xd6e7   4
192.168.0.2     192.168.0.2     850        0x80000007 0xc4f6   4
192.168.0.11    192.168.0.11    1087       0x80000005 0x6cc3   2
192.168.0.12    192.168.0.12    1084       0x80000005 0x5cd1   2
192.168.0.21    192.168.0.21    851        0x80000005 0xcb50   2
192.168.0.22    192.168.0.22    846        0x80000004 0xbd5d   2

                Summary Network Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
192.168.31.101  192.168.0.11    1247       0x80000003 0xb619
192.168.31.101  192.168.0.12    1244       0x80000003 0xb01e
192.168.31.102  192.168.0.11    1257       0x80000003 0xac22
192.168.31.102  192.168.0.12    1244       0x80000003 0xa627
192.168.32.201  192.168.0.21    621        0x80000003 0x83dc
192.168.32.201  192.168.0.22    626        0x80000003 0x7de1
192.168.32.202  192.168.0.21    611        0x80000004 0x77e6
192.168.32.202  192.168.0.22    16         0x80000005 0x9279
Example 1-12: LSDB on SS-1 in Area Design.

The reason is simple. LSAs are flooded throughout area 0.0.0.2, so the Router LSA of L-202 that describes 192.168.32.202/32 eventually ends up in the OSPF LSDB of S-22 via the remaining path. This is the normal LSDB synchronization process of a Link-State protocol: all intra-area routers must have an identical OSPF LSDB. The cost from S-22 to 192.168.32.202/32 is now 121, while in the stable situation it was 41, because the cost grows with every link along the path. The originating router L-202 floods the LSA with its own metric 1 out of its OSPF interfaces. S-21 receives the LS Update, installs the LSA into its LSDB, and runs the SPF algorithm, which adds the cost of the link towards L-202 to the advertised metric (in our example network the cost of each link is 40), giving 41. S-21 updates its RIB if necessary and floods the LSA out of its OSPF-enabled interface to L-201. L-201 installs the received LSA into its LSDB, runs the SPF algorithm, and ends up with cost 81 (41 plus its own link cost 40). It updates its RIB if necessary and floods the LSA out of its OSPF interface to S-22. This way the LSA originated by L-202 eventually ends up in the LSDB of S-22, which calculates cost 121. In addition to the intra-area flooding process, both ABRs S-21 and S-22 generate Summary LSAs and send them out of their interfaces attached to the backbone area 0.0.0.0, using their own calculated cost as the metric of the Summary LSA.


Figure 1-13: The Flooding Propagation of Router LSA 192.168.32.202/32.

When SS-1 receives the Summary LSAs from ABRs S-21 and S-22, it installs both LSAs into its OSPF LSDB as they are, without any modification. Example 1-13 shows that the Summary LSA from ABR S-21 has metric 41 while the Summary LSA from ABR S-22 has metric 121.

SS-1# sh ip ospf data sum 192.168.32.202 detail
        OSPF Router with ID (192.168.0.1) (Process ID UNDERLAY-NET VRF default)

                Summary Network Link States (Area 0.0.0.0)

   LS age: 646
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.21
   LS Seq Number: 0x80000004
   Checksum: 0x77e6
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 41

   LS age: 51
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.22
   LS Seq Number: 0x80000005
   Checksum: 0x9279
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 121
Example 1-13: Summary Net on LSDB of SS-1.

The overall metric (received metric plus link cost) to destination 192.168.32.202/32 is best via S-21, so that route is installed into the RIB. The cost installed into the RIB is 81: [cost in received LSA = 41] + [cost to reach the advertising ABR = 40]. If we compare intra-area Router LSAs and inter-area Summary LSAs, we can see that a Router LSA describes the advertising router's links, which are used as a source of information when building the Shortest Path Tree. A Summary LSA, in turn, hides the source of information, and the ABR advertises itself as a next hop for the network described in the LSA.
  
SS-1# sh ip route 192.168.32.202/32 | b 192
192.168.32.202/32, ubest/mbest: 1/0
    *via 192.168.0.21, Eth1/4, [110/81], 00:55:54, ospf-UNDERLAY-NET, inter
Example 1-14: RIB of SS-1.
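
The route selection on SS-1 can be reproduced with a few lines of Python. The metrics come from Example 1-13 and the cost from SS-1 to each ABR is 40. The function name best_inter_area_route is hypothetical and the sketch ignores everything except the metric comparison.

```python
# Summary LSA metrics for 192.168.32.202/32 as received by SS-1 (Example 1-13).
SUMMARY_METRIC = {"192.168.0.21": 41, "192.168.0.22": 121}

# Intra-area cost from SS-1 to each advertising ABR.
COST_TO_ABR = {"192.168.0.21": 40, "192.168.0.22": 40}


def best_inter_area_route(summary_metric, cost_to_abr):
    """Return (total cost, list of ABRs) for the lowest-cost inter-area path."""
    totals = {abr: metric + cost_to_abr[abr]
              for abr, metric in summary_metric.items()}
    best = min(totals.values())
    return best, sorted(abr for abr, total in totals.items() if total == best)


cost, abrs = best_inter_area_route(SUMMARY_METRIC, COST_TO_ABR)
# cost is 81 via 192.168.0.21 (S-21); the path via S-22 would cost 161.
```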

Example 1-15 shows the cost to the ABRs from the perspective of SS-1.

SS-1# sh ip ospf UNDERLAY-NET border-routers
OSPF Process ID UNDERLAY-NET VRF default, Internal Routing Table
Codes: i - Intra-area route, I - Inter-area route

intra 192.168.0.11 [40], ABR, Area 0.0.0.0, SPF 122
     via 192.168.0.11, Eth1/1
intra 192.168.0.12 [40], ABR, Area 0.0.0.0, SPF 122
     via 192.168.0.12, Eth1/2
intra 192.168.0.21 [40], ABR, Area 0.0.0.0, SPF 122
     via 192.168.0.21, Eth1/4
intra 192.168.0.22 [40], ABR, Area 0.0.0.0, SPF 122
     via 192.168.0.22, Eth1/3
Example 1-15: SS-1 costs to ABRs.

Figure 1-14 illustrates the Summary LSA propagation process from backbone area 0.0.0.0 to area 0.0.0.1 (Pod-1). SS-1 and SS-2 flood the Summary LSAs received from both ABRs S-21 and S-22 to ABRs S-11 and S-12. ABRs S-11 and S-12 each generate a Summary LSA into Area 0.0.0.1 based on the LSA originated by S-21 due to its better metric.



Figure 1-14: The Flooding Propagation of Summary LSA 192.168.32.202/32.


Example 1-16 shows that ABR S-11 has received Summary LSAs about 192.168.32.202/32 originated by the Pod-2 ABRs S-21 and S-22. The metric in area 0.0.0.0 is installed as received, while in area 0.0.0.1 the cost is increased by the cost to reach the advertising ABR: [cost in LSA = 41] + [cost to advertising ABR = 80] = 121. Note that only the Summary LSA with the lower overall metric, originated by S-21, is used within area 0.0.0.1.


S-11# sh ip ospf data sum 192.168.32.202 detail
        OSPF Router with ID (192.168.0.11) (Process ID UNDERLAY-NET VRF default)

                Summary Network Link States (Area 0.0.0.0)

   LS age: 1108
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.21
   LS Seq Number: 0x80000004
   Checksum: 0x77e6
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 41

   LS age: 349
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.22
   LS Seq Number: 0x80000005
   Checksum: 0x9279
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 121


                Summary Network Link States (Area 0.0.0.1)

   LS age: 1111
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.11
   LS Seq Number: 0x80000004
   Checksum: 0xd641
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 121

   LS age: 1113
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.12
   LS Seq Number: 0x80000004
   Checksum: 0xd046
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 121
Example 1-16: OSPF LSDB of ABR S-11.

Example 1-17 shows the cost to the ABRs from the perspective of S-11.

S-11# sh ip ospf border-routers
OSPF Process ID UNDERLAY-NET VRF default, Internal Routing Table
Codes: i - Intra-area route, I - Inter-area route

intra 192.168.0.12 [80], ABR, Area 0.0.0.0, SPF 24
     via 192.168.0.1, Eth1/3
     via 192.168.0.2, Eth1/4
intra 192.168.0.21 [80], ABR, Area 0.0.0.0, SPF 24
     via 192.168.0.1, Eth1/3
     via 192.168.0.2, Eth1/4
intra 192.168.0.22 [80], ABR, Area 0.0.0.0, SPF 24
     via 192.168.0.1, Eth1/3
intra 192.168.0.12 [80], ABR, Area 0.0.0.1, SPF 24
     via 192.168.0.101, Eth1/1
     via 192.168.0.102, Eth1/2
Example 1-17: ABRs seen by S-11.

Because the cost to the advertising ABR has already been added when S-11 originated the Summary LSA into area 0.0.0.1, the same cost (121) is programmed into the RIB without modification.

S-11# sh ip route 192.168.32.202/32
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.32.202/32, ubest/mbest: 2/0
    *via 192.168.0.1, Eth1/3, [110/121], 01:21:23, ospf-UNDERLAY-NET, inter
    *via 192.168.0.2, Eth1/4, [110/121], 01:21:19, ospf-UNDERLAY-NET, inter
Example 1-18: RIB of S-11.


Example 1-19 shows that L-101 has received and installed Summary LSAs in its OSPF LSDB with received metric.

L-101# sh ip ospf database summary 192.168.32.202 detail
        OSPF Router with ID (192.168.0.101) (Process ID UNDERLAY-NET VRF default)

                Summary Network Link States (Area 0.0.0.1)

   LS age: 1387
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.11
   LS Seq Number: 0x80000004
   Checksum: 0xd641
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 121

   LS age: 1387
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.202 (Network address)
   Advertising Router: 192.168.0.12
   LS Seq Number: 0x80000004
   Checksum: 0xd046
   Length: 28
   Network Mask: /32
      TOS:   0 Metric: 121
Example 1-19: Summary Net LSA about 192.168.32.202 on LSDB of L-101.

Because the LSA is received from an ABR, L-101 has to check its own metric to that ABR.

L-101# show ip ospf border-routers
OSPF Process ID UNDERLAY-NET VRF default, Internal Routing Table
Codes: i - Intra-area route, I - Inter-area route

intra 192.168.0.11 [40], ABR, Area 0.0.0.1, SPF 7
     via 192.168.0.11, Eth1/1
intra 192.168.0.12 [40], ABR, Area 0.0.0.1, SPF 7
     via 192.168.0.12, Eth1/2
Example 1-20: ABRs seen by L-101. 


L-101 adds its cost to the ABR to the metric in the LSA when the route is installed into the RIB:
[cost in LSA = 121] + [cost to advertising ABR = 40] = 161.

L-101# sh ip route 192.168.32.202
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.32.202/32, ubest/mbest: 2/0
    *via 192.168.0.11, Eth1/1, [110/161], 01:25:18, ospf-UNDERLAY-NET, inter
    *via 192.168.0.12, Eth1/2, [110/161], 01:25:18, ospf-UNDERLAY-NET, inter
Example 1-21: RIB of L-101.
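
The metric of 192.168.32.202/32 at each stage between L-202 and L-101 can be followed with simple arithmetic. The values are those shown in Examples 1-13 to 1-21. The helper name cost_via_abr is hypothetical.

```python
def cost_via_abr(lsa_metric, cost_to_abr):
    """Inter-area cost: metric in the Summary LSA plus the cost to its ABR."""
    return lsa_metric + cost_to_abr


LINK_COST = 40     # cost of every Inter-Switch link in the example network
STUB_METRIC = 1    # metric L-202 advertises for its Loopback 30

# S-21 reaches the stub network over one link and advertises that cost
# as the metric of its Summary LSA into area 0.0.0.0.
s21_summary = LINK_COST + STUB_METRIC            # 41

# SS-1 is one link away from S-21.
ss1_rib = cost_via_abr(s21_summary, 40)          # 81

# S-11 is two links away from S-21 and re-originates the summary
# into area 0.0.0.1 with its own total cost.
s11_summary = cost_via_abr(s21_summary, 80)      # 121

# L-101 is one link away from S-11.
l101_rib = cost_via_abr(s11_summary, 40)         # 161
```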

Now the topology change caused by the link failure between S-22 and L-202 has been propagated to all routers in the Datacenter. By using the OSPF area design, we reduce the size of the failure domain significantly. Example 1-22 shows that before the link failure S-21 has run the SPF calculation 59 times in area 0.0.0.0 and 26 times in area 0.0.0.2. Example 1-23 shows that after the link failure both figures have increased by one.

S-21#  sh ip ospf | sec Area
   Area BACKBONE(0.0.0.0)
        Area has existed for 01:26:03
        Interfaces in this area: 2 Active interfaces: 2
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 59 times
         Last SPF ran for 0.000533s
        Area ranges are
        Number of LSAs: 18, checksum sum 0xa4ad9
   Area (0.0.0.2)
        Area has existed for 00:24:25
        Interfaces in this area: 2 Active interfaces: 2
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 26 times
         Last SPF ran for 0.000367s
        Area ranges are
        Number of LSAs: 8, checksum sum 0x35089
Example 1-22: OSPF Area Information on S-21 before the link failure.

S-21#  sh ip ospf | sec Area
   Area BACKBONE(0.0.0.0)
        Area has existed for 01:26:50
        Interfaces in this area: 2 Active interfaces: 2
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 60 times
         Last SPF ran for 0.048112s
        Area ranges are
        Number of LSAs: 18, checksum sum 0xa4ad9
   Area (0.0.0.2)
        Area has existed for 00:25:12
        Interfaces in this area: 2 Active interfaces: 2
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 27 times
         Last SPF ran for 0.027508s
        Area ranges are
        Number of LSAs: 8, checksum sum 0x42520
S-21#
Example 1-23: OSPF Area Information on S-21 after the link failure. 
However, there is no SPF calculation on SS-1, as can be seen from examples 1-24 (before the link failure) and 1-25 (after the link failure).

SS-1#  sh ip ospf | sec Area
   Area BACKBONE(0.0.0.0)
        Area has existed for 02:20:43
        Interfaces in this area: 4 Active interfaces: 4
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 110 times
         Last SPF ran for 0.000512s
        Area ranges are
        Number of LSAs: 18, checksum sum 0xa4ad9
Example 1-24: Area Information in SS-1 Before the Link Failure.

SS-1#  sh ip ospf | sec Area
   Area BACKBONE(0.0.0.0)
        Area has existed for 02:20:55
        Interfaces in this area: 4 Active interfaces: 4
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 110 times
         Last SPF ran for 0.000512s
        Area ranges are
        Number of LSAs: 18, checksum sum 0xa4ad9
Example 1-25: Area Information in SS-1 After the Link Failure.


The same applies to the Leaf switches in area 0.0.0.1 (Pod-1), as can be seen from examples 1-26 (before the link failure) and 1-27 (after the link failure).

L-101# sh ip ospf | sec Area
   Area (0.0.0.1)
        Area has existed for 00:24:31
        Interfaces in this area: 3 Active interfaces: 3
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 11 times
         Last SPF ran for 0.001415s
        Area ranges are
        Number of LSAs: 8, checksum sum 0x453a2
Example 1-26: Area Information in L-101 before the link failure.

L-101# sh ip ospf | sec Area
   Area (0.0.0.1)
        Area has existed for 00:27:06
        Interfaces in this area: 3 Active interfaces: 3
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 11 times
         Last SPF ran for 0.001415s
        Area ranges are
        Number of LSAs: 8, checksum sum 0x453a2
Example 1-27: Area Information in L-101 after the link failure.


By splitting the Datacenter OSPF Underlay Network into multiple areas, we increase the stability by limiting the LSA flooding caused by a link failure or periodic LSA refresh to one area. There are also other benefits of the OSPF area design: host routes can be summarized when sending Summary LSAs between areas. By using the command area 0.0.0.2 range 192.168.32.0/24 under the OSPF process in S-21 and S-22, we state that from area 0.0.0.2 to the backbone we advertise only the subnet 192.168.32.0/24 using one Summary LSA instead of advertising each VTEP address as a separate LSA.

S-22(config)# router ospf UNDERLAY-NET
S-22(config-router)# area 0.0.0.2 range 192.168.32.0/24
Example 1-28: Summarization in S-22.

S-21(config)# router ospf UNDERLAY-NET
S-21(config-router)# area 0.0.0.2 range 192.168.32.0/24
Example 1-29: Summarization in S-21.
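
The effect of the area range command can be illustrated with Python's standard ipaddress module. This is a sketch of the idea only, not of the NX-OS implementation: the function name summaries_to_advertise is hypothetical, and the host routes are the two Pod-2 VTEP addresses seen in Example 1-12.

```python
import ipaddress

AREA_RANGE = ipaddress.ip_network("192.168.32.0/24")

# Intra-area host routes (Loopback 30 of L-201 and L-202) known to the ABR.
VTEP_PREFIXES = [
    ipaddress.ip_network("192.168.32.201/32"),
    ipaddress.ip_network("192.168.32.202/32"),
]


def summaries_to_advertise(intra_area_prefixes, area_range):
    """Replace every prefix covered by the area range with the range itself."""
    advertised = set()
    for prefix in intra_area_prefixes:
        if prefix.subnet_of(area_range):
            advertised.add(area_range)      # covered: advertise the range once
        else:
            advertised.add(prefix)          # not covered: advertise as is
    return sorted(advertised)


advertised = summaries_to_advertise(VTEP_PREFIXES, AREA_RANGE)
# Both /32 host routes collapse into the single prefix 192.168.32.0/24.
```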

Example 1-30 shows that after summarization on ABRs S-21 and S-22, SS-1 now has only subnet 192.168.32.0/24 installed into the RIB, while before summarization, all VTEP loopback addresses were installed as separate RIB entries.

SS-1# sh ip route | sec 192.168.32.
192.168.32.0/24, ubest/mbest: 2/0
    *via 192.168.0.21, Eth1/4, [110/81], 00:02:21, ospf-UNDERLAY-NET, inter
    *via 192.168.0.22, Eth1/3, [110/81], 00:01:15, ospf-UNDERLAY-NET, inter
Example 1-30: RIB of SS-1. 
Example 1-31 shows the OSPF LSDB of SS-1 about 192.168.32.0/24 after summarization.

SS-1# sh ip ospf data summary 192.168.32.0 detail
        OSPF Router with ID (192.168.0.1) (Process ID UNDERLAY-NET VRF default)

                Summary Network Link States (Area 0.0.0.0)

   LS age: 490
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.0 (Network address)
   Advertising Router: 192.168.0.21
   LS Seq Number: 0x80000002
   Checksum: 0x67c3
   Length: 28
   Network Mask: /24
      TOS:   0 Metric: 41

   LS age: 296
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Network Summary
   Link State ID: 192.168.32.0 (Network address)
   Advertising Router: 192.168.0.22
   LS Seq Number: 0x80000003
   Checksum: 0x8256
   Length: 28
   Network Mask: /24
      TOS:   0 Metric: 121
Example 1-31: The Detailed LS information on SS-1.

After aggregation on all ABRs, SS-1 has only four Summary LSAs in its LSDB (example 1-32), while without aggregation there were eight Summary LSAs, as can be seen in example 1-12 on page 32.
  
SS-1# sh ip ospf database
        OSPF Router with ID (192.168.0.1) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum Link Count
192.168.0.1     192.168.0.1     530        0x80000009 0xd0ea   4
192.168.0.2     192.168.0.2     535        0x8000000c 0x105e   3
192.168.0.11    192.168.0.11    763        0x80000008 0x66c6   2
192.168.0.12    192.168.0.12    761        0x80000008 0x56d4   2
192.168.0.21    192.168.0.21    582        0x80000009 0xc354   2
192.168.0.22    192.168.0.22    575        0x80000007 0x08b2   1

                Summary Network Link States (Area 0.0.0.0)

Link ID         ADV Router      Age        Seq#       Checksum
192.168.31.0    192.168.0.11    92         0x80000002 0xae87
192.168.31.0    192.168.0.12    71         0x80000002 0xa88c
192.168.32.0    192.168.0.21    1300       0x80000002 0x67c3
192.168.32.0    192.168.0.22    725        0x80000004 0x5dca
Example 1-32: The OSPF LSDB of SS-1.

Also, the RIB of SS-1 now has only one route per summarized range, with both ABRs of the Pod as next-hops.

SS-1# sh ip route
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.0.1/32, ubest/mbest: 2/0, attached
    *via 192.168.0.1, Lo0, [0/0], 02:15:16, local
    *via 192.168.0.1, Lo0, [0/0], 02:15:16, direct
192.168.31.0/24, ubest/mbest: 2/0
    *via 192.168.0.11, Eth1/1, [110/81], 00:01:35, ospf-UNDERLAY-NET, inter
    *via 192.168.0.12, Eth1/2, [110/81], 00:01:13, ospf-UNDERLAY-NET, inter
192.168.32.0/24, ubest/mbest: 2/0
    *via 192.168.0.21, Eth1/4, [110/81], 00:21:43, ospf-UNDERLAY-NET, inter
    *via 192.168.0.22, Eth1/3, [110/81], 00:12:08, ospf-UNDERLAY-NET, inter
Example 1-33: The RIB of SS-1.

One more thing that can be done on ABRs is to add the not-advertise keyword at the end of the area [area-id] range [network/mask] command. This way we can hide the whole area (Pod) from the rest of the Datacenter fabric, but that is another story.

Every now and then there is a need for router maintenance operations such as software upgrades. One major advantage of networks built using Layer 3 only is that a device can be removed from the data-path before it is taken out of service. With OSPF this can be done by advertising LSAs with the infinite metric 65535. When adjacent OSPF speakers receive an LSA with the infinite metric, they replace the previous LSA with the new one, and the SPF calculation then uses the links described in it only if no other path exists. This way the advertising OSPF speaker is removed from the data-path in a controlled manner. This is not possible with a Layer 2 Control-Plane protocol, namely Spanning-Tree, which has no built-in mechanism for this kind of signaling. The same applies to Port-Channel: removing a port from the port-group can’t be signaled beforehand. Even though a Virtualized Switching System has its own mechanism for OS upgrades, it is much more complex and more disruptive than the OSPF process described in this section.


Figure 1-15: Taking S-21 Out of the Data-Path in a Controlled Manner with max-metric.

Example 1-34 shows the RIB of L-201 in a stable situation. All egress traffic is load-balanced between S-21 and S-22.


L-201# sh ip route
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.0.201/32, ubest/mbest: 2/0, attached
    *via 192.168.0.201, Lo0, [0/0], 02:08:56, local
    *via 192.168.0.201, Lo0, [0/0], 02:08:56, direct
192.168.31.101/32, ubest/mbest: 2/0
    *via 192.168.0.21, Eth1/1, [110/161], 00:00:37, ospf-UNDERLAY-NET, inter
    *via 192.168.0.22, Eth1/2, [110/161], 00:03:04, ospf-UNDERLAY-NET, inter
192.168.31.102/32, ubest/mbest: 2/0
    *via 192.168.0.21, Eth1/1, [110/161], 00:00:37, ospf-UNDERLAY-NET, inter
    *via 192.168.0.22, Eth1/2, [110/161], 00:03:04, ospf-UNDERLAY-NET, inter
192.168.32.201/32, ubest/mbest: 2/0, attached
    *via 192.168.32.201, Lo30, [0/0], 02:08:56, local
    *via 192.168.32.201, Lo30, [0/0], 02:08:56, direct
192.168.32.202/32, ubest/mbest: 2/0
    *via 192.168.0.21, Eth1/1, [110/81], 00:00:37, ospf-UNDERLAY-NET, intra
    *via 192.168.0.22, Eth1/2, [110/81], 00:03:04, ospf-UNDERLAY-NET, intra
Example 1-34: The RIB of L-201.

Now we take S-21 out of the data-path by using the max-metric router-lsa command under its OSPF process. Capture 1-12 shows that S-21 now advertises its P2P link information with the infinite metric 65535. The same applies to the Summary LSAs sent out to area 0.0.0.0.

Internet Protocol Version 4, Src: 192.168.0.21, Dst: 224.0.0.5
Open Shortest Path First
    OSPF Header
    LS Update Packet
        Number of LSAs: 1
        LSA-type 1 (Router-LSA), len 48
            .000 0000 0000 0001 = LS Age (seconds): 1
            0... .... .... .... = Do Not Age Flag: 0
            Options: 0x02, (E) External Routing
            LS Type: Router-LSA (1)
            Link State ID: 192.168.0.21
            Advertising Router: 192.168.0.21
            Sequence Number: 0x80000008
            Checksum: 0xa72c
            Length: 48
            Flags: 0x01, (B) Area border router
            Number of Links: 2
            Type: PTP      ID: 192.168.0.202   Data: 0.0.0.3         Metric: 65535
            Type: PTP      ID: 192.168.0.201   Data: 0.0.0.4         Metric: 65535
Capture 1-12: LS Update generated and sent by S-21 into Area 0.0.0.2.

Example 1-35 shows that the OSPF LSDB of L-201 still includes the Router LSA from S-21, but now with the infinite metric 65535.



L-201# sh ip ospf database 192.168.0.21 detail
        OSPF Router with ID (192.168.0.201) (Process ID UNDERLAY-NET VRF default)

                Router Link States (Area 0.0.0.2)

   LS age: 7
   Options: 0x2 (No TOS-capability, No DC)
   LS Type: Router Links
   Link State ID: 192.168.0.21
   Advertising Router: 192.168.0.21
   LS Seq Number: 0x8000000c
   Checksum: 0x9f30
   Length: 48
   Area border router
    Number of links: 2

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.202
     (Link Data) Router Interface address: 0.0.0.3
       Number of TOS metrics: 0
         TOS   0 Metric: 65535

     Link connected to: a Router (point-to-point)
     (Link ID) Neighboring Router ID: 192.168.0.201
     (Link Data) Router Interface address: 0.0.0.4
       Number of TOS metrics: 0
         TOS   0 Metric: 65535
Example 1-35: The OSPF LSDB of L-201 After “max-metric router-lsa” Command on S-21.

However, due to the infinite metric, L-201 removes S-21 as a next-hop for every destination from its RIB.

L-201# sh ip route
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.0.201/32, ubest/mbest: 2/0, attached
    *via 192.168.0.201, Lo0, [0/0], 02:09:15, local
    *via 192.168.0.201, Lo0, [0/0], 02:09:15, direct
192.168.31.101/32, ubest/mbest: 1/0
    *via 192.168.0.22, Eth1/2, [110/161], 00:03:23, ospf-UNDERLAY-NET, inter
192.168.31.102/32, ubest/mbest: 1/0
    *via 192.168.0.22, Eth1/2, [110/161], 00:03:23, ospf-UNDERLAY-NET, inter
192.168.32.201/32, ubest/mbest: 2/0, attached
    *via 192.168.32.201, Lo30, [0/0], 02:09:15, local
    *via 192.168.32.201, Lo30, [0/0], 02:09:15, direct
192.168.32.202/32, ubest/mbest: 1/0
    *via 192.168.0.22, Eth1/2, [110/81], 00:03:23, ospf-UNDERLAY-NET, intra
Example 1-36: The RIB of L-201 After “max-metric router-lsa” Command on S-21.
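
The change in the RIB of L-201 can be reproduced with a small Dijkstra sketch in Python. The topology below is a simplified assumption: only Pod 2 is modeled, every link has cost 40 (which matches the total cost 81 in example 1-34 once the loopback cost of 1 is added), and the function names shortest_paths and max_metric are hypothetical.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra that also records every equal-cost first hop from source."""
    dist = {source: 0}
    first_hops = {source: set()}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist[node]:
            continue  # stale queue entry
        for neighbor, metric in graph[node].items():
            new_cost = cost + metric
            # Leaving the source, the first hop is the neighbor itself;
            # further away it is inherited from the current node.
            via = {neighbor} if node == source else first_hops[node]
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                first_hops[neighbor] = set(via)
                heapq.heappush(queue, (new_cost, neighbor))
            elif new_cost == dist[neighbor]:
                first_hops[neighbor] |= via
    return dist, first_hops


def max_metric(graph, router, metric=65535):
    """Copy the graph with all links advertised by router set to metric."""
    updated = {node: dict(links) for node, links in graph.items()}
    updated[router] = {neighbor: metric for neighbor in graph[router]}
    return updated


# graph[a][b] is the metric that router a advertises for its link to b.
pod2 = {
    "L-201": {"S-21": 40, "S-22": 40},
    "L-202": {"S-21": 40, "S-22": 40},
    "S-21": {"L-201": 40, "L-202": 40},
    "S-22": {"L-201": 40, "L-202": 40},
}

_, stable = shortest_paths(pod2, "L-201")
_, maintenance = shortest_paths(max_metric(pod2, "S-21"), "L-201")
print(sorted(stable["L-202"]))       # ['S-21', 'S-22']
print(sorted(maintenance["L-202"]))  # ['S-22']
```

Only the metrics advertised by S-21 change. The path through S-21 still exists in the graph, but its cost (40 + 65535) can never compete with the path through S-22, so S-21 disappears from the set of next-hops.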

This is the standard way to signal to adjacent OSPF speakers: “hey neighbors, don’t send any user data to me anymore”. The same result can be achieved by using the command isolate under the OSPF process, or by putting the whole device in maintenance mode with the global configuration command system mode maintenance, which starts a script that gracefully puts every Control-Plane protocol into maintenance mode. Example 1-37 illustrates this process with OSPF.


S-21(config)# system mode maintenance

Following configuration will be applied:

router ospf UNDERLAY-NET
  isolate

Do you want to continue (yes/no)? [no] yes

Generating before_maintenance snapshot before going into maintenance mode

Starting to apply commands...

Applying : router ospf UNDERLAY-NET
Applying :   isolate2020 Jul  1 12:31:58.916066 ospf: UNDERLAY-NET [2139]   Maintenance mode is Enabled


Maintenance mode operation successful.

Waiting 120 seconds to allow network re-routing to occur before releasing CLI
.2020 Jul  1 12:32:02 S-21 %$ VDC-1 %$ %MMODE-2-MODE_CHANGED: System changed to "maintenance" mode.
.......................done
Example 1-37: Starting Maintenance Mode on S-21.

Note that the OSPF adjacency is not affected.

S-21(maint-mode)(config)# sh ip ospf neighbors
 OSPF Process ID UNDERLAY-NET VRF default
 Total number of neighbors: 4
 Neighbor ID     Pri State            Up Time  Address         Interface
 192.168.0.201     1 FULL/ -          01:07:55 192.168.0.201   Eth1/1
 192.168.0.202     1 FULL/ -          01:07:50 192.168.0.202   Eth1/2
 192.168.0.1       1 FULL/ -          01:07:51 192.168.0.1     Eth1/3
 192.168.0.2       1 FULL/ -          01:07:52 192.168.0.2     Eth1/4
Example 1-38: OSPF Adjacencies in Maintenance Mode on S-21.


LSA and SPF timers


This section introduces timers related to Link-State Advertisement and Shortest Path Calculation. Even though these timers can and should be used with default values, it is good to understand what these timers are used for. I am going to use the same old example of link failure between S-22 and L-202 but instead of a link-down event, we now have a flapping link.


The LSA Throttling Timer helps in a situation where one of the connected links is unstable by delaying self-originated LSAs. Figure 1-16 illustrates the situation from the S-22 perspective (note that the flapping link also affects L-202). The first flap causes S-22 to send an LSA out of its OSPF interfaces based on the LSA Throttling Start Time, whose default value in NX-OS is 0.00 ms. This means that the first LSA is sent without any delay. The LSA Throttling Hold Time defines the delay of the second LSA: it is sent 1000 ms after the first one. The delay of the third LSA is 2 * Hold Time, so in our example the third LSA is sent 2 seconds after the second one. The delay of the fourth LSA is 4 * Hold Time, so the fourth LSA is sent 4 seconds after the third one. If there is a fifth interface flapping event, the LSA generation is delayed based on the LSA Throttling Maximum Wait Time, which defines the maximum delay for self-originated LSAs. The default values for LSA throttling are (1) Start Time: 0.00 ms, (2) Hold Time: 5000 ms, and (3) Maximum Wait: 5000 ms. I use Hold Time 1000 ms in this example just for the sake of simplicity. The timers are reset after 2 * Maximum Wait Time has elapsed without a link failure. Note that L-201 rejects a new instance of the same LSA if it is received less than 1000 ms after the previous one (MinLSArrival).


Figure 1-16: LSA Throttling and MinLSArrival Timers.
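
The backoff described above is easy to verify with a short calculation. The following Python sketch only models the arithmetic of the three timers as explained in this section; the function name throttle_delays is hypothetical and the code says nothing about how NX-OS implements the timers internally.

```python
def throttle_delays(start_ms, hold_ms, max_wait_ms, events):
    """Delay applied to each consecutive self-originated LSA.

    The first LSA waits start_ms. Every following LSA waits hold_ms,
    doubled for each further event and capped at max_wait_ms.
    """
    delays = []
    for n in range(events):
        if n == 0:
            delay = start_ms
        else:
            delay = min(hold_ms * 2 ** (n - 1), max_wait_ms)
        delays.append(delay)
    return delays


# Values used in the example of this section (Hold Time 1000 ms).
print(throttle_delays(0, 1000, 5000, 5))  # [0, 1000, 2000, 4000, 5000]
# NX-OS defaults: Hold Time equals Maximum Wait, so there is no doubling.
print(throttle_delays(0, 5000, 5000, 5))  # [0, 5000, 5000, 5000, 5000]
```

With the default values the Hold Time already equals the Maximum Wait Time, so every LSA after the first one is delayed by the same 5000 ms.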



The SPF Throttling Timer delays the SPF calculation. It uses the same kind of exponential backoff as LSA Throttling. The SPF Throttling Delay Time defines how long the SPF calculation is delayed after the first LSA is received. In our example, L-201 waits 200 ms after the first LSA is received and then runs the SPF algorithm. If the next LSA arrives within the next 5000 ms (Maximum Wait Time), L-201 waits 1000 ms (Hold Time) for subsequent LSAs before the next SPF run. The same process recurs: if the next LSA arrives within the next 5000 ms, L-201 now delays the SPF run by 2000 ms (2 * Hold Time) for subsequent LSAs. The next delay will be 4000 ms (4 * Hold Time). The SPF Throttling Maximum Wait Time defines the maximum delay between consecutive SPF runs. This way L-201 can include more than one network change in a single SPF calculation. Note that the LSDB is locked during the SPF calculation and LSAs received during the calculation are buffered.

Figure 1-17: SPF Throttling Timers.
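
To see how SPF throttling batches several LSAs into one SPF run, the description above can be turned into a small Python simulation. This is a rough model based only on the explanation in this section, not on the NX-OS implementation: the function name spf_schedule is hypothetical, and the assumption is that the backoff resets when no LSA arrives within the Maximum Wait Time after an SPF run.

```python
def spf_schedule(lsa_arrivals_ms, delay_ms=200, hold_ms=1000, max_wait_ms=5000):
    """Return a list of (SPF run time in ms, number of LSAs covered)."""
    runs = []
    pending_run = None  # time of the SPF run that is already scheduled
    batched = 0         # LSAs waiting for the scheduled run
    last_run = None
    level = 0           # 0 = initial delay, 1 = hold, 2 = 2 * hold, ...
    for t in sorted(lsa_arrivals_ms):
        # The scheduled run fires before this LSA arrives.
        if pending_run is not None and t > pending_run:
            runs.append((pending_run, batched))
            last_run, pending_run, batched = pending_run, None, 0
        if pending_run is None:
            # Assumption: a quiet period longer than max_wait_ms
            # resets the backoff to the initial delay.
            if last_run is None or t - last_run > max_wait_ms:
                level = 0
            if level == 0:
                wait = delay_ms
            else:
                wait = min(hold_ms * 2 ** (level - 1), max_wait_ms)
            pending_run = t + wait
            level += 1
        batched += 1
    if pending_run is not None:
        runs.append((pending_run, batched))
    return runs


# Five LSAs caused by a flapping link, arrival times in milliseconds.
print(spf_schedule([0, 300, 700, 1200, 2500]))
# [(200, 1), (1300, 3), (4500, 1)]
```

Five LSAs result in only three SPF runs, and the second run covers three LSAs at once.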




Two consecutive LSAs received from adjacent OSPF speakers are paced by the Flood Pacing Timer, whose default value is 33 ms. In figure 1-18, L-201 floods the LSAs received from its adjacent OSPF router S-22 out of its OSPF interface at an interval of 33 ms. You might wonder why this is important, because L-201 floods LSAs only from S-22 to S-21 and the other way around, and those LSAs are already paced at 33 ms intervals. Also, S-21 and S-22 use LSA throttling for their self-originated LSAs. So it looks like there will never be any need for the Flood Pacing Timer. However, in reality, there might be 20 Leaf/ToR switches and six Spine switches in one Pod. Then the situation is totally different, and the need for the Flood Pacing Timer is more obvious.

Figure 1-18: Flood Pacing Timer.





Every Link-State entry in the OSPF LSDB has an individual aging time. The aging time is set to zero by the OSPF speaker that owns the link when the link comes up for the first time. Then the LSA is advertised out of OSPF interfaces. The aging time increases by one every second. In addition, adjacent routers increase the aging time by one when they flood the LSA further. The maximum aging time for each LS is 3600 seconds, and if the max-age is reached, the LS is removed from the LSDB. The max-age, however, should not be reached in normal operation, because OSPF uses the Link State Refresh timer to refresh self-originated LSAs when their age reaches 1800 seconds (Max-Age/2). Advertising each LS individually when its timer reaches 1800 seconds is inefficient and could generate a huge amount of flooding traffic. The solution for this is the LSA Group Pacing Timer. In figure 1-19, L-201 has three self-originated LSs in its LSDB. When the aging time of LS-1 reaches 1800 seconds, L-201 starts the LSA Group Pacing Timer, which delays the LSA by 10 seconds. When eight seconds have elapsed, the aging time of LS-2 also reaches 1800 seconds. Then after two more seconds, L-201 sends a Link-State Update message that carries both LS-1 and LS-2 out of its OSPF interfaces. LS-3 will be advertised later when its aging time reaches 1800 seconds.

Figure 1-19: LSA Group pacing.
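
The grouping logic of figure 1-19 can be sketched as follows. This is a simplified Python model: the function name group_refreshes is hypothetical, the time values are seconds measured from the moment LS-1 was originated, and the refresh time of LS-3 (2400) is an assumed value used only for illustration.

```python
def group_refreshes(due_times, pacing_s=10):
    """Group LSA refreshes that fall inside the same pacing window.

    due_times maps an LS name to the time its age reaches 1800 seconds.
    Returns a list of (send time, [LS names sent together]).
    """
    groups = []
    window_end = None
    for name, due in sorted(due_times.items(), key=lambda item: item[1]):
        if window_end is not None and due <= window_end:
            # The pacing timer is already running: join that group.
            groups[-1][1].append(name)
        else:
            # Start a new pacing timer for this LS.
            window_end = due + pacing_s
            groups.append((window_end, [name]))
    return groups


# LS-2 reaches age 1800 eight seconds after LS-1, LS-3 much later.
print(group_refreshes({"LS-1": 1800, "LS-2": 1808, "LS-3": 2400}))
# [(1810, ['LS-1', 'LS-2']), (2410, ['LS-3'])]
```

LS-1 and LS-2 leave in the same update at 1810 seconds, while LS-3 falls outside the window and starts a pacing timer of its own.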




L-101# sh ip ospf UNDERLAY-NET

 Routing Process UNDERLAY-NET with ID 192.168.0.101 VRF default
 Routing Process Instance Number 1
 Stateful High Availability enabled
 Graceful-restart is configured
   Grace period: 60 state: Inactive
   Last graceful restart exit status: None
 Supports only single TOS(TOS0) routes
 Supports opaque LSA
 Administrative distance 110
 Reference Bandwidth is 40000 Mbps
 SPF throttling delay time of 200.000 msecs,
   SPF throttling hold time of 1000.000 msecs,
   SPF throttling maximum wait time of 5000.000 msecs
 LSA throttling start time of 0.000 msecs,
   LSA throttling hold interval of 5000.000 msecs,
   LSA throttling maximum wait time of 5000.000 msecs
 Minimum LSA arrival 1000.000 msec
 LSA group pacing timer 10 secs
 Maximum paths to destination 8
 Number of external LSAs 0, checksum sum 0
 Number of opaque AS LSAs 0, checksum sum 0
 Number of areas is 1, 1 normal, 0 stub, 0 nssa
 Number of active areas is 1, 1 normal, 0 stub, 0 nssa
 Install discard route for summarized external routes.
 Install discard route for summarized internal routes.
 Default Passive Interface is enabled
   Area (0.0.0.1)
        Area has existed for 00:53:10
        Interfaces in this area: 3 Active interfaces: 3
        Passive interfaces: 0  Loopback interfaces: 0
        No authentication available
        SPF calculation has run 11 times
         Last SPF ran for 0.001415s
        Area ranges are
        Number of LSAs: 8, checksum sum 0x443aa
Example 1-39: Default OSPF Timers.

Summary


This chapter explains how OSPF speakers form adjacencies, how they run SPF, and how they build the RIB. It also describes how we can minimize flooding by splitting the Underlay Network into multiple OSPF areas, as well as how we can use summarization between areas. The last part of this chapter introduces timers related to SPF and LSAs.



4 comments:

  1. In a number of your figures, the red circles say SFP instead of SPF.

  2. Okay informative article, Thanks for discussing, I must say I enjoyed this. Carry on the wonderful job, I have bookmarked your site for upcoming appointments.

    teqhow

    Replies
    1. Thanks for visiting David, I'm happy that you liked it.
