Tuesday, 16 June 2026

Chapter 1: SONiC Fundamentals

Introduction

SONiC (Software for Open Networking in the Cloud) is a Linux-based open-source network operating system that was originally developed at Microsoft and is now maintained by a broader open-source community. Its core idea is that the same network operating system can run on switch platforms from multiple hardware vendors. This reduces vendor lock-in and provides a more consistent operational model across different environments.

SONiC can also be viewed as an abstraction layer between network operators and the underlying switch hardware. Instead of learning and managing several vendor-specific operating systems, operators can use a common software architecture and management model across different switch platforms. This simplifies network operations, automation, monitoring, and telemetry collection. It can also reduce operational errors caused by configuration differences between platforms and make it easier to onboard new engineers.

Organizations can choose the hardware platform that best meets their technical, operational, and business requirements without being tied to a single software ecosystem. Some vendors provide commercially supported SONiC distributions together with professional support services, while others support community-based deployments or customer-tailored implementations. The appropriate model depends on the organization's operational requirements and support expectations.

From an architectural perspective, SONiC is a modular and container-based system. Major functional components, such as routing, switching, platform management, and monitoring, run in dedicated containers. Configuration, operational state, and system events are exchanged primarily through Redis databases and associated publisher-subscriber mechanisms. Because many system functions communicate through these databases, SONiC is often described as a database-oriented network operating system.

This architecture separates common network software functions from platform-specific implementation details. Control-plane applications generate information that is stored in the application databases and later used to program forwarding behavior in the underlying hardware. The result is a flexible software architecture that can support multiple hardware platforms while maintaining a consistent operational model.

Although SONiC provides a common software architecture, the same software image cannot run on every switch platform without platform-specific support. During startup, SONiC must identify the platform on which it is running, discover available hardware components, and learn their capabilities. This information is typically provided through platform-specific EEPROM data, platform drivers, and hardware management interfaces.

Many platforms use the I²C bus for functions such as hardware inventory collection, transceiver access, thermal monitoring, and power management. Depending on the platform design, some hardware management functions may also involve a Baseboard Management Controller (BMC). The implementation details vary between vendors, but the overall goal remains the same: providing SONiC with accurate information about the hardware resources available on the switch.


SONiC Microservice Architecture Overview

SONiC is a modular network operating system in which the main switch functions are divided into separate Docker containers. These functions include routing, neighbor discovery, link aggregation, monitoring, platform management, database services, switch state orchestration, and synchronization with the hardware abstraction layer. The containers run in Linux user space and together form SONiC's microservice-based architecture.

Inter-container communication is largely based on a centralized Redis database service, which runs in its own database container. Containers publish their service-specific information to Redis databases and subscribe to the changes they need. In this role, Redis acts as SONiC's internal distribution point for state information, configuration, and events. The database container can be understood as providing both state storage and event distribution services to the other containers, although Redis itself is not merely a message queue but also a central part of SONiC's data model.
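The publish-and-subscribe idea described above can be illustrated with a small in-memory model. This is only a sketch: the class TinyStateStore is hypothetical and uses plain Python dictionaries instead of Redis, and the table and field names are illustrative placeholders.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[str, Dict[str, str]], None]


class TinyStateStore:
    """Toy in-memory stand-in for a Redis database with change notifications."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, str]] = {}
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, table: str, handler: Handler) -> None:
        # Register interest in every key that belongs to `table`.
        self._subscribers[table].append(handler)

    def publish(self, key: str, fields: Dict[str, str]) -> None:
        # Store the state first, then notify subscribers of the owning table.
        self._data[key] = dict(fields)
        table = key.split(":", 1)[0]
        for handler in self._subscribers[table]:
            handler(key, fields)

    def get(self, key: str) -> Dict[str, str]:
        return dict(self._data.get(key, {}))


store = TinyStateStore()
seen: List[str] = []

# A consumer subscribes only to the table it cares about.
store.subscribe("LLDP_ENTRY_TABLE", lambda key, fields: seen.append(key))

# Two producers publish to different tables; only one update reaches the consumer.
store.publish("LLDP_ENTRY_TABLE:Ethernet0", {"lldp_rem_sys_name": "spine-1"})
store.publish("ROUTE_TABLE:10.1.1.0/30", {"nexthop": "10.0.0.1"})

print(seen)
```

The point of the sketch is that the store keeps the data and distributes the change event in the same operation, which is why the database container is more than a message queue.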

Figure 1-1 summarizes this modular structure and shows how the main containers relate to Linux user space, Linux kernel space, and the underlying hardware platform. The Docker containers running in user space can be roughly grouped according to the components they communicate with and the role they play in the overall SONiC architecture.

One group consists of application-facing containers such as SNMP, LLDP, teamd, and BGP. These containers communicate either with external systems or with neighboring network devices. The SNMP container handles SNMP queries and responses exchanged with external management systems. The LLDP container is responsible for discovering neighboring devices and exchanging capability information. The teamd container manages link aggregation and logical port channels, while the BGP container exchanges routing information with its routing peers. These containers publish the information they generate to Redis databases, where other SONiC components can read it. 

The upper part of the figure also shows the dhcp-relay container, whose role is to relay DHCP messages when the DHCP server is located in a different subnet from the client device.

Another group consists of infrastructure-oriented containers such as pmon, swss, and syncd. These containers are more closely related to Linux kernel-space drivers, platform-specific hardware management, and switch ASIC programming. Through these containers, SONiC gathers information about components such as fans, power supplies, LEDs, optical transceivers, and other hardware elements. At the same time, they participate in the process in which higher-level control information is eventually translated into hardware state that can be programmed into the ASIC.

Among these containers, swss (Switch State Service) has a particularly central role. It acts as an intermediate layer between application logic and the state that is eventually programmed into the hardware. For example, the BGP container may publish routes to APPL_DB. The orchagent component in the swss container reads this information, converts it into a form suitable for ASIC programming, and publishes the result to ASIC_DB.

In the final stage, the syncd container reads the information published to ASIC_DB and passes it through the SAI interface to the vendor's ASIC SDK. The vendor's ASIC SDK, together with the platform-specific ASIC driver, programs the required state into the physical switch ASIC. In this way, a route learned from a BGP neighbor passes through several SONiC components and database layers before it is finally programmed into the hardware forwarding table.

It is also worth noting that not all SONiC components are containerized. Some configuration-related tools, such as the SONiC CLI and configuration generation logic, run directly on the Linux host system.

The key point is that SONiC does not rely on one large monolithic network process. Instead, it separates major functions into containers and connects them through a shared database model built on Redis. This separation makes the system easier to extend, monitor, restart, and adapt to different switch platforms.

Figure 1-1: SONiC Microservice Architecture Overview


SONiC Database-Oriented Design


The Redis server runs inside the Database container. Figure 1-2 shows a single-instance Redis model used in SONiC, where a single Redis server hosts several logical databases for different purposes. Redis provides a lightweight in-memory data store with fast access and simple inter-process communication, making it well suited for SONiC's modular container-based architecture.

APPL_DB stores application-level state, such as route entries published by the BGP container. ASIC_DB stores SAI-oriented objects that represent the state to be programmed into the switch ASIC. Although APPL_DB and ASIC_DB may contain information related to the same network function, such as routing, they do not communicate directly with each other. Instead, the swss container provides the orchestration logic that connects application-level state to hardware-oriented state. The central component in this process is orchagent, which monitors relevant APPL_DB tables, processes updates, and writes the corresponding SAI-level objects to ASIC_DB.

A simplified route update flow works as follows. When the BGP container learns, updates, or withdraws a route, its synchronization logic publishes the route update to APPL_DB. orchagent receives the update from APPL_DB, translates the route information into ASIC-programming intent, and writes the result to ASIC_DB. In this role, orchagent acts both as a consumer of application-level updates and as a producer of hardware-oriented database entries.

After the route-related objects have been written to ASIC_DB, syncd consumes the ASIC_DB updates and passes them through the SAI interface to the vendor ASIC SDK. The vendor ASIC SDK, together with the platform-specific driver stack, then programs the required forwarding state into the physical switch ASIC. This example illustrates the producer-consumer pattern used throughout SONiC's database-oriented architecture. 
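The route update flow described above can be sketched as a chain of producers and consumers. This is a simplified simulation under stated assumptions: the two queues stand in for APPL_DB and ASIC_DB notifications, the dictionary hardware_fib stands in for the ASIC forwarding table, and the ROUTE_ENTRY key format is a hypothetical simplification rather than the real SAI object naming.

```python
from collections import deque
from typing import Deque, Dict, Tuple

Update = Tuple[str, str, Dict[str, str]]  # (operation, key, fields)

appl_db_queue: Deque[Update] = deque()        # stands in for APPL_DB updates
asic_db_queue: Deque[Update] = deque()        # stands in for ASIC_DB updates
hardware_fib: Dict[str, Dict[str, str]] = {}  # stands in for the ASIC table


def bgp_publish(op: str, prefix: str, nexthop: str = "") -> None:
    """Producer: the routing stack writes a route update to APPL_DB."""
    fields = {"nexthop": nexthop} if op == "SET" else {}
    appl_db_queue.append((op, f"ROUTE_TABLE:{prefix}", fields))


def orchagent_step() -> None:
    """Consumer of APPL_DB updates and producer of ASIC_DB entries."""
    while appl_db_queue:
        op, key, fields = appl_db_queue.popleft()
        prefix = key.split(":", 1)[1]
        # Hypothetical, simplified object key for illustration only.
        asic_db_queue.append((op, f"ROUTE_ENTRY:{prefix}", fields))


def syncd_step() -> None:
    """Consumer of ASIC_DB updates: applies them to the simulated hardware."""
    while asic_db_queue:
        op, key, fields = asic_db_queue.popleft()
        if op == "SET":
            hardware_fib[key] = fields
        else:
            hardware_fib.pop(key, None)


bgp_publish("SET", "10.1.1.0/30", "10.0.0.1")
bgp_publish("SET", "192.168.1.1/32", "10.0.0.1")
bgp_publish("DEL", "10.1.1.0/30")  # the first route is withdrawn again
orchagent_step()
syncd_step()
print(sorted(hardware_fib))
```

Each stage only knows the database it reads from and the database it writes to, which mirrors the way orchagent and syncd are decoupled from the routing stack.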

In practice, database-oriented means that SONiC services exchange much of their configuration, operational state, and event information through Redis databases rather than through direct service-to-service calls.

Other key SONiC databases include CONFIG_DB, STATE_DB, and COUNTERS_DB. CONFIG_DB stores the switch configuration. During startup, it is populated from JSON-formatted configuration data, while subsequent updates may originate from SONiC management interfaces. STATE_DB stores operational state reported by SONiC processes, while COUNTERS_DB stores counters and telemetry-related statistics. These databases are discussed in more detail in later sections. The relationship between these logical databases in the single-instance Redis model is illustrated in Figure 1-2.

Figure 1-2: Single-Instance Redis Database Model in SONiC.

In the single-instance model, different containers publish and consume data through logical databases that share the same Redis process, CPU resources, and UNIX socket. Under heavy load, such as during large route update bursts, this shared model can create contention and delay database operations for other functions. To reduce this risk in larger or more demanding deployments, SONiC also supports a multi-instance Redis architecture.

In the multi-instance model, databases can be organized according to workload characteristics such as read/write frequency and CPU utilization. Figure 1-3 illustrates one possible grouping of Redis instances based on workload characteristics. APPL_DB and ASIC_DB are shown in a High-Churn Processing Instance, CONFIG_DB and STATE_DB in a Management and Slow-State Instance, and COUNTERS_DB in a High-Frequency Telemetry Instance. This separation allows each Redis instance to expose its own UNIX socket and also allows deployments to apply CPU-affinity policies when needed. As a result, high-frequency telemetry collection or large route-update bursts are less likely to interfere with slower management and state operations.
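The grouping described above can be expressed as a simple lookup structure. The following sketch is illustrative only: the instance names and UNIX socket paths are invented for this example and do not come from a real SONiC configuration.

```python
from typing import Dict, List

# Hypothetical Redis instances, one UNIX socket each (paths are invented).
INSTANCES: Dict[str, Dict[str, str]] = {
    "high_churn": {"unix_socket_path": "/var/run/redis-high-churn/redis.sock"},
    "management": {"unix_socket_path": "/var/run/redis-management/redis.sock"},
    "telemetry": {"unix_socket_path": "/var/run/redis-telemetry/redis.sock"},
}

# Database-to-instance grouping that follows the workload-based description.
DATABASES: Dict[str, str] = {
    "APPL_DB": "high_churn",
    "ASIC_DB": "high_churn",
    "CONFIG_DB": "management",
    "STATE_DB": "management",
    "COUNTERS_DB": "telemetry",
}


def socket_for(db_name: str) -> str:
    """Return the UNIX socket of the instance that hosts `db_name`."""
    return INSTANCES[DATABASES[db_name]]["unix_socket_path"]


def databases_sharing_instance(db_name: str) -> List[str]:
    """Return every database hosted by the same instance as `db_name`."""
    instance = DATABASES[db_name]
    return sorted(name for name, inst in DATABASES.items() if inst == instance)


print(socket_for("COUNTERS_DB"))
print(databases_sharing_instance("APPL_DB"))
```

With this layout, a client that reads counters connects to a different socket than a client that writes routes, so the two workloads no longer compete for the same Redis process.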


Figure 1-3: Multi-Instance Redis Database Model in SONiC.

SONiC's architecture is built around modular services, shared databases, and a hardware abstraction layer that separates common network functions from platform-specific implementation details. Understanding these fundamentals makes it easier to follow later topics such as installation, startup, hardware discovery, configuration handling, and operational troubleshooting.



Monday, 25 May 2026

SONiC Part III: SONiC Introduction

SONiC is a vendor-neutral, Linux-based network operating system (NOS) that uses a database-driven architecture. Its software components run in multiple containers and exchange information through Redis. In SONiC, several named databases are defined for different functions, and these databases are mapped to Redis logical database IDs. Through this design, configuration data, application state, operational state, and ASIC-related state move between software layers by means of specialized processes.

Different hardware vendors may add their own platform integrations, transceiver support, monitoring utilities, or management workflows. However, the core SONiC architecture remains the same. This is one of the main reasons why SONiC knowledge, troubleshooting methods, and automation practices are transferable across different hardware platforms.

Vendor neutrality does not mean that every SONiC-based implementation behaves exactly the same in every operational detail. It means that different implementations follow the same architectural model. To organize information clearly, SONiC defines several named databases, each of which is mapped to a Redis logical database ID:

·       CONFIG_DB (Redis DB 4): Stores the user’s intended configuration.

·       APPL_DB (Redis DB 0): Stores application-level objects that are ready for processing by lower software layers.

·       STATE_DB (Redis DB 6): Stores operational state information about system components.

·       ASIC_DB (Redis DB 1): Stores objects in a form used by the SONiC and SAI pipeline for hardware programming.

Figure 1-01 shows the relationship between Redis logical databases and SONiC databases from a routing-oriented point of view. A standard Redis instance commonly provides sixteen logical databases by default, and SONiC uses a defined subset of them for its core functions.
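The name-to-ID mapping listed above can be captured in a small lookup table. The sketch below builds a redis-cli command line that selects the matching logical database with the -n option. The helper name redis_cli_command is hypothetical, and the command is only printed, not executed.

```python
import shlex

# Named SONiC databases and their Redis logical database IDs, as listed above.
SONIC_DB_IDS = {
    "APPL_DB": 0,
    "ASIC_DB": 1,
    "CONFIG_DB": 4,
    "STATE_DB": 6,
}


def redis_cli_command(db_name: str, *args: str) -> str:
    """Build a shell-safe redis-cli command for the given SONiC database."""
    db_id = SONIC_DB_IDS[db_name]
    return shlex.join(["redis-cli", "-n", str(db_id), *args])


print(redis_cli_command("APPL_DB", "KEYS", "ROUTE_TABLE:*"))
print(redis_cli_command("CONFIG_DB", "DBSIZE"))
```

Working with names instead of numeric IDs makes scripts easier to read and avoids querying the wrong logical database by accident.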

As shown in the routing example in Figure 1-01, APPL_DB contains a native Redis Set called ROUTE_TABLE_KEY_SET. This set tracks route-related keys, but it does not store route attributes such as next-hop or metric values. The actual routes are stored as separate Redis keys that follow SONiC’s table-and-key naming convention, where the table name and the object identifier are joined with a colon. Examples include ROUTE_TABLE:192.168.1.1/32 for a host route and ROUTE_TABLE:10.1.1.0/30 for a network route. Route attributes are stored as field-value pairs in a Redis Hash, which SONiC uses to represent structured objects in its databases.
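The naming convention above can be modeled with ordinary Python containers: a dictionary of dictionaries for the Redis Hashes and a set for ROUTE_TABLE_KEY_SET. This is a sketch, not SONiC code. The field names nexthop and ifname, the addresses, and the choice to store only the object identifier in the set are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

appl_db: Dict[str, Dict[str, str]] = {}  # key -> hash of field-value pairs
route_key_set: Set[str] = set()          # stands in for ROUTE_TABLE_KEY_SET


def make_key(table: str, object_id: str) -> str:
    """Join table name and object identifier with a colon."""
    return f"{table}:{object_id}"


def split_key(key: str) -> Tuple[str, str]:
    """Split on the first colon only, so IPv6 prefixes stay intact."""
    table, _, object_id = key.partition(":")
    return table, object_id


def add_route(prefix: str, nexthop: str, ifname: str) -> str:
    """Store route attributes in a hash and track the route in the set."""
    key = make_key("ROUTE_TABLE", prefix)
    appl_db[key] = {"nexthop": nexthop, "ifname": ifname}
    route_key_set.add(prefix)  # the set tracks the route, not its attributes
    return key


host_key = add_route("192.168.1.1/32", "10.0.0.1", "Ethernet0")
net_key = add_route("10.1.1.0/30", "10.0.0.5", "Ethernet4")
print(host_key, appl_db[host_key])
print(split_key("ROUTE_TABLE:2001:db8::/32"))
```

Splitting on the first colon matters because the object identifier itself may contain colons, as in the IPv6 example.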

The following chapters build on this foundation. First, we examine what happens when an interface changes from down to up and receives an IP address configuration. Next, we trace the internal processes that begin when a BGP session is established and the system starts handling BGP UPDATE messages. By moving from interface bring-up to control-plane route learning, you will see how configuration data and protocol state pass through the software layers until the resulting forwarding information is programmed into the switch hardware.

Figure 1-01 also includes config_db.json to indicate that persistent configuration is stored outside Redis and is loaded into CONFIG_DB during startup, while the detailed workflow is covered in later sections.
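As a rough illustration of the loading step, the sketch below flattens a nested JSON document into table-and-key entries. The sample content is invented, the function flatten_config is hypothetical, and the use of the | character as the CONFIG_DB key separator is an assumption that is not stated in this chapter.

```python
import json
from typing import Dict

# Invented sample that imitates the nested table/object/field layout.
SAMPLE_CONFIG = """
{
  "DEVICE_METADATA": {"localhost": {"hostname": "leaf-1"}},
  "PORT": {"Ethernet0": {"admin_status": "up", "mtu": "9100"}}
}
"""


def flatten_config(config_json: str, separator: str = "|") -> Dict[str, Dict[str, str]]:
    """Turn {table: {object: fields}} into {"table<sep>object": fields}."""
    tables = json.loads(config_json)
    flat: Dict[str, Dict[str, str]] = {}
    for table, entries in tables.items():
        for object_id, fields in entries.items():
            flat[f"{table}{separator}{object_id}"] = dict(fields)
    return flat


config_db = flatten_config(SAMPLE_CONFIG)
for key in sorted(config_db):
    print(key, config_db[key])
```

Each resulting entry corresponds to one hash in the database, so a change to a single port only touches a single key.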


Figure 1-01: Relationship Between Redis Logical Databases and SONiC Databases.

Monday, 4 May 2026

SONiC Part II: Deploy a SONiC Switch Clos Topology

 

Introduction

 

This chapter explains how to create and deploy a simple SONiC-based Clos topology in WSL using Containerlab. First, we open VS Code from WSL to create and edit a topology definition file. Next, we build the topology by defining nodes (SONiC switches and Linux hosts) and the links between them. Before deploying the lab, we verify the wiring with Containerlab’s built-in topology graph. Finally, we deploy the topology and validate access to the nodes using both a Linux shell and vtysh, the FRRouting command-line shell that gives access to the routing stack used by SONiC.

Phase 1: Integrate VS Code with WSL




There are a couple of ways to use VS Code with WSL. In this lab, we launch VS Code from the WSL terminal with the command code . (the dot refers to the current directory). The first time you run this command, VS Code installs the VS Code Server components inside WSL and then opens a VS Code window connected to the Linux environment. After the installation completes, running code . from any directory opens that folder directly in VS Code.

nwkt@Toni:~$ code .

Updating VS Code Server to version 034f571df509819cc10b0c8129f66ef77a542f0e

Removing previous installation...

Installing VS Code Server for Linux x64 (034f571df509819cc10b0c8129f66ef77a542f0e)

Downloading: 100%

Unpacking: 100%

Unpacked 3505 files and folders to /home/nwkt/.vscode-server/bin/034f571df509819cc10b0c8129f66ef77a542f0e.

Looking for compatibility check script at /home/nwkt/.vscode-server/bin/034f571df509819cc10b0c8129f66ef77a542f0e/bin/helpers/check-requirements.sh

Running compatibility check script

Compatibility check successful (0)

nwkt@Toni:~$

Example 2-3: Open VS Code from WSL (install VS Code Server on first run).

 

Phase 2: Create Topology File

It is a good practice to create a consistent folder structure for your lab projects. Example 2-1 shows a simple directory layout using the tree command. If tree is not installed, you can add it with sudo apt install tree.

 

nwkt@Toni:~$ tree

.

├── clos-lab

│   ├── host-config

│   └── switch-config

└── snap

 

5 directories, 0 files

 

Example 2-1: Project folder structure.

After creating the folder structure, run code . from the clos-lab directory to open VS Code in the correct working folder. In VS Code, create a new file and name it lab-1.clab.yml (or another name ending in .clab.yml). Because VS Code was opened from the correct folder, the file is saved directly under clos-lab.


Figure 2-1: VS Code: open a new file.

Next, use the Ctrl+K, M keyboard shortcut to open the language mode selection drop-down menu and select YAML.


Figure 2-2: VS Code: select language mode.

 

A Containerlab topology file defines the nodes to start (and their container images) and how those nodes are connected with links. The file begins with a lab name, for example name: nwkt-01. Containerlab uses this value as part of the container naming convention. For example, the node spine-1 is created as clab-nwkt-01-spine-1.
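The naming convention can be written as a one-line helper. The function name container_name is hypothetical, and the clab prefix is taken from the example above.

```python
def container_name(lab_name: str, node_name: str, prefix: str = "clab") -> str:
    """Compose the container name from prefix, lab name, and node name."""
    return f"{prefix}-{lab_name}-{node_name}"


for node in ("spine-1", "leaf-1", "leaf-2", "host-1", "host-2"):
    print(container_name("nwkt-01", node))
```

Knowing the full name in advance is useful because docker commands such as docker exec expect the container name, not the short node name.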

Under the topology: key, the nodes: section defines each node. In this chapter we use kind: sonic-vs with image: docker-sonic-vs:latest for the SONiC switches, and kind: linux with image: alpine:latest for the hosts. A node’s kind tells Containerlab how to boot the node and what features it supports. It also affects how interface names are interpreted for link endpoints.

When using kind: sonic-vs, Containerlab connects the container’s management interface to its management network on eth0. Data-plane interfaces start at eth1 and map to SONiC front-panel ports. For example, in a sonic-vs container eth1 maps to Ethernet0 and eth2 maps to Ethernet4. This is why the links in Example 2-2 use Linux-style names such as spine-1:eth1 and leaf-1:eth1.
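The mapping can be sketched as a small conversion function. The text only gives two data points (eth1 to Ethernet0 and eth2 to Ethernet4), so the step of four for higher interface numbers is an extrapolation and should be treated as an assumption. The function name linux_to_sonic_port is hypothetical.

```python
import re


def linux_to_sonic_port(ifname: str, step: int = 4) -> str:
    """Map a Linux data-plane interface name to a SONiC front-panel port name.

    Assumes ports are numbered in steps of `step`, which matches the two
    documented examples but is not guaranteed for every image.
    """
    match = re.fullmatch(r"eth(\d+)", ifname)
    if match is None:
        raise ValueError(f"unexpected interface name: {ifname}")
    index = int(match.group(1))
    if index == 0:
        raise ValueError("eth0 is the management interface, not a front-panel port")
    return f"Ethernet{(index - 1) * step}"


print(linux_to_sonic_port("eth1"))
print(linux_to_sonic_port("eth2"))
```

A helper like this is handy when you compare link definitions in the topology file with interface names shown inside the SONiC node.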

The links: section describes how nodes are wired together. Each link has two endpoints. For example, endpoints: ["spine-1:eth1", "leaf-1:eth1"] creates a point-to-point link between spine-1 and leaf-1 using their first data-plane interfaces.

 

name: nwkt-01

topology:

  nodes:

    spine-1:

      kind: sonic-vs

      image: docker-sonic-vs:latest

    leaf-1:

      kind: sonic-vs

      image: docker-sonic-vs:latest

    leaf-2:

      kind: sonic-vs

      image: docker-sonic-vs:latest

    host-1:

      kind: linux

      image: alpine:latest

    host-2:

      kind: linux

      image: alpine:latest

 

  links:

    # Connections for Leaf-1

    - endpoints: ["spine-1:eth1", "leaf-1:eth1"]

    - endpoints: ["leaf-1:eth2", "host-1:eth1"]

    # Connections for Leaf-2

    - endpoints: ["spine-1:eth2", "leaf-2:eth1"]

    - endpoints: ["leaf-2:eth2", "host-2:eth1"]

Example 2-2: Containerlab topology file: lab-1.clab.yml.
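Before relying on the graphical view, the wiring can also be checked with a few lines of code. The sketch below copies the nodes and links from Example 2-2 into Python data structures (it does not parse the YAML file itself) and reports unknown nodes, missing interface names, and interfaces that are used more than once. The function validate_links is hypothetical.

```python
from typing import List, Sequence, Tuple

# Nodes and links copied by hand from the topology file in Example 2-2.
NODES = ["spine-1", "leaf-1", "leaf-2", "host-1", "host-2"]
LINKS = [
    ("spine-1:eth1", "leaf-1:eth1"),
    ("leaf-1:eth2", "host-1:eth1"),
    ("spine-1:eth2", "leaf-2:eth1"),
    ("leaf-2:eth2", "host-2:eth1"),
]


def validate_links(nodes: Sequence[str], links: Sequence[Tuple[str, str]]) -> List[str]:
    """Return a list of human-readable wiring problems (empty if none)."""
    problems: List[str] = []
    used = set()
    for link in links:
        for endpoint in link:
            node, _, iface = endpoint.partition(":")
            if node not in nodes:
                problems.append(f"unknown node in endpoint {endpoint}")
            if not iface:
                problems.append(f"missing interface in endpoint {endpoint}")
            if endpoint in used:
                problems.append(f"endpoint {endpoint} is used more than once")
            used.add(endpoint)
    return problems


print(validate_links(NODES, LINKS) or "wiring looks consistent")
```

The check is intentionally simple, but it catches the most common copy-and-paste mistakes, such as reusing the same interface in two links.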


Containerlab topology files typically use the .clab.yml or .clab.yaml extension. When you run containerlab deploy without specifying a topology file, Containerlab looks for a single .clab.yml or .clab.yaml file in the current directory. If multiple matching files exist, use -t to select the desired file (for example, containerlab deploy -t lab-1.clab.yml). Using the .yml extension is common, but .yaml works as well.
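The file-selection behavior described above can be imitated in a short script. This is not Containerlab's implementation, only a sketch of the same rule: exactly one matching file is accepted, and anything else requires an explicit choice. The function names are hypothetical, and the demonstration runs in a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import List


def find_topology_files(directory: Path) -> List[Path]:
    """Return all files ending in .clab.yml or .clab.yaml, sorted by name."""
    matches = set(directory.glob("*.clab.yml")) | set(directory.glob("*.clab.yaml"))
    return sorted(matches)


def pick_topology_file(directory: Path) -> Path:
    """Return the single topology file, or raise if the choice is ambiguous."""
    matches = find_topology_files(directory)
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise FileNotFoundError("no topology file found in the directory")
    raise ValueError("multiple topology files found; select one with -t")


with tempfile.TemporaryDirectory() as tmp:
    lab_dir = Path(tmp)
    (lab_dir / "lab-1.clab.yml").touch()
    chosen = pick_topology_file(lab_dir).name

    (lab_dir / "lab-2.clab.yaml").touch()  # a second file makes it ambiguous
    try:
        pick_topology_file(lab_dir)
        ambiguous = False
    except ValueError:
        ambiguous = True

print(chosen, ambiguous)
```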

Create the topology file as shown in Example 2-2. VS Code provides indentation guides and syntax highlighting for YAML, which makes the file easier to read and helps you avoid indentation errors. Save the file in the clos-lab folder.


Figure 2-3: VS Code YAML editing with indentation and syntax highlighting.

 

 

nwkt@Toni:~$ tree

.

├── clos-lab

│   ├── host-config

│   ├── lab-1.clab.yml

│   └── switch-config

└── snap

 

5 directories, 1 file

 

Example 2-4: Folder and file structure.

 

Phase 3: Verify Wiring



Before deploying the topology, it is a good idea to verify that the wiring is correct. Containerlab includes a built-in visualization tool that generates a graphical representation of the topology. The command sudo containerlab graph -t lab-1.clab.yml starts a small local web server (by default on port 50080) and prints one or more URLs you can open in a browser. This is a useful sanity check before deployment, for example, to confirm that spine-1 is connected to the correct interface on leaf-1.

 

nwkt@Toni:~/clos-lab$ sudo containerlab graph -t lab-1.clab.yml

13:57:22 INFO Parsing & checking topology file=lab-1.clab.yml

13:57:22 INFO Serving topology graph

  addresses=

  │   http://10.255.255.254:50080

  │   http://172.25.109.88:50080

  │   http://172.17.0.1:50080

  │   http://172.20.20.1:50080

  │   http://[3fff:172:20:20::1]:50080

Example 2-5: Generate a graphical topology view.

Figure 2-4: Graphical topology view (URL http://172.25.109.88:50080).

 

Phase 4: Deploy Topology File



After saving lab-1.clab.yml, deploy the lab with sudo containerlab deploy (or explicitly specify the file with -t lab-1.clab.yml). Containerlab parses the topology file, creates a lab directory (clab-<lab-name>), starts the containers, and connects them with the defined links. In the summary table, the Name column shows the full container names (used with docker commands), and the IPv4/6 Address column shows the management IP addresses assigned on the Containerlab management network.

 

nwkt@Toni:~/clos-lab$ sudo containerlab deploy

11:54:41 INFO Containerlab started version=0.74.3

11:54:41 INFO Parsing & checking topology file=lab-1.clab.yml

11:54:41 INFO Creating lab directory path=/home/nwkt/clos-lab/clab-nwkt-01

11:54:41 INFO Creating container name=host-1

11:54:41 INFO Creating container name=host-2

11:54:41 INFO Creating container name=leaf-1

11:54:41 INFO Creating container name=leaf-2

11:54:41 INFO Creating container name=spine-1

11:54:42 INFO Created link: spine-1:eth1 ▪┄┄ leaf-1:eth1

11:54:42 INFO Created link: leaf-1:eth2 ▪┄┄ host-1:eth1

11:54:43 INFO Created link: spine-1:eth2 ▪┄┄ leaf-2:eth1

11:54:43 INFO Created link: leaf-2:eth2 ▪┄┄ host-2:eth1

11:54:43 INFO Adding host entries path=/etc/hosts

11:54:43 INFO Adding SSH config for nodes path=/etc/ssh/ssh_config.d/clab-nwkt-01.conf

11:54:43 INFO containerlab version

  🎉=

  │ A newer containerlab version (0.75.0) is available!

  │ Release notes: https://containerlab.dev/rn/0.75/

  │ Run 'clab version upgrade' or see https://containerlab.dev/install/ for other installation options.

──────────────────────────────────────────────────────────────────────────

│         Name         │       Kind/Image       │  State  │   IPv4/6 Address  │

──────────────────────────────────────────────────────────────────────────

│ clab-nwkt-01-host-1  │ linux                  │ running │ 172.20.20.2       │

│                      │ alpine:latest          │         │ 3fff:172:20:20::2 │

──────────────────────────────────────────────────────────────────────────

│ clab-nwkt-01-host-2  │ linux                  │ running │ 172.20.20.6       │

│                      │ alpine:latest          │         │ 3fff:172:20:20::6 │

──────────────────────────────────────────────────────────────────────────

│ clab-nwkt-01-leaf-1  │ sonic-vs               │ running │ 172.20.20.5       │

│                      │ docker-sonic-vs:latest │         │ 3fff:172:20:20::5 │

──────────────────────────────────────────────────────────────────────────

│ clab-nwkt-01-leaf-2  │ sonic-vs               │ running │ 172.20.20.4       │

│                      │ docker-sonic-vs:latest │         │ 3fff:172:20:20::4 │

──────────────────────────────────────────────────────────────────────────

│ clab-nwkt-01-spine-1 │ sonic-vs               │ running │ 172.20.20.3       │

│                      │ docker-sonic-vs:latest │         │ 3fff:172:20:20::3 │

──────────────────────────────────────────────────────────────────────────

nwkt@Toni:~/clos-lab$

Example 2-6: Topology deployment output.


After deploying the topology, you can use tree to review the lab directory and related files created during the deployment.


nwkt@Toni:~$ tree

.

├── clos-lab

│   ├── clab-nwkt-01

│   │   ├── ansible-inventory.yml

│   │   ├── authorized_keys

│   │   ├── leaf-1

│   │   ├── leaf-2

│   │   ├── nornir-simple-inventory.yml

│   │   ├── spine-1

│   │   └── topology-data.json

│   ├── host-config

│   ├── lab-1.clab.yml

│   └── switch-config

└── snap

 

9 directories, 5 files

Example 2-7: Updated folder structure after deployment.


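Example 2-8 shows how to verify the status of the containers using docker ps -a. The -a flag also lists stopped containers, which is why the unrelated, exited container adoring_brattain appears in the output. The --format option prints a readable table with the container ID, name, and status.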


nwkt@Toni:~$ docker ps -a --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"

CONTAINER ID   NAMES                  STATUS

c67cbb5fe8e8   clab-nwkt-01-host-1    Up 36 minutes

1696b2865f8e   clab-nwkt-01-host-2    Up 36 minutes

b7517c417137   clab-nwkt-01-leaf-2    Up 36 minutes

810267f0cf2b   clab-nwkt-01-leaf-1    Up 36 minutes

60c37f941005   clab-nwkt-01-spine-1   Up 36 minutes

0c01df3ef211   adoring_brattain       Exited (0) 6 days ago

Example 2-8: List containers and verify status.
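The same check can be automated by parsing the table. The sketch below works on the output copied from Example 2-8 and does not call docker itself. The function lab_container_status is hypothetical.

```python
from typing import Dict

# Output copied from Example 2-8.
DOCKER_PS_OUTPUT = """\
CONTAINER ID   NAMES                  STATUS
c67cbb5fe8e8   clab-nwkt-01-host-1    Up 36 minutes
1696b2865f8e   clab-nwkt-01-host-2    Up 36 minutes
b7517c417137   clab-nwkt-01-leaf-2    Up 36 minutes
810267f0cf2b   clab-nwkt-01-leaf-1    Up 36 minutes
60c37f941005   clab-nwkt-01-spine-1   Up 36 minutes
0c01df3ef211   adoring_brattain       Exited (0) 6 days ago
"""


def lab_container_status(output: str, prefix: str = "clab-nwkt-01-") -> Dict[str, bool]:
    """Map each lab container name to True if its status starts with 'Up'."""
    status: Dict[str, bool] = {}
    for line in output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)       # ID, name, and the rest as status
        if len(parts) < 3:
            continue
        _container_id, name, state = parts
        if name.startswith(prefix):
            status[name] = state.startswith("Up")
    return status


status = lab_container_status(DOCKER_PS_OUTPUT)
print(status)
print("all lab containers running:", all(status.values()))
```

Filtering on the lab prefix ignores unrelated containers, such as the exited one in the last row.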

 

Phase 5: Test Connection – Log In to Nodes



As a final step, verify that you can access the nodes. To open a Linux shell inside a node container, run docker exec -it clab-nwkt-01-leaf-1 bash. From the shell, run vtysh to start the FRRouting CLI, which provides access to the routing stack used by SONiC. You can also start vtysh directly with docker exec -it clab-nwkt-01-leaf-1 vtysh.

 

nwkt@Toni:~$ docker exec -it clab-nwkt-01-leaf-1 bash

root@leaf-1:/#

root@leaf-1:/#

root@leaf-1:/# vtysh

 

Hello, this is FRRouting (version 10.0.1).

Copyright 1996-2005 Kunihiro Ishiguro, et al.

   <snipped for brevity>

leaf-1#

leaf-1#

leaf-1# sh run

Building configuration...

 

Current configuration:

!

frr version 10.0.1

frr defaults traditional

hostname leaf-1

domainname localdomain

no ipv6 forwarding

no zebra nexthop kernel enable

fpm address 127.0.0.1

no fpm use-next-hop-groups

service integrated-vtysh-config

!

ip nht resolve-via-default

!

ipv6 nht resolve-via-default

!

end

leaf-1#

Example 2-9: Log in to a node and open the FRRouting CLI (vtysh).