Wednesday, 8 October 2025

Ultra Ethernet: Fabric Object - What it is and How it is created

Fabric Object


Fabric Object Overview

In libfabric, a fabric represents a logical network domain, a group of hardware and software resources that can communicate with each other through a shared network. All network ports that can exchange traffic belong to the same fabric domain. In practice, a fabric corresponds to one interconnected network, such as an Ethernet or Ultra Ethernet Transport (UET) fabric.

A good way to think about a fabric is to compare it to a Virtual Data Center (VDC) in a cloud environment. Just as a VDC groups together compute, storage, and networking resources into an isolated logical unit, a libfabric fabric groups together network interfaces, addresses, and transport resources that belong to the same communication context. Multiple fabrics can exist on the same system, just like multiple VDCs can operate independently within one cloud infrastructure.

The fabric object acts as the top-level context for all communication. Before an application can create domains, endpoints, or memory regions, it must first open a fabric using the fi_fabric() call. This creates the foundation for all other libfabric objects.

Each fabric is associated with a specific provider (for example, libfabric-uet), which defines how the fabric interacts with the underlying hardware and network stack. Once created, the fabric object maintains provider-specific state, hardware mappings, and resource visibility for all subsequent objects created under it.

For the application, the fabric object is simply a handle to a network domain that other libfabric calls will use. For the provider, it is the root structure that connects all internal data structures and controls how communication resources are managed within the same network fabric.

The following section explains how the application requests a fabric object and how the provider and libfabric core work together to create and publish it.


Creating Fabric Object

After the UET provider populates the fi_info structures for each NIC/port combination during discovery, the application can begin creating objects. It first consults the fi_info list to identify the entry that best matches its requirements. Figure 4-3 shows an illustrative example of how the application calls fi_fabric() to request a fabric object based on the fi_fabric_attr sub-structure of fi_info[0] corresponding to NIC Eth0.
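
The selection step can be sketched in plain C. This is a simplified model rather than real libfabric code: the `model_` structures mirror only a few fields of `fi_info` and `fi_fabric_attr`, and the `nic_name` field, the helper `pick_info()`, and all string values are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a few fields of libfabric's fi_fabric_attr. */
struct model_fabric_attr {
    const char *name;       /* fabric name reported by the provider */
    const char *prov_name;  /* provider that owns the fabric */
};

/* Simplified stand-in for fi_info; the real structure has many more members. */
struct model_info {
    struct model_info *next;                /* discovery results form a linked list */
    const char *nic_name;                   /* hypothetical: NIC/port this entry describes */
    struct model_fabric_attr *fabric_attr;  /* attributes later passed to fi_fabric() */
};

/* Return the first entry offered by prov_name on nic_name, or NULL if none matches. */
static struct model_info *pick_info(struct model_info *list,
                                    const char *prov_name,
                                    const char *nic_name)
{
    assert(prov_name != NULL && nic_name != NULL);

    for (struct model_info *cur = list; cur != NULL; cur = cur->next) {
        if (strcmp(cur->fabric_attr->prov_name, prov_name) == 0 &&
            strcmp(cur->nic_name, nic_name) == 0)
            return cur;
    }
    return NULL;
}
```

In a real application, the `fabric_attr` member of the chosen entry is what would be handed to fi_fabric() as its attribute argument.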

Once the application issues the API call, the request is handed over to the libfabric core, which acts as a lightweight coordinator and translator between the application and the provider.

The UET provider receives the request via the uet_fabric function pointer. It may first verify that the requested fabric name is still valid and supported by NIC 0 before consulting the fi_fabric_attr structure specified in the API call. The provider then creates the fabric object, defining its type, fabric ID (fid), provider, and fabric name.
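
The hand-off from the core to the provider can be modeled as a lookup in a provider registry followed by a call through a function pointer. The sketch below is a self-contained approximation, not the libfabric implementation: the registry, the `model_` types, the error codes, and the fabric name "cluster-fabric" are all hypothetical, and the provider returns a static object only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes for this model (not libfabric's). */
enum { MODEL_OK = 0, MODEL_ENOPROV = -1, MODEL_EBADNAME = -2 };

struct model_fabric_attr { const char *name; const char *prov_name; };
struct model_fabric      { const char *name; const char *prov_name; };

/* Shape of a provider's fabric entry point, modeled on the uet_fabric pointer. */
typedef int (*fabric_fn)(const struct model_fabric_attr *attr,
                         struct model_fabric **fabric);

struct model_provider { const char *name; fabric_fn fabric; };

/* Provider side: check that the requested fabric name is still valid, then
 * publish the object. A real provider would allocate and initialize it. */
static int uet_fabric(const struct model_fabric_attr *attr,
                      struct model_fabric **fabric)
{
    static struct model_fabric cluster = { "cluster-fabric", "uet" };

    if (attr->name == NULL || strcmp(attr->name, cluster.name) != 0)
        return MODEL_EBADNAME;
    *fabric = &cluster;
    return MODEL_OK;
}

static const struct model_provider registry[] = { { "uet", uet_fabric } };

/* Core side: find the provider named in the attributes and forward the request. */
static int model_fi_fabric(const struct model_fabric_attr *attr,
                           struct model_fabric **fabric)
{
    assert(attr != NULL && fabric != NULL);

    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; i++) {
        if (attr->prov_name != NULL &&
            strcmp(registry[i].name, attr->prov_name) == 0)
            return registry[i].fabric(attr, fabric);
    }
    return MODEL_ENOPROV;
}
```

The core function does no fabric-specific work of its own here, which matches its description as a lightweight coordinator: all validation and object creation happen behind the provider's function pointer.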


Figure 4-3: Objects Creation Process – Fabric for Cluster.

Memory Allocation and Object Publication


In libfabric, memory for fabric objects — as well as for all other objects created later, such as domains, endpoints, and memory regions — is allocated by the provider, not the libfabric core. The provider is responsible for creating the actual data structures that represent these objects and for mapping them to underlying hardware resources. This ensures that the object’s state, capabilities, and hardware associations are maintained correctly and consistently.

When the provider creates a fabric object, it allocates memory in its own address space and initializes all internal fields, including type, fabric ID (fid), provider-specific metadata, and associated NIC resources. Once the object is fully initialized, the provider returns a pointer to the libfabric core. This pointer effectively tells the core the location of the object in memory.

The handle that reaches the application is a pointer to fid_fabric, a lightweight descriptor that carries the object's class, the application-supplied context, and the table of operations that can be invoked on the fabric. In upstream libfabric providers, this descriptor is typically embedded in the provider's own fabric structure, so the core can pass the pointer through to the application and route subsequent API calls to the provider without duplicating the object.

Finally, the libfabric core returns the fid_fabric handle to the application. From the application’s perspective, this handle uniquely identifies the fabric object, while internally the provider maintains the persistent state and hardware mappings.
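
One common way to realize this split between a public handle and provider-owned state in C is to embed the handle structure inside the provider's private object and recover the enclosing object with pointer arithmetic. The sketch below illustrates that pattern under stated assumptions: `model_fid_fabric` is a reduced stand-in for fid_fabric, and `uet_fabric_obj`, its fields, and the three helper functions are hypothetical names, not part of any real UET provider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_CLASS_FABRIC 1   /* hypothetical class tag for fabric objects */

/* Reduced stand-in for the application-visible handle. */
struct model_fid_fabric {
    int fclass;      /* object type tag */
    void *context;   /* opaque application context */
};

/* Hypothetical provider-private object; the handle is embedded, not copied. */
struct uet_fabric_obj {
    struct model_fid_fabric handle;
    const char *name;   /* fabric name */
    int nic_index;      /* NIC this fabric was opened through */
};

/* Recover the enclosing provider object from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Provider allocates and initializes the object, then exposes only the handle. */
static struct model_fid_fabric *uet_fabric_open(const char *name, int nic_index,
                                                void *context)
{
    struct uet_fabric_obj *obj = calloc(1, sizeof *obj);

    if (obj == NULL)
        return NULL;
    obj->handle.fclass = MODEL_CLASS_FABRIC;
    obj->handle.context = context;
    obj->name = name;
    obj->nic_index = nic_index;
    return &obj->handle;
}

/* A later call receives only the handle and maps it back to provider state. */
static const char *uet_fabric_name(struct model_fid_fabric *handle)
{
    assert(handle != NULL && handle->fclass == MODEL_CLASS_FABRIC);
    return container_of(handle, struct uet_fabric_obj, handle)->name;
}

/* Release the whole provider object through its handle. */
static void uet_fabric_close(struct model_fid_fabric *handle)
{
    free(container_of(handle, struct uet_fabric_obj, handle));
}
```

With this layout the application never sees the provider's fields, yet every later call can reach them from the handle alone, so no second copy of the object is needed.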

