Phase 1: Application (Discovery & choice)
After the UET provider has populated fi_info structures for each NIC/port combination during discovery, the application can begin the object creation process. It first consults the in-memory fi_info list to identify the entry that best matches its requirements. Each fi_info acts as a self-contained “capability snapshot,” describing one possible combination of NIC, port, and transport mode. It contains nested attribute structures describing fabric, domain, and endpoint capabilities: fi_fabric_attr (fabric name, provider identifier, version information), fi_domain_attr (memory registration mode, key details, domain capabilities), and fi_ep_attr (endpoint type, reliable versus unreliable semantics, size limits, and supported capabilities). The application examines the returned entries and selects the fi_info that satisfies its needs (for example: provider == "uet", fabric name == "UET", required capabilities, reliable transport, or a specific memory registration mode). The chosen fi_info then provides the attributes, effectively serving as hints, that the application passes into subsequent creation calls such as fi_fabric(), fi_domain(), and fi_endpoint().
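The selection step can be sketched as a walk over a linked list. In libfabric itself the fi_info entries are chained through a next pointer and carry fabric_attr, domain_attr, and ep_attr pointers; the sketch below replaces them with a flattened, hypothetical sketch_info structure so that it compiles without the libfabric headers. All sketch_ names are assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the libfabric structures (hypothetical layout). */
enum sketch_ep_type { SKETCH_EP_MSG, SKETCH_EP_RDM, SKETCH_EP_DGRAM };

struct sketch_info {
    struct sketch_info *next;      /* next entry in the discovery list */
    const char *prov_name;         /* provider identifier, e.g. "uet" */
    const char *fabric_name;       /* fabric name, e.g. "UET" */
    enum sketch_ep_type ep_type;   /* endpoint type reported for this entry */
    unsigned long caps;            /* capability bits reported for this entry */
};

/* Return the first entry that matches provider, fabric name, and endpoint
 * type and that offers every required capability bit; NULL if none does. */
const struct sketch_info *
sketch_select_info(const struct sketch_info *list, const char *prov,
                   const char *fabric, enum sketch_ep_type ep_type,
                   unsigned long required_caps)
{
    for (const struct sketch_info *cur = list; cur != NULL; cur = cur->next) {
        if (strcmp(cur->prov_name, prov) != 0)
            continue;
        if (strcmp(cur->fabric_name, fabric) != 0)
            continue;
        if (cur->ep_type != ep_type)
            continue;
        if ((cur->caps & required_caps) != required_caps)
            continue;
        return cur;
    }
    return NULL;
}
```

A caller would pass the head of the discovery list together with its requirements and treat a NULL result as "no suitable NIC/port combination was reported".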
Phase 2: Libfabric Core (dispatch & wiring)
When the application calls fi_fabric(), the core forwards this request to the corresponding provider’s fabric entry point. In this way, the fi_info produced during discovery effectively becomes the configuration input for object creation.
The core’s role is intentionally lightweight: it matches the application’s selected fi_info to the appropriate provider implementation and invokes the provider callback for fabrics, domains, and endpoints. Throughout this process, the fi_info acts as a context carrier, containing the provider identifier, fabric name, and attribute templates for domains and endpoints. The core passes these attributes directly to the provider during creation, ensuring that the provider has all the information necessary to map the requested objects to the correct NIC and transport configuration.
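The dispatch role described above can be pictured as a lookup by provider name followed by a call through a function pointer. The following is a minimal sketch under that assumption; the sketch_ types, the provider table, and the -1 return value are hypothetical and do not reflect the actual libfabric core internals.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real core and provider structures differ. */
struct sketch_fabric_attr {
    const char *prov_name;   /* provider identifier carried by the fi_info */
    const char *name;        /* fabric name */
};

struct sketch_fabric {
    const char *owner;       /* provider that created the object */
    const char *name;        /* fabric name copied from the attributes */
};

/* Signature of a provider's fabric entry point in this sketch. */
typedef int (*sketch_fabric_fn)(const struct sketch_fabric_attr *attr,
                                struct sketch_fabric *out);

struct sketch_provider {
    const char *name;
    sketch_fabric_fn fabric;
};

/* Example provider callback: records which provider built the fabric. */
int sketch_uet_fabric(const struct sketch_fabric_attr *attr,
                      struct sketch_fabric *out)
{
    out->owner = "uet";
    out->name = attr->name;
    return 0;
}

/* Core-side dispatch: find the provider named in the attributes and
 * forward the call unchanged. Returns -1 if no provider matches. */
int sketch_core_fabric(const struct sketch_provider *provs, size_t nprovs,
                       const struct sketch_fabric_attr *attr,
                       struct sketch_fabric *out)
{
    for (size_t i = 0; i < nprovs; i++) {
        if (strcmp(provs[i].name, attr->prov_name) == 0)
            return provs[i].fabric(attr, out);
    }
    return -1;
}
```

The core adds no logic of its own here: the attributes reach the provider exactly as the application supplied them, which is what keeps this layer lightweight.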
Phase 3: UET Provider
When the application invokes fi_fabric(), passing the fi_fabric_attr obtained from the chosen fi_info, the call is routed to the UET provider’s uet_fabric() entry point for fabric creation. The provider treats the attributes contained within the chosen fi_info as the authoritative configuration for fabric creation. Because each fi_fabric_attr originates from a NIC-specific fi_info, the provider immediately knows which physical NIC and port are associated with the requested fabric object.
The provider uses the fi_fabric_attr to determine which interfaces belong to the requested fabric, the provider-specific capabilities that must be supported, and any optional flags supplied by the application. It then allocates and initializes internal data structures to represent the fabric object, mapping it to the underlying NIC and driver resources described by the discovery snapshot.
During creation, the provider validates that the requested fabric can be supported by the current hardware and driver state. If the NIC or configuration has changed since discovery—for example, if the NIC is unavailable or the requested capabilities are no longer supported—the provider returns an error, preventing creation of an invalid or unsupported fabric object. Otherwise, the provider completes the fabric initialization, making it ready for subsequent domain and endpoint creation calls.
By relying exclusively on the fi_info snapshot from discovery, the provider ensures that the fabric object is created consistently and deterministically, reflecting the capabilities and constraints reported to the application during the discovery phase.
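The validation step can be illustrated by comparing the discovery snapshot with the NIC's current state. This is only a sketch: both structures and the choice of -ENODEV and -ENOTSUP as error codes are assumptions made for illustration, not the UET provider's actual checks.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical view of what discovery recorded for one NIC/port. */
struct sketch_snapshot {
    int nic_index;           /* which NIC the snapshot describes */
    unsigned long caps;      /* capabilities advertised at discovery time */
};

/* Hypothetical view of the NIC's current state as the driver reports it. */
struct sketch_nic_state {
    bool present;            /* false if the NIC is no longer available */
    unsigned long caps;      /* capabilities available right now */
};

/* Return 0 if the fabric can still be created from the snapshot,
 * -ENODEV if the NIC is gone, -ENOTSUP if an advertised capability
 * is no longer available. */
int sketch_validate_fabric(const struct sketch_snapshot *snap,
                           const struct sketch_nic_state *now)
{
    if (!now->present)
        return -ENODEV;
    if ((now->caps & snap->caps) != snap->caps)
        return -ENOTSUP;
    return 0;
}
```

A NIC that has gained capabilities since discovery still passes, because the check only requires that everything promised in the snapshot remains available.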
Phase 4: Libfabric Core (Fabric Object Publication)
Once the UET provider successfully creates the fabric object, the libfabric core generates the corresponding fid_fabric handle, which serves as the application-visible representation of the fabric. The term FID stands for Fabric Identifier, a unique identifier assigned to each libfabric object. All subsequent libfabric objects created within the context of this fabric — including domains, endpoints, and memory regions — are prefixed with this FID to maintain object hierarchy and enable internal tracking.
The fid_fabric structure contains metadata about the fabric, such as the associated provider, the fabric name, and internal pointers to provider-specific data structures. It acts as a lightweight descriptor for the application, while the actual resources and state remain managed by the provider.
The libfabric core keeps the fid_fabric in memory, typically within internal libfabric tables that track all active fabric objects. This allows the core to efficiently validate future API calls that reference the fabric, to maintain object hierarchies, and to route creation requests (e.g., for domains or endpoints) to the correct provider instance. Because the FID resides in RAM, operations using fid_fabric are fast, and the handle itself is transient: it exists only for the lifetime of the process. The core relies on the provider to maintain the persistent state and hardware mappings associated with the fabric.
By publishing the fabric as a fid_fabric object, libfabric establishes a clear and consistent handle for the application. This handle allows the application to reference the fabric unambiguously in subsequent creation calls, while preserving the mapping between the abstract libfabric object and the underlying NIC resources managed by the provider.
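The publication and validation steps described in this phase can be sketched as a small table of active handles. The fixed-size table and all sketch_ names are hypothetical; the sketch follows the description in this article and makes no claim about how the libfabric core actually tracks its objects.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FABRICS 8

/* Hypothetical, reduced fabric handle. */
struct sketch_fabric {
    const char *prov_name;
    const char *name;
};

/* Hypothetical core-side table of published fabric handles. */
struct sketch_table {
    const struct sketch_fabric *slots[SKETCH_MAX_FABRICS];
};

/* Publish a handle in the first free slot; false when the table is full. */
bool sketch_publish(struct sketch_table *t, const struct sketch_fabric *f)
{
    for (size_t i = 0; i < SKETCH_MAX_FABRICS; i++) {
        if (t->slots[i] == NULL) {
            t->slots[i] = f;
            return true;
        }
    }
    return false;
}

/* Validate a handle passed to a later call such as domain creation. */
bool sketch_is_published(const struct sketch_table *t,
                         const struct sketch_fabric *f)
{
    if (f == NULL)
        return false;
    for (size_t i = 0; i < SKETCH_MAX_FABRICS; i++) {
        if (t->slots[i] == f)
            return true;
    }
    return false;
}
```

A later creation call would first check sketch_is_published() and reject any handle that the core never handed out.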
Example fid_fabric Object
fid_fabric {
fid_type : FI_FABRIC
fid : 0x0001ABCD
provider : "uet"
fabric_name : "UET-Fabric1"
version : 1.0
state : ACTIVE
ref_count : 1
provider_data : <pointer>
creation_flags : 0x0
timestamp : 1690000000
}
Explanation of fields:
fid_type: Identifies the type of object within libfabric and distinguishes fabric objects (FI_FABRIC) from other object types such as domains or endpoints. This allows both the libfabric core and the provider to validate and correctly handle API calls, object creation, and destruction.
fid: A unique Fabric Identifier assigned by libfabric, serving as the primary identifier for the fabric object. All child objects, including domains, endpoints, and memory regions, inherit this FID as a prefix to maintain hierarchy and uniqueness, enabling the core to validate object references and route creation calls to the correct provider instance.
provider: Indicates the name of the provider managing this fabric object, such as "uet". It associates the fabric with the underlying hardware implementation and ensures that all subsequent calls for child objects are dispatched to the correct provider.
fabric_name: Contains the human-readable name of the fabric, selected during discovery or by the application, for example "UET-Fabric1". This name allows applications to identify the fabric among multiple available options and is used as a selection criterion during both discovery (fi_getinfo) and creation (fi_fabric).
version: Specifies the provider or fabric specification version and ensures compatibility between the application and provider. It can be used for logging, debugging, or runtime checks to verify that the fabric supports the required feature set.
state: Tracks the current lifecycle status of the fabric object, indicating whether it is active, ready for use, or destroyed. Both the libfabric core and provider validate this field before allowing any operations on the fabric.
ref_count: Maintains a reference counter for the object, preventing it from being destroyed while still in use by the application or other libfabric structures. This counter is incremented during creation or when child objects reference the fabric and decremented when objects are released or destroyed.
provider_data: Contains an internal pointer to provider-managed structures, including NIC mappings, hardware handles, and configuration details. This field is only accessed by the provider; the application interacts with the fabric through the fid_fabric handle.
creation_flags: Contains optional flags provided by the application during fi_fabric() creation, allowing customization of the fabric initialization process, such as enabling non-default modes or debug options.
timestamp: An optional value indicating when the fabric was created. It is useful for debugging, logging, and performance tracking, helping to correlate fabric initialization with other libfabric objects and operations.
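The interplay of the state and ref_count fields can be sketched as follows. The structure mirrors a subset of the example object above and is purely illustrative; it is not the real struct fid_fabric, and the sketch_ functions are hypothetical. In libfabric itself, closing an object that still has dependent objects is documented to fail with -FI_EBUSY; the sketch uses the plain errno value EBUSY to stay self-contained.

```c
#include <errno.h>
#include <stddef.h>

enum sketch_state { SKETCH_ACTIVE, SKETCH_DESTROYED };

/* Illustrative model of the example object; not the real struct fid_fabric. */
struct sketch_fid_fabric {
    const char *provider;
    const char *fabric_name;
    enum sketch_state state;
    unsigned int ref_count;   /* starts at 1 for the application's handle */
    void *provider_data;      /* opaque to the application */
};

/* A child object (domain, endpoint, ...) takes a reference on creation. */
int sketch_fabric_get(struct sketch_fid_fabric *f)
{
    if (f->state != SKETCH_ACTIVE)
        return -EINVAL;
    f->ref_count++;
    return 0;
}

/* A child object drops its reference when it is closed. The application's
 * own reference is only released by sketch_fabric_close(). */
void sketch_fabric_put(struct sketch_fid_fabric *f)
{
    if (f->ref_count > 1)
        f->ref_count--;
}

/* Closing succeeds only when the application's handle is the last reference. */
int sketch_fabric_close(struct sketch_fid_fabric *f)
{
    if (f->state != SKETCH_ACTIVE)
        return -EINVAL;
    if (f->ref_count > 1)
        return -EBUSY;
    f->ref_count = 0;
    f->state = SKETCH_DESTROYED;
    return 0;
}
```

With this model, an application that tries to close the fabric while a domain still holds a reference gets -EBUSY and must close the domain first.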
https://nwktimes.blogspot.com/2025/09/ultra-ethernet-resource-initialization.html