Event Queue Creation (fi_eq_open)
Phase 1: Application – Request & Definition
The purpose of this phase is to specify the type, size, and capabilities of the Event Queue (EQ) your application needs. Event queues are used to report events associated with control operations. They can be linked to memory registration, address vectors, connection management, and fabric- or domain-level events. Reported events are either associated with a requested operation or affiliated with a call that registers for specific types of events, such as listening for connection requests. By preparing a struct fi_eq_attr, the application describes exactly what it needs so the provider can allocate the EQ properly.
In addition to basic properties like .size (the minimum number of events the queue must hold) and .wait_obj (how the application waits for events), the .flags field can request specific EQ capabilities. This walkthrough uses three flags:
- FI_WRITE: Requests support for user-inserted events via fi_eq_write(). If this flag is set, the provider must support fi_eq_write(); if it is not set, the application may not call it.
- FI_REMOTE_WRITE: Requests that remote write completions be reported to this EQ.
- FI_RMA: Requests that Remote Memory Access (RMA) events, such as RMA completions, be delivered to this EQ.
Of these, only FI_WRITE is listed as an EQ flag in the fi_eq(3) man page (alongside FI_AFFINITY). FI_REMOTE_WRITE and FI_RMA are normally capability and access bits negotiated through fi_info, so whether a provider accepts them on an EQ is provider-specific.
Flags are encoded as a bitmask, so multiple capabilities can be requested simultaneously using bitwise OR.
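The bitmask idea can be sketched in a few lines of plain C. The DEMO_ constants and both helper functions below are hypothetical and exist only for this illustration; the real flag values are defined in the libfabric headers and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. The real values
 * of FI_WRITE, FI_REMOTE_WRITE and FI_RMA come from <rdma/fabric.h>. */
#define DEMO_FI_WRITE        (UINT64_C(1) << 0)
#define DEMO_FI_REMOTE_WRITE (UINT64_C(1) << 1)
#define DEMO_FI_RMA          (UINT64_C(1) << 2)

/* True when every bit in `wanted` is also set in `flags`. */
static bool demo_has_all(uint64_t flags, uint64_t wanted)
{
    return (flags & wanted) == wanted;
}

/* Combine two capabilities into one request mask with bitwise OR. */
static uint64_t demo_request_mask(void)
{
    uint64_t mask = DEMO_FI_WRITE | DEMO_FI_RMA;

    /* Both requested bits must be present in the combined mask. */
    assert(demo_has_all(mask, DEMO_FI_WRITE | DEMO_FI_RMA));
    return mask;
}
```

Each capability occupies its own bit, so OR-ing adds a capability without disturbing the others, and a single AND-and-compare tells whether a given set of capabilities is present.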
Example API Call:
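#include <rdma/fi_domain.h>   /* header named in the fi_eq(3) synopsis */

struct fi_eq_attr eq_attr = {
    .size     = 1024,                                /* requested queue depth */
    .wait_obj = FI_WAIT_FD,                          /* wait via a file descriptor */
    .flags    = FI_WRITE | FI_REMOTE_WRITE | FI_RMA, /* requested capabilities */
};
struct fid_eq *eq;
/* fi_eq_open() takes the fabric object (struct fid_fabric *), not a domain;
 * `fabric` is assumed to have been opened earlier with fi_fabric(). */
int ret = fi_eq_open(fabric, &eq_attr, &eq, NULL);
if (ret) {
    /* Failure: ret is a negative fabric errno value and eq is not usable. */
}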
Explanation of the fields in this example:
.size = 1024: Requests an EQ that can hold at least 1024 events. This defines the queue depth, i.e., how many events the provider can buffer before the application consumes them.
.wait_obj = FI_WAIT_FD: Specifies the mechanism the application will use to wait for events. FI_WAIT_FD means the EQ provides a file descriptor that the application can poll or select on, integrating event waiting into standard OS I/O mechanisms. Other options include FI_WAIT_NONE for busy polling or FI_WAIT_SET to attach the EQ to a wait set.
.flags = FI_WRITE | FI_REMOTE_WRITE | FI_RMA: This field is a bitmask specifying which capabilities the application expects the Event Queue to support. FI_WRITE allows user-inserted events, FI_REMOTE_WRITE requests notifications for remote write completions, and FI_RMA requests notifications for RMA operations. The provider checks these flags against what it can support on the parent fabric object (fid_fabric). If a requested capability is not available, fi_eq_open() fails and returns a negative error code. The example writes the mask with the symbolic flag names combined by bitwise OR, rather than as a raw numeric value, for clarity.
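To make the FI_WAIT_FD option more concrete, here is a minimal sketch of waiting on a file descriptor with poll(). It does not call libfabric: a POSIX pipe stands in for the EQ, and its read end plays the role of the descriptor an application would normally retrieve from the EQ with fi_control(&eq->fid, FI_GETWAIT, &fd). The helper names demo_wait_readable and demo_signal_and_wait are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on poll() error. */
static int demo_wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Stand-in for an EQ: the pipe's read end acts as the wait descriptor.
 * Returns 0 if the descriptor was idle before the write and readable
 * after it, -1 otherwise. */
static int demo_signal_and_wait(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    int before = demo_wait_readable(fds[0], 0);    /* nothing queued yet */
    ssize_t written = write(fds[1], "e", 1);       /* an "event" arrives */
    int after = demo_wait_readable(fds[0], 100);   /* now readable */

    close(fds[0]);
    close(fds[1]);

    assert(written == 1);
    return (before == 0 && after == 1) ? 0 : -1;
}
```

Because the wait object is an ordinary descriptor, it can sit in the same poll() or select() set as sockets and timers, which is the main reason to choose FI_WAIT_FD over busy polling.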
Phase 2: Provider – Validation & Limits Check
The purpose of this phase is to ensure that the requested EQ can be supported by the provider. The provider validates the fi_eq_attr structure against the capabilities it advertised in the fi_info structure returned during discovery. Specifically, the .flags bitmask is checked against fi_info->caps, and each requested capability (FI_WRITE, FI_REMOTE_WRITE, FI_RMA) must be supported. Other checks cover resource limits, such as the maximum number of EQs and the maximum queue depth. If any requested flag or attribute exceeds what the provider supports, fi_eq_open() fails with a negative error code and no EQ is created.
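The checks described in this phase can be sketched as a small validation function. This is not provider code: the demo_ structures, the limit fields, and the choice of error codes are all assumptions made for illustration (real providers return libfabric's own FI_ error values and keep their limits in provider-internal state).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of fi_eq_attr that get validated. */
struct demo_eq_attr {
    size_t   size;    /* requested queue depth */
    uint64_t flags;   /* requested capability bits */
};

/* Hypothetical provider limits; real providers track these internally. */
struct demo_limits {
    uint64_t supported_flags;  /* capability bits the provider offers */
    size_t   max_eq_size;      /* deepest queue it can allocate */
    size_t   max_eq_count;     /* how many EQs may exist at once */
    size_t   open_eq_count;    /* how many EQs are open right now */
};

/* Returns 0 if the request fits the limits, a negative errno otherwise. */
static int demo_validate_eq_attr(const struct demo_limits *lim,
                                 const struct demo_eq_attr *attr)
{
    assert(lim != NULL && attr != NULL);

    if (attr->flags & ~lim->supported_flags)
        return -ENOSYS;   /* at least one requested bit is unsupported */
    if (attr->size > lim->max_eq_size)
        return -EINVAL;   /* requested depth exceeds the maximum */
    if (lim->open_eq_count >= lim->max_eq_count)
        return -ENOSPC;   /* no room for another EQ */
    return 0;
}
```

The flag test masks out everything the provider supports; any bit left over is a capability that cannot be honored, so a single unsupported flag is enough to reject the whole request.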
Phase 3: Provider – Creation & Handle Return
The purpose of this phase is to allocate memory and internal structures for the EQ and return a usable handle to the application. The provider creates the fid_eq object in host memory, associates it with the parent fabric object (fid_fabric) passed to fi_eq_open(), and returns the handle through the eq output parameter. The EQ is now ready to be bound to endpoints and used for event reporting. The EQ should not be confused with the completion queue (CQ), which tracks the results of data transfer operations: every communication request eventually produces a completion, which is placed into the CQ once processed.
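The allocate-and-return-handle pattern of this phase can be sketched as follows. The demo_eq structure and both functions are hypothetical and only loosely mirror the illustrative object shown in this section; an actual provider's internal layout will differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical provider-side EQ object. */
struct demo_eq {
    const void *parent;     /* object the EQ was opened on */
    uint64_t    flags;      /* capabilities granted at creation */
    size_t      size;       /* queue depth in events */
    void       *context;    /* application-provided pointer */
    int         ref_count;  /* lifecycle tracking */
    uint32_t   *events;     /* storage for queued event entries */
};

/* Allocate the EQ and hand it back through eq_out.
 * Returns 0 on success or a negative errno; *eq_out is untouched on failure. */
static int demo_eq_create(const void *parent, size_t size, uint64_t flags,
                          void *context, struct demo_eq **eq_out)
{
    assert(eq_out != NULL);

    if (size == 0)
        return -EINVAL;

    struct demo_eq *eq = calloc(1, sizeof(*eq));
    if (eq == NULL)
        return -ENOMEM;

    eq->events = calloc(size, sizeof(*eq->events));
    if (eq->events == NULL) {
        free(eq);
        return -ENOMEM;
    }

    eq->parent    = parent;
    eq->flags     = flags;
    eq->size      = size;
    eq->context   = context;
    eq->ref_count = 1;      /* the caller holds the first reference */

    *eq_out = eq;
    return 0;
}

/* Release everything demo_eq_create() allocated; NULL is accepted. */
static void demo_eq_destroy(struct demo_eq *eq)
{
    if (eq == NULL)
        return;
    free(eq->events);
    free(eq);
}
```

Returning the handle through an output parameter leaves the return value free for an error code, which matches the calling convention of fi_eq_open() itself.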
Example fid_eq (Event Queue) – Illustrative
fid_eq {
    fid_type      : FI_EQ
    fid           : 0xF1DE601
    parent_fid    : 0xF1DD01
    provider      : "libfabric-uet"
    caps          : FI_WRITE | FI_REMOTE_WRITE | FI_RMA
    size          : 1024
    wait_obj      : FI_WAIT_FD
    provider_data : <pointer to provider EQ struct>
    ref_count     : 1
    context       : <app-provided void *>
}
Object Example 4-3: Event Queue (EQ).
Explanation of fields:
fid_type: Type of object, here EQ.
fid: Unique handle for the EQ object.
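parent_fid: Handle of the parent object the EQ was opened on. In the libfabric API this is the fabric (fid_fabric), because fi_eq_open() takes a fabric handle.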
caps: Bitmask of requested/available capabilities: user events, remote write completions, RMA completions.
size: Queue depth (number of events).
wait_obj: Wait mechanism used by application.
provider_data: Pointer to provider-internal EQ structure.
ref_count: Tracks object references for lifecycle management.
context: Application-provided context pointer.