PipeWire supports sharing Direct Memory Access buffers (DMA-BUFs) between clients via the SPA_DATA_DmaBuf data type. However, properly negotiating DMA-BUF support on both the producer and the consumer side requires following a specific procedure. This page describes that procedure using events and methods from the filter or stream API.
Note: This article focuses mostly on DMA-BUF sharing from arbitrary devices, like discrete GPUs. For using DMA-BUFs created by v4l2 please refer to the corresponding paragraph.
The capability negotiation for DMA-BUFs is complicated by the fact that a usable and optimal modifier for a given format can only be determined by the allocator. This allocator has to be invoked with the intersection of the modifiers supported by every client. As a result, the fixation of the modifier is delegated from PipeWire to the node responsible for allocating the buffers.
The stream parameters should contain two SPA_PARAM_EnumFormat objects for each format: one for DMA-BUFs, one for shared memory buffers as a fallback.
Query the list of all supported modifiers from your graphics API of choice. Add a SPA_FORMAT_VIDEO_modifier property to the first stream parameter with the flags SPA_POD_PROP_FLAG_MANDATORY | SPA_POD_PROP_FLAG_DONT_FIXATE. The value of the property should be set to a SPA_CHOICE_Enum containing one long choice per supported modifier, plus DRM_FORMAT_MOD_INVALID if the graphics API supports modifier-less buffers.
Note: When a producer only supports modifier-less buffers, it can omit SPA_POD_PROP_FLAG_DONT_FIXATE (see the param_changed hook, for producers).
The second stream parameter should not contain any SPA_FORMAT_VIDEO_modifier property.
To prioritise DMA-BUFs, place the SPA_PARAM_EnumFormat objects containing modifiers first when emitting them to PipeWire.
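A minimal sketch of how these two parameters could be built with the SPA POD builder follows. The video format (here SPA_VIDEO_FORMAT_BGRA), the modifier list and the omission of size/framerate properties are assumptions of the example, not requirements; it also assumes n_modifiers > 0.

#include <spa/param/video/format-utils.h>
#include <spa/pod/builder.h>

/* Sketch: build one DMA-BUF EnumFormat (with a modifier choice) and one
 * shared-memory fallback EnumFormat. "modifiers"/"n_modifiers" are assumed
 * to be the list queried from the graphics API, possibly including
 * DRM_FORMAT_MOD_INVALID. */
static uint32_t build_formats(struct spa_pod_builder *b,
                              const struct spa_pod *params[],
                              const uint64_t *modifiers, uint32_t n_modifiers)
{
        struct spa_pod_frame f[2];
        uint32_t i, n_params = 0;

        /* First parameter: DMA-BUF, with a mandatory but not fixated modifier. */
        spa_pod_builder_push_object(b, &f[0], SPA_TYPE_OBJECT_Format, SPA_PARAM_EnumFormat);
        spa_pod_builder_add(b,
                SPA_FORMAT_mediaType,    SPA_POD_Id(SPA_MEDIA_TYPE_video),
                SPA_FORMAT_mediaSubtype, SPA_POD_Id(SPA_MEDIA_SUBTYPE_raw),
                SPA_FORMAT_VIDEO_format, SPA_POD_Id(SPA_VIDEO_FORMAT_BGRA),
                0);
        spa_pod_builder_prop(b, SPA_FORMAT_VIDEO_modifier,
                SPA_POD_PROP_FLAG_MANDATORY | SPA_POD_PROP_FLAG_DONT_FIXATE);
        spa_pod_builder_push_choice(b, &f[1], SPA_CHOICE_Enum, 0);
        /* The first value is the default/preferred one, then all alternatives. */
        spa_pod_builder_long(b, modifiers[0]);
        for (i = 0; i < n_modifiers; i++)
                spa_pod_builder_long(b, modifiers[i]);
        spa_pod_builder_pop(b, &f[1]);
        params[n_params++] = spa_pod_builder_pop(b, &f[0]);

        /* Second parameter: the same format without any modifier (SHM fallback). */
        params[n_params++] = spa_pod_builder_add_object(b,
                SPA_TYPE_OBJECT_Format, SPA_PARAM_EnumFormat,
                SPA_FORMAT_mediaType,    SPA_POD_Id(SPA_MEDIA_TYPE_video),
                SPA_FORMAT_mediaSubtype, SPA_POD_Id(SPA_MEDIA_SUBTYPE_raw),
                SPA_FORMAT_VIDEO_format, SPA_POD_Id(SPA_VIDEO_FORMAT_BGRA));

        return n_params;
}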
When the param_changed hook is called for a SPA_PARAM_Format the client has to parse the spa_pod directly. Use spa_pod_find_prop(param, NULL, SPA_FORMAT_VIDEO_modifier) to check whether modifiers were negotiated. If they were negotiated, set the SPA_PARAM_BUFFERS_dataType property to 1 << SPA_DATA_DmaBuf. If they were not negotiated, fall back to shared memory by setting the SPA_PARAM_BUFFERS_dataType property to 1 << SPA_DATA_MemFd, 1 << SPA_DATA_MemPtr, or both.
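A sketch of such a param_changed handler is shown below; it assumes the stream pointer was registered as user data and that only the dataType property needs to be updated here.

#include <pipewire/pipewire.h>
#include <spa/param/video/format-utils.h>

/* Sketch of a param_changed handler: decide between DMA-BUF and SHM buffer
 * types based on whether a modifier was negotiated. */
static void on_param_changed(void *user_data, uint32_t id, const struct spa_pod *param)
{
        struct pw_stream *stream = user_data;
        uint8_t buffer[1024];
        struct spa_pod_builder b = SPA_POD_BUILDER_INIT(buffer, sizeof(buffer));
        const struct spa_pod *params[1];
        uint32_t data_type;

        if (param == NULL || id != SPA_PARAM_Format)
                return;

        if (spa_pod_find_prop(param, NULL, SPA_FORMAT_VIDEO_modifier) != NULL)
                data_type = 1 << SPA_DATA_DmaBuf;                            /* modifiers negotiated */
        else
                data_type = (1 << SPA_DATA_MemFd) | (1 << SPA_DATA_MemPtr);  /* SHM fallback */

        params[0] = spa_pod_builder_add_object(&b,
                SPA_TYPE_OBJECT_ParamBuffers, SPA_PARAM_Buffers,
                SPA_PARAM_BUFFERS_dataType, SPA_POD_CHOICE_FLAGS_Int(data_type));

        pw_stream_update_params(stream, params, 1);
}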
While consumers only have to parse the resulting SPA_PARAM_Format for any format-related information, it's up to the producer to fixate onto a single format and modifier pair. The producer is also responsible for checking whether all clients announce sufficient capabilities, and for falling back to shared memory buffers when possible.
Use spa_format_video_raw_parse to get the format and modifier.
Producers have to handle three cases when it comes to modifiers with respect to the previously announced capabilities: using only the modifier-less API, using only the modifier-aware one, or supporting both.
Note: When test-allocating a buffer, collect all possible modifiers from the SPA_FORMAT_VIDEO_modifier property while omitting DRM_FORMAT_MOD_INVALID, and pass them all to the graphics API. If the allocation fails and the list of possible modifiers contains DRM_FORMAT_MOD_INVALID, fall back to allocating without an explicit modifier if the graphics API allows it.
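As an illustration only, the test allocation could look roughly like this with GBM as the graphics API; the device, dimensions and fourcc are assumed to come from the surrounding code.

#include <stdbool.h>
#include <stdint.h>
#include <gbm.h>
#include <drm_fourcc.h>

/* Sketch: test-allocate with the negotiated modifier list, falling back to an
 * implicit-modifier allocation when DRM_FORMAT_MOD_INVALID was part of the
 * announced set. */
static struct gbm_bo *test_alloc(struct gbm_device *gbm, uint32_t width, uint32_t height,
                                 uint32_t fourcc, const uint64_t *modifiers, unsigned n_modifiers)
{
        uint64_t explicit_mods[64];
        unsigned n_explicit = 0, i;
        bool has_invalid = false;
        struct gbm_bo *bo;

        for (i = 0; i < n_modifiers; i++) {
                if (modifiers[i] == DRM_FORMAT_MOD_INVALID)
                        has_invalid = true;
                else if (n_explicit < 64)
                        explicit_mods[n_explicit++] = modifiers[i];
        }

        if (n_explicit > 0) {
                bo = gbm_bo_create_with_modifiers(gbm, width, height, fourcc,
                                                  explicit_mods, n_explicit);
                if (bo != NULL)
                        return bo;
        }
        if (has_invalid)
                /* Fall back to an implicit, driver-chosen layout. */
                return gbm_bo_create(gbm, width, height, fourcc, GBM_BO_USE_RENDERING);
        return NULL;
}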
This is relevant for producers: allocate a DMA-BUF using only the negotiated format and modifier.
This is relevant for consumers: check the type of the dequeued buffer. If it's SPA_DATA_MemFd or SPA_DATA_MemPtr, use the fallback SHM import mechanism. If it's SPA_DATA_DmaBuf, get the DMA-BUF fds (the plane count is encoded in the n_datas variable of the spa_buffer struct) and import them with the graphics API. Note that n_datas might also contain extra fds, for example for SyncTimeline metadata, which you need to take into account when parsing the planes.
Note: Some graphics APIs have separate functions for the modifier-less case (DRM_FORMAT_MOD_INVALID) or require the modifier to be omitted entirely, since DRM_FORMAT_MOD_INVALID might be used for error handling.
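A rough consumer-side sketch of this buffer handling follows; import_dmabuf() and import_shm() are hypothetical placeholders for the application's graphics-API import paths.

#include <pipewire/pipewire.h>
#include <spa/buffer/buffer.h>

/* Hypothetical application helpers, not part of PipeWire. */
static void import_dmabuf(const int *fds, uint32_t n_planes);
static void import_shm(struct spa_data *d);

/* Sketch of consumer-side buffer handling for a dequeued pw_buffer. */
static void handle_buffer(struct pw_buffer *b)
{
        struct spa_buffer *buf = b->buffer;
        uint32_t i;

        if (buf->datas[0].type == SPA_DATA_DmaBuf) {
                int fds[4];
                uint32_t n_planes = 0;

                for (i = 0; i < buf->n_datas && n_planes < 4; i++) {
                        /* n_datas may also contain non-plane fds (e.g. for
                         * SyncTimeline metadata), so only collect DmaBuf entries. */
                        if (buf->datas[i].type != SPA_DATA_DmaBuf)
                                continue;
                        fds[n_planes++] = (int)buf->datas[i].fd;
                }
                import_dmabuf(fds, n_planes);      /* hypothetical helper */
        } else {
                /* SPA_DATA_MemFd / SPA_DATA_MemPtr: shared memory fallback. */
                import_shm(&buf->datas[0]);        /* hypothetical helper */
        }
}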
It's important to make sure all consumers of the PipeWire stream are prepared to deal with DMA-BUFs. Most DMA-BUFs cannot be treated like shared memory in general, because of issues like the following:
When importing a DMA-BUF with a proper graphics API, the size of a single buffer plane is not a relevant property, since the driver derives it from the other properties. Consumers should therefore ignore the maxsize field of the spa_data struct and the size field of the spa_chunk struct; producers are allowed to set both to 0. In cases where mapping a single plane is required, the size should be obtained locally via the file descriptor.
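For example, a small sketch of obtaining the plane size from the DMA-BUF file descriptor itself:

#include <sys/types.h>
#include <unistd.h>

/* Sketch: query the size of a single DMA-BUF plane locally via its file
 * descriptor instead of relying on maxsize/size from PipeWire. */
static off_t dmabuf_plane_size(int fd)
{
        off_t size = lseek(fd, 0, SEEK_END);
        lseek(fd, 0, SEEK_SET);  /* restore the offset */
        return size;             /* -1 on error */
}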
SPA offers helper functions to parse and build a spa_pod object to/from the spa_video_info_* structs. The flags SPA_VIDEO_FLAG_MODIFIER and SPA_VIDEO_FLAG_MODIFIER_FIXATION_REQUIRED are used to indicate modifier usage with the format. SPA_VIDEO_FLAG_MODIFIER declares that the parsed/provided spa_video_info_* struct contains valid modifier information. For legacy reasons, spa_format_video_*_build will announce any modifier != 0 even when this flag is not set. SPA_VIDEO_FLAG_MODIFIER_FIXATION_REQUIRED is exclusive to the parse helpers and declares that the parsed spa_pod contains modifier information which needs to be fixated as described above. The list of available modifiers has to be parsed manually from the spa_pod object.
Another use case for streaming via DMA-BUFs is exporting a camera feed from v4l2 as DMA-BUFs. Those are located in main memory, where it is possible to mmap them. This should be done as follows: neither producer nor consumer should announce a modifier, but both should include 1 << SPA_DATA_DmaBuf in the SPA_PARAM_BUFFERS_dataType property. It is the responsibility of the producer, during the add_buffer event, to choose DMA-BUF as the buffer type even though no modifier is present, if it can guarantee that the buffer is mmapable.
Note: For now v4l2 uses planar buffers without modifiers. This is the reason for this special case.
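In terms of the param_changed sketch shown earlier, only the dataType mask changes for this case; roughly (using the same builder variables as in that sketch):

/* v4l2 case: no modifier negotiated, but DMA-BUF and memfd both allowed. */
uint32_t data_type = (1 << SPA_DATA_DmaBuf) | (1 << SPA_DATA_MemFd);

params[0] = spa_pod_builder_add_object(&b,
        SPA_TYPE_OBJECT_ParamBuffers, SPA_PARAM_Buffers,
        SPA_PARAM_BUFFERS_dataType, SPA_POD_CHOICE_FLAGS_Int(data_type));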
In addition to DMABUF, a set of synchronization primitives (a SyncObjTimeline) and associated metadata can be negotiated on the buffers.
The explicit sync step is performed after the Format has been negotiated.
You might first want to check that the DRM render node you are using is capable of explicit sync, by checking support for DRM_CAP_SYNCOBJ and DRM_CAP_SYNCOBJ_TIMELINE before attempting to negotiate it.
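A small sketch of such a capability check, assuming drm_fd is an open DRM render node:

#include <stdbool.h>
#include <stdint.h>
#include <xf86drm.h>

/* Sketch: check that the DRM device supports syncobjs and timeline syncobjs
 * before attempting to negotiate explicit sync. */
static bool supports_explicit_sync(int drm_fd)
{
        uint64_t syncobj = 0, timeline = 0;

        if (drmGetCap(drm_fd, DRM_CAP_SYNCOBJ, &syncobj) != 0 ||
            drmGetCap(drm_fd, DRM_CAP_SYNCOBJ_TIMELINE, &timeline) != 0)
                return false;
        return syncobj != 0 && timeline != 0;
}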
Explicit sync requires two extra fds in the buffers and an extra SPA_META_SyncTimeline metadata structure.
The metadata structure will only be allocated when both sides support explicit sync. We can use this to make a fallback SPA_PARAM_Buffers, so that we can support both explicit sync and implicit sync.
So, first announce support for SPA_META_SyncTimeline by adding the SPA_TYPE_OBJECT_ParamMeta object to the stream:
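A sketch of such a parameter, assuming the usual pod builder b, the params array and n_params from the surrounding code:

/* Sketch: announce support for the SyncTimeline metadata on the buffers. */
params[n_params++] = spa_pod_builder_add_object(&b,
        SPA_TYPE_OBJECT_ParamMeta, SPA_PARAM_Meta,
        SPA_PARAM_META_type, SPA_POD_Id(SPA_META_SyncTimeline),
        SPA_PARAM_META_size, SPA_POD_Int(sizeof(struct spa_meta_sync_timeline)));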
Next make a SPA_PARAM_Buffers that depends on the negotiation of the SyncTimeline metadata:
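A sketch of this parameter; f is a struct spa_pod_frame, and the single DMA-BUF plane matches the example discussed below:

/* Sketch: DMA-BUF buffer layout that is only valid together with the
 * SyncTimeline metadata (note the mandatory metaType property). */
spa_pod_builder_push_object(&b, &f, SPA_TYPE_OBJECT_ParamBuffers, SPA_PARAM_Buffers);
spa_pod_builder_add(&b,
        SPA_PARAM_BUFFERS_buffers,  SPA_POD_CHOICE_RANGE_Int(8, 2, 64),
        SPA_PARAM_BUFFERS_blocks,   SPA_POD_Int(1 + 2),   /* 1 DMABUF plane + 2 syncobj fds */
        SPA_PARAM_BUFFERS_dataType, SPA_POD_CHOICE_FLAGS_Int(1 << SPA_DATA_DmaBuf),
        0);
spa_pod_builder_prop(&b, SPA_PARAM_BUFFERS_metaType, SPA_POD_PROP_FLAG_MANDATORY);
spa_pod_builder_int(&b, 1 << SPA_META_SyncTimeline);
params[n_params++] = spa_pod_builder_pop(&b, &f);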
Note the mandatory SPA_PARAM_BUFFERS_metaType with the SPA_META_SyncTimeline bit set. This forces this buffer layout to be used when SyncTimeline metadata was negotiated. Also note the SPA_PARAM_BUFFERS_blocks that is now set to the number of DMABUF planes + 2. In this case we have 1 plane/fd for the DMABUF and 2 fds for the SyncObjTimelines.
You can also add a fallback SPA_PARAM_Buffers when the SPA_META_SyncTimeline was not negotiated:
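A sketch of the fallback parameter, again for the single-plane case:

/* Sketch: fallback buffer layout used when the SyncTimeline metadata was not
 * negotiated; only the DMA-BUF plane, no syncobj fds. */
params[n_params++] = spa_pod_builder_add_object(&b,
        SPA_TYPE_OBJECT_ParamBuffers, SPA_PARAM_Buffers,
        SPA_PARAM_BUFFERS_buffers,  SPA_POD_CHOICE_RANGE_Int(8, 2, 64),
        SPA_PARAM_BUFFERS_blocks,   SPA_POD_Int(1),
        SPA_PARAM_BUFFERS_dataType, SPA_POD_CHOICE_FLAGS_Int(1 << SPA_DATA_DmaBuf));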
This one has just 1 data block with the DMABUF fd and plane info.
After sending the SPA_PARAM_Buffers, the buffer will be allocated by the PipeWire server.
In the pw-stream::add_buffer event, check if the SPA_META_SyncTimeline is available on the buffer:
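A sketch of this check, where buffer is the struct pw_buffer from the add_buffer event:

/* Sketch: look up the SyncTimeline metadata on the new buffer. */
struct spa_buffer *buf = buffer->buffer;
struct spa_meta_sync_timeline *stl;

stl = spa_buffer_find_meta_data(buf, SPA_META_SyncTimeline, sizeof(*stl));
if (stl == NULL) {
        /* Metadata not negotiated: fall back to implicit sync. */
}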
When the metadata is available, the SyncObj fds are in the last 2 data planes of the buffer, the acquire and release syncobj respectively. You can keep a reference to the struct spa_meta_sync_timeline because it will be needed later when processing the buffers.
If the producer is allocating buffers, i.e. when the stream has the PW_STREAM_FLAG_ALLOC_BUFFERS flag, it should allocate the DMABUF and syncobjs now and place them in the buffer data: first the plane fds, then the 2 syncobj fds.
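A rough sketch of filling in the buffer data under these assumptions (plane_fds, the syncobj fds and n_planes come from the application's allocator; using SPA_DATA_SyncObj as the data type for the syncobj fds):

/* Sketch: producer-side add_buffer with PW_STREAM_FLAG_ALLOC_BUFFERS. */
struct spa_buffer *buf = buffer->buffer;
uint32_t i;

for (i = 0; i < n_planes; i++) {
        buf->datas[i].type = SPA_DATA_DmaBuf;
        buf->datas[i].fd = plane_fds[i];
        buf->datas[i].flags = SPA_DATA_FLAG_READWRITE;
}
buf->datas[n_planes].type = SPA_DATA_SyncObj;      /* acquire syncobj */
buf->datas[n_planes].fd = acquire_syncobj_fd;
buf->datas[n_planes + 1].type = SPA_DATA_SyncObj;  /* release syncobj */
buf->datas[n_planes + 1].fd = release_syncobj_fd;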
The consumer can directly use the fds. The SyncObj fds can, for example, be converted to handles to make things easier later:
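For example (drm_fd being an open DRM device fd, buf the spa_buffer from above):

#include <xf86drm.h>

/* Sketch: convert the acquire/release syncobj fds (the last two data planes)
 * into DRM syncobj handles for later use with the drmSyncobjTimeline* calls. */
uint32_t acquire_handle, release_handle;
uint32_t n_planes = buf->n_datas - 2;

drmSyncobjFDToHandle(drm_fd, buf->datas[n_planes].fd, &acquire_handle);
drmSyncobjFDToHandle(drm_fd, buf->datas[n_planes + 1].fd, &release_handle);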
The struct spa_meta_sync_timeline contains 2 fields: the acquire_point and release_point.
Producers will start a render operation on the DMABUF of the buffer and place the acquire_point in the struct spa_meta_sync_timeline. When the rendering is complete, the producer should signal the acquire_point on the acquire SyncObjTimeline.
Producers will also add a release_point on the release SyncObjTimeline. They are only allowed to reuse the buffer when the release_point has been signaled.
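A possible producer-side scheme, signalling explicitly on the CPU for illustration; stl is the spa_meta_sync_timeline kept from add_buffer, and acquire_handle is the producer's imported acquire syncobj handle:

/* Sketch: pick new timeline points for this cycle and publish them in the
 * metadata before queueing the buffer. */
stl->acquire_point++;
stl->release_point++;

/* ... submit rendering into the DMA-BUF ... */

/* Signal completion of rendering; shown here explicitly, but it can also be
 * attached to the GPU submission itself. */
drmSyncobjTimelineSignal(drm_fd, &acquire_handle, &stl->acquire_point, 1);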
Consumers use the acquire_point to wait for rendering to complete before processing the buffer. This can be offloaded to the hardware when submitting the rendering operation or it can be done explicitly with drmSyncobjTimelineWait() on the acquire SyncObjTimeline handle and the acquire_point of the metadata.
Consumers should then also signal the release_point on the release SyncObjTimeline when they complete processing the buffer. This can be done in the hardware as part of the render pipeline or explicitly with drmSyncobjTimelineSignal() on the release handle and the release_point of the metadata.
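A consumer-side sketch doing both steps explicitly on the CPU; both can instead be handed to the GPU as part of the processing submission:

#include <stdint.h>
#include <xf86drm.h>

uint32_t first_signaled;

/* Wait until the producer has finished rendering into the DMA-BUF. */
drmSyncobjTimelineWait(drm_fd, &acquire_handle, &stl->acquire_point, 1,
                       INT64_MAX, DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
                       &first_signaled);

/* ... process the buffer ... */

/* Tell the producer the buffer can be reused. */
drmSyncobjTimelineSignal(drm_fd, &release_handle, &stl->release_point, 1);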