In the diagram below, Local creates a [`WritableOperation`](writable_operation.md) intended to receive data from Remote.
Local then sends metadata about the requested RDMA operation to Remote.
Remote then uses the metadata to create a [`WriteOperation`](write_operation.md) which will perform the GPU Direct RDMA memory transfer from Remote's GPU memory to Local's GPU memory.
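
The handshake described above can be sketched with a toy, in-process model. Everything in this block is a hypothetical stand-in (`ToyWritable`, `toy_write`, the `REGISTRY` dict, and the metadata fields). It does not use the `dynamo.nixl_connect` API and performs no RDMA. It only illustrates the ordering: Local exposes a buffer, metadata travels over a side channel, Remote writes, and Local awaits completion before using the data.

```python
import asyncio
import json
import uuid

# Hypothetical stand-in for memory registered with the RDMA subsystem:
# maps an operation id to (buffer, completion event).
REGISTRY: dict = {}


class ToyWritable:
    """Toy analogue of a writable operation created by Local."""

    def __init__(self, buffer: bytearray) -> None:
        self.op_id = uuid.uuid4().hex
        self._done = asyncio.Event()
        REGISTRY[self.op_id] = (buffer, self._done)

    def metadata(self) -> str:
        # Serialized description that Local sends to Remote over a side channel.
        buffer, _ = REGISTRY[self.op_id]
        return json.dumps({"op_id": self.op_id, "size": len(buffer)})

    async def wait_for_completion(self) -> None:
        await self._done.wait()


async def toy_write(metadata: str, payload: bytes) -> None:
    """Toy analogue of Remote creating a write operation from Local's metadata."""
    info = json.loads(metadata)
    buffer, done = REGISTRY[info["op_id"]]
    if len(payload) != info["size"]:
        raise ValueError("payload size does not match the exposed buffer")
    buffer[:] = payload  # stands in for the RDMA transfer
    done.set()


async def main() -> bytes:
    local_buffer = bytearray(4)
    writable = ToyWritable(local_buffer)   # 1. Local prepares memory
    meta = writable.metadata()             # 2. metadata is sent to Remote
    remote = asyncio.create_task(toy_write(meta, b"\x01\x02\x03\x04"))  # 3. Remote writes
    await writable.wait_for_completion()   # 4. Local waits before reading
    await remote
    return bytes(local_buffer)


result = asyncio.run(main())
```

In the real library the buffer would be GPU memory and the write would bypass the CPU. The toy only preserves the sequence of steps.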
In the case of the [Dynamo Multimodal Disaggregated Example](../../examples/multimodal/README.md):
1. The HTTP frontend accepts a text prompt and a URL to an image.
2. The prompt and URL are then enqueued with the Processor before being dispatched to the first available Decode Worker.
3. Decode Worker then requests a Prefill Worker to provide key-value data for the LLM powering the Decode Worker.
4. Prefill Worker then requests that the image be processed and provided as embeddings by the Encode Worker.
5. Encode Worker acquires the image, processes it, performs inference on the image using a specialized vision model, and finally provides the embeddings to Prefill Worker.
6. Prefill Worker receives the embeddings from Encode Worker, generates a key-value cache (KV$) update for Decode Worker's LLM, and writes the update directly to the GPU memory reserved for the data.
7. Finally, Decode Worker performs the requested inference.
> In this example, it is the data transfer between the Prefill Worker and the Encode Worker that utilizes the Dynamo NIXL Connect library.
> The KV Cache transfer between Decode Worker and Prefill Worker utilizes the NIXL base RDMA subsystem directly without using the Dynamo NIXL Connect library.
#### Code Examples
See [prefill_worker](https://github.com/ai-dynamo/dynamo/tree/main/examples/multimodal/components/prefill_worker.py#L199) or [decode_worker](https://github.com/ai-dynamo/dynamo/tree/main/examples/multimodal/components/decode_worker.py#L239) from our Multimodal example,
for how they coordinate directly with the Encode Worker by creating a [`WritableOperation`](writable_operation.md),
sending the operation's metadata via Dynamo's round-robin dispatcher, and awaiting the operation for completion before making use of the transferred data.
See [encode_worker](https://github.com/ai-dynamo/dynamo/tree/main/examples/multimodal/components/encode_worker.py#L190) from our Multimodal example,
for how the resulting embeddings are registered with the RDMA subsystem by creating a [`Descriptor`](descriptor.md),
a [`WriteOperation`](write_operation.md) is created using the metadata provided by the requesting worker,
and the worker awaits completion of the data transfer before yielding a response.
<!--
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# dynamo.nixl_connect.Connector
Core class for managing the connection between workers in a distributed environment.
Use this class to create readable and writable operations, or to read data from and write data to remote workers.
This class is responsible for interfacing with the NIXL-based RDMA subsystem and providing a "Pythonic" interface
with which to utilize GPU Direct RDMA accelerated data transfers between models hosted by different workers in a Dynamo pipeline.
The connector provides two methods of moving data between workers:
- Preparing local memory to be written to by a remote worker.
- Preparing local memory to be read by a remote worker.
In both cases, local memory is registered with the NIXL-based RDMA subsystem via the [`Descriptor`](descriptor.md) class and provided to the connector.
The connector then configures the RDMA subsystem to expose the memory for the requested operation and returns an operation control object.
The operation control object, either a [`ReadableOperation`](readable_operation.md) or a [`WritableOperation`](writable_operation.md),
provides RDMA metadata ([RdmaMetadata](rdma_metadata.md)) via its `.metadata()` method, functionality to query the operation's current state, as well as the ability to cancel the operation prior to its completion.
The RDMA metadata must be provided to the remote worker expected to complete the operation.
The metadata contains required information (identifiers, keys, etc.) which enables the remote worker to interact with the provided memory.
> [!Warning]
> RDMA metadata contains a worker's address as well as security keys to access specific registered memory descriptors.
> This data provides direct memory access between workers, and should be considered sensitive and therefore handled accordingly.
> See [`ReadOperation`](read_operation.md#example-usage), [`ReadableOperation`](readable_operation.md#example-usage),
> [`WritableOperation`](writable_operation.md#example-usage), and [`WriteOperation`](write_operation.md#example-usage)
> for additional examples.
## Methods
### `begin_read`
```python
async def begin_read(
    self,
    remote_metadata: RdmaMetadata,
    local_descriptors: Descriptor | list[Descriptor],
) -> ReadOperation:
```
Creates a [`ReadOperation`](read_operation.md) for transferring data from a remote worker.
To create the operation, the serialized request from a remote worker's [`ReadableOperation`](readable_operation.md)
along with a matching set of local memory descriptors which reference memory intended to receive data from the remote worker
must be provided.
The serialized request must be transferred from the remote to the local worker via a secondary channel, most likely HTTP or TCP+NATS.
Once created, data transfer will begin immediately.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.
Use [`.wait_for_completion()`](read_operation.md#wait_for_completion) to block the caller until the operation has completed or encountered an error.
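
The rule that disposal cancels an unfinished operation can be illustrated with a small asyncio sketch. `ToyOperation`, its `dispose()` method, and its `status` strings are hypothetical names invented for this illustration; how the library's operation objects are actually disposed is not shown in this document.

```python
import asyncio


class ToyOperation:
    """Hypothetical stand-in for an in-flight transfer; not the library's class."""

    def __init__(self, duration: float) -> None:
        self.status = "in_progress"
        # Must be constructed while an event loop is running.
        self._task = asyncio.create_task(self._run(duration))

    async def _run(self, duration: float) -> None:
        await asyncio.sleep(duration)  # stands in for the data transfer
        self.status = "complete"

    async def wait_for_completion(self) -> None:
        await self._task

    def dispose(self) -> None:
        # Disposing an unfinished operation cancels it.
        if not self._task.done():
            self._task.cancel()
            self.status = "cancelled"


async def main() -> tuple:
    awaited = ToyOperation(0.01)
    await awaited.wait_for_completion()  # finish before disposing
    awaited.dispose()

    dropped = ToyOperation(0.01)
    dropped.dispose()                    # disposed early: transfer is cancelled
    await asyncio.sleep(0.02)            # let the cancelled task settle
    return awaited.status, dropped.status


statuses = asyncio.run(main())
```

The first operation is awaited and ends as complete. The second is disposed while still in flight and ends as cancelled, which is the outcome the text warns about.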
### `begin_write`
```python
async def begin_write(
    self,
    local_descriptors: Descriptor | list[Descriptor],
    remote_metadata: RdmaMetadata,
) -> WriteOperation:
```
Creates a [`WriteOperation`](write_operation.md) for transferring data to a remote worker.
To create the operation, the serialized request from a remote worker's [`WritableOperation`](writable_operation.md)
along with a matching set of local memory descriptors which reference memory to be transferred to the remote worker
must be provided.
The serialized request must be transferred from the remote to the local worker via a secondary channel, most likely HTTP or TCP+NATS.
Once created, data transfer will begin immediately.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.
Use [`.wait_for_completion()`](write_operation.md#wait_for_completion) to block the caller until the operation has completed or encountered an error.
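
A minimal sketch of what a "matching set" of descriptors could mean, assuming it requires the same number of descriptors on both sides and equal sizes at each index. `ToyDescriptor` and `check_matching` are hypothetical helpers, not part of the library; the validation the library actually performs may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDescriptor:
    """Hypothetical stand-in that records only a region's size in bytes."""
    size: int


def check_matching(local: list, remote: list) -> None:
    """Raise ValueError unless both lists pair up one-to-one with equal sizes."""
    if len(local) != len(remote):
        raise ValueError(
            f"Descriptor count mismatch: {len(local)} local vs {len(remote)} remote."
        )
    for i, (loc, rem) in enumerate(zip(local, remote)):
        if loc.size != rem.size:
            raise ValueError(
                f"Descriptor size mismatch at index {i}: {loc.size} != {rem.size}."
            )


# Two regions on each side with identical sizes: passes silently.
check_matching(
    [ToyDescriptor(1024), ToyDescriptor(4096)],
    [ToyDescriptor(1024), ToyDescriptor(4096)],
)
```

Checking before the transfer starts turns a layout disagreement into an immediate error on the initiating worker.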
### `create_readable`
```python
def create_readable(
    self,
    local_descriptors: Descriptor | list[Descriptor],
) -> ReadableOperation:
```
Creates a [`ReadableOperation`](readable_operation.md) for transferring data to a remote worker.
To create the operation, a set of local memory descriptors must be provided that reference memory intended to be transferred to a remote worker.
Once created, the memory referenced by the provided descriptors becomes immediately readable by a remote worker with the necessary metadata.
The metadata required to access the memory referenced by the provided descriptors is accessible via the operation's `.metadata()` method.
Once acquired, the metadata needs to be provided to a remote worker via a secondary channel, most likely HTTP or TCP+NATS.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.
Use [`.wait_for_completion()`](readable_operation.md#wait_for_completion) to block the caller until the operation has completed or encountered an error.
### `create_writable`
```python
def create_writable(
    self,
    local_descriptors: Descriptor | list[Descriptor],
) -> WritableOperation:
```
Creates a [`WritableOperation`](writable_operation.md) for transferring data from a remote worker.
To create the operation, a set of local memory descriptors must be provided which reference memory intended to receive data from a remote worker.
Once created, the memory referenced by the provided descriptors becomes immediately writable by a remote worker with the necessary metadata.
The metadata required to access the memory referenced by the provided descriptors is accessible via the operation's `.metadata()` method.
Once acquired, the metadata needs to be provided to a remote worker via a secondary channel, most likely HTTP or TCP+NATS.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.
Use [`.wait_for_completion()`](writable_operation.md#wait_for_completion) to block the caller until the operation has completed or encountered an error.
## Properties
### `is_cuda_available`
```python
@cached_property
def is_cuda_available(self) -> bool:
```
Gets `True` when CUDA is available for the selected array module (most likely CuPy); otherwise `False`.
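
As an illustration of how a cached availability check of this kind can behave, here is a sketch that assumes CuPy is the array module. `ToyConnector` is a hypothetical class, and the connector's real implementation may detect CUDA differently.

```python
from functools import cached_property


class ToyConnector:
    """Hypothetical stand-in; only illustrates the cached availability check."""

    @cached_property
    def is_cuda_available(self) -> bool:
        try:
            import cupy  # optional dependency; may not be installed
        except ImportError:
            return False
        # cupy.cuda.is_available() reports whether a usable CUDA device exists.
        return bool(cupy.cuda.is_available())


connector = ToyConnector()
first = connector.is_cuda_available   # computed on first access
second = connector.is_cuda_available  # served from the instance cache afterwards
```

Because `cached_property` stores the result on the instance, the probe runs at most once per connector.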
### `name`
```python
@property
def name(self) -> str | None:
```
Gets the Dynamo component name used by the connector.
# dynamo.nixl_connect.RdmaMetadata
A Pydantic type intended to provide JSON serialized RDMA metadata about a [`ReadableOperation`](readable_operation.md) or [`WritableOperation`](writable_operation.md) object.
RDMA metadata contains detailed information about a worker process and how to access memory descriptors registered with it.
This data is required to perform data transfers using the NIXL based RDMA subsystem.
> [!Warning]
> RDMA metadata contains a worker's address as well as security keys to access specific registered memory descriptors.
> This data provides direct memory access between workers, and should be considered sensitive and therefore handled accordingly.
Use the respective class's `.metadata()` method to generate an `RdmaMetadata` object for an operation.
> [!Tip]
> Classes using `RdmaMetadata` objects must be paired correctly.
> [`ReadableOperation`](readable_operation.md) with [`ReadOperation`](read_operation.md), and
> [`WritableOperation`](writable_operation.md) with [`WriteOperation`](write_operation.md).
> Incorrect pairing will result in an error being raised.
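
The serialization round trip and the pairing rule can be sketched as follows. This uses a plain dataclass with `json` rather than Pydantic, and every name in it (`ToyMetadata`, its fields, `PAIRING`, `check_pairing`) is hypothetical; the real `RdmaMetadata` fields and the exception type the library raises are not documented here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToyMetadata:
    """Hypothetical stand-in for RDMA metadata; real field names may differ."""
    kind: str        # "readable" or "writable": which operation produced it
    worker_id: str   # placeholder for the worker's address
    access_key: str  # placeholder for the sensitive access key

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "ToyMetadata":
        return cls(**json.loads(text))


# Which initiating operation may consume metadata of each kind.
PAIRING = {"readable": "read", "writable": "write"}


def check_pairing(metadata: ToyMetadata, operation: str) -> None:
    """Raise TypeError when metadata is used with the wrong operation type."""
    expected = PAIRING[metadata.kind]
    if operation != expected:
        raise TypeError(
            f"{metadata.kind} metadata requires a {expected} operation, not {operation}."
        )


wire = ToyMetadata("writable", "worker-0", "secret").to_json()  # sent via side channel
received = ToyMetadata.from_json(wire)
check_pairing(received, "write")  # correct pairing: no error
```

Since the serialized form carries the access key, the side channel that transports it should be treated as sensitive too.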
# dynamo.nixl_connect.ReadOperation
An operation which transfers data from a remote worker to the local worker.
To create the operation, RDMA metadata ([RdmaMetadata](rdma_metadata.md)) from a remote worker's [`ReadableOperation`](readable_operation.md)
along with a matching set of local [`Descriptor`](descriptor.md) objects which reference memory intended to receive data from the remote worker must be provided.
The RDMA metadata must be transferred from the remote to the local worker via a secondary channel, most likely HTTP or TCP+NATS.
Once created, data transfer will begin immediately.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.
# dynamo.nixl_connect.ReadableOperation
An operation which enables a remote worker to read data from the local worker.
To create the operation, a set of local [`Descriptor`](descriptor.md) objects must be provided that reference memory intended to be transferred to a remote worker.
Once created, the memory referenced by the provided descriptors becomes immediately readable by a remote worker with the necessary metadata.
The RDMA metadata ([RdmaMetadata](rdma_metadata.md)) required to access the memory referenced by the provided descriptors is accessible via the operation's `.metadata()` method.
Once acquired, the metadata needs to be provided to a remote worker via a secondary channel, most likely HTTP or TCP+NATS.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.
# dynamo.nixl_connect.WritableOperation
An operation which enables a remote worker to write data to the local worker.
To create the operation, a set of local [`Descriptor`](descriptor.md) objects must be provided which reference memory intended to receive data from a remote worker.
Once created, the memory referenced by the provided descriptors becomes immediately writable by a remote worker with the necessary metadata.
The RDMA metadata ([RdmaMetadata](rdma_metadata.md)) required to access the memory referenced by the provided descriptors is accessible via the operation's `.metadata()` method.
Once acquired, the metadata needs to be provided to a remote worker via a secondary channel, most likely HTTP or TCP+NATS.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.
# dynamo.nixl_connect.WriteOperation
An operation which transfers data from the local worker to a remote worker.
To create the operation, RDMA metadata ([RdmaMetadata](rdma_metadata.md)) from a remote worker's [`WritableOperation`](writable_operation.md)
along with a matching set of local [`Descriptor`](descriptor.md) objects which reference memory to be transferred to the remote worker must be provided.
The RDMA metadata must be transferred from the remote to the local worker via a secondary channel, most likely HTTP or TCP+NATS.
Once created, data transfer will begin immediately.
Disposal of the object will instruct the RDMA subsystem to cancel the operation,
therefore the operation should be awaited until completed unless cancellation is intended.