**Best Practice**: Use RDMA even for same-node communication. The overhead is minimal, and behavior stays consistent whether pods land on the same node or on different nodes.
...
@@ -177,24 +110,9 @@ When prefill and decode workers are on the **same physical node**:
When prefill and decode workers are on **different nodes**: