For more details on receiving the workers' published KV metrics and routing
based on them, see the [KV Cache Routing Guide](../docs/kv_cache_routing.md).
### Disaggregated Serving
#### NIXL
NIXL (NVIDIA Inference Xfer Library) enables efficient GPU memory transfers between processes. In prefill/decode disaggregation, we use NIXL to transfer computed KV cache blocks from prefill workers to decode workers. Here are the core concepts: