feat: Using NIXL for KV cache transfer when using disaggregated serving in TRTLLM (#1591)
Signed-off-by: Tanmay Verma <tanmay2592@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>