Zijing Liu authored
[Disagg][Perf] Use CUDA event sync instead of blocking `tolist` to avoid unintentional copy ops blocking across different CUDA streams, improving disagg TTIT/TTFT (#22760)

Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Zijing Liu <liuzijing2014@users.noreply.github.com>
b395b3b0