vasunvidia authored
* DGRAD-RS overlap bug fix

  This PR fixes a bug in enabling DGRAD-RS overlap by adding the layer to the correct method list. Previously, the RS-DGRAD overlap layer was incorrectly added to pipeline method list even if ring_exchange method is specified in config.

  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

* Bug fix for ring_exchange ReduceScatter

  ring_exchange RS uses main_stream for last GEMM chunk. But the send/recv streams wait for stream_compute during last chunk.

  Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

---------

Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
ec49a52b