[Refactor] Replace third_party/nccl with PyTorch's NCCL backend (#4989)
* expose GeneratePermutation
* add sparse_all_to_all_push
* add sparse_all_to_all_pull
* add unit test
* handle world_size=1
* remove python nccl wrapper
* remove the nccl dependency
* use pinned memory to speedup D2H copy
* fix lint
* resolve comments
* fix lint
* fix ut
* resolve comments