  1. 08 Mar, 2023 1 commit
    • [Refactor] Replace third_party/nccl with PyTorch's NCCL backend (#4989) · 8d5d8962
      Xin Yao authored
      * expose GeneratePermutation
      
      * add sparse_all_to_all_push
      
      * add sparse_all_to_all_pull
      
      * add unit test
      
      * handle world_size=1
      
      * remove python nccl wrapper
      
      * remove the nccl dependency
      
      * use pinned memory to speed up D2H copy
      
      * fix lint
      
      * resolve comments
      
      * fix lint
      
      * fix ut
      
      * resolve comments
  2. 17 Feb, 2023 1 commit
  3. 10 Oct, 2022 1 commit
  4. 29 Jun, 2022 1 commit
  5. 20 Aug, 2021 1 commit
    • [Feature][DistDGL] Add NCCL support for range based partitions (#3213) · 7f927939
      nv-dlasalle authored
      * Implement range based NDArrayPartition
      
      * Finish implementing range based partition support
      
      * Add unit test
      
      * Fix whitespace
      
      * Add Kernel suffix
      
      * Fix argument passing
      
      * Add doxygen docs and improve variable naming
      
      * Add unit test
      
      * Add function for converting a partition book
      
      * Add example to partition_op docs
      
      * Fix dtype conversion for mxnet and tensorflow
  6. 20 May, 2021 1 commit
    • [Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825) · ae8dbe6d
      nv-dlasalle authored
      
      * Split NCCL wrapper from sparse optimizer and sparse embedding
      
      * Add more unit tests for single node nccl
      
      * Fix unit test for tf
      
      * Switch to device histogram
      
      * Fix histogram issues
      
      * Finish migration to histogram
      
      * Handle cases with zero send/receive data
      
      * Start on partition object
      
      * Get compiling
      
      * Updates
      
      * Add unit tests
      
      * Switch to partition object
      
      * Fix linting issues
      
      * Rename partition file
      
      * Add python doc
      
      * Fix python assert and finish doxygen comments
      
      * Remove stubs for range based partition to satisfy pylint
      
      * Wrap unit test in GPU only
      
      * Wrap explicit cuda call in ifdef
      
      * Merge with partition.py
      
      * update docstrings
      
      * Cleanup partition_op
      
      * Add Workspace object
      
      * Switch to using workspace object
      
      * Move last remainder based function out of nccl_api
      
      * Add error messages
      
      * Update docs with examples
      
      * Fix linting errors
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>