  1. 29 Jul, 2022 1 commit
    • [Feature] Add CUDA Weighted Neighborhood Sampling (#4064) · 86c81b4e
      Xin Yao authored
      
      
      * add weighted sampling without replacement (A-Chao)
      
      * improve Algorithm A-Chao with block-wise prefix sum
      
      * correctly fill out_idxs
      
      * implement weighted sampling with replacement
      
      * small fix
      
      * merge host-side code of weighted/uniform sampling
      
      * enable unit tests for cuda weighted sampling
      
      * move thrust/cub wrapper to the cmake file
      
      * update docs accordingly
      
      * fix linting
      
      * fix linting
      
      * fix unit test
      
      * Bump external CUB/Thrust versions
      
      * Fix code style and update description of algorithm design
      
      * [Feature] GPU support weighted graph neighbor sampling
      commit by pengqirong(OPPO)
      
      * merge pengqirong's implementation
      
      * revert the change to cub and thrust
      
      * fix linting
      
      * use DeviceSegmentedSort for better performance
      
      * add more comments
      
      * add necessary notes
      
      * add necessary notes
      
      * resolve some comments
      
      * define THRUST_CUB_WRAPPED_NAMESPACE
      
      * fix doc
      Co-authored-by: 彭齐荣 <657017034@qq.com>
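The "weighted sampling with replacement" and "prefix sum" items in this commit can be illustrated with a small CPU-side sketch. This is not the CUDA kernel from the commit; it only shows the general prefix-sum-plus-binary-search technique (inverse transform sampling). The function name `SampleWithReplacement` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not a DGL API): draw `num_picks` indices with
// replacement, where index i is chosen with probability
// weights[i] / sum(weights). Weights are assumed to be non-negative.
std::vector<std::size_t> SampleWithReplacement(
    const std::vector<double>& weights, std::size_t num_picks,
    std::mt19937_64& rng) {
  assert(!weights.empty());

  // Inclusive prefix sum: cdf[i] = weights[0] + ... + weights[i].
  std::vector<double> cdf(weights.size());
  std::partial_sum(weights.begin(), weights.end(), cdf.begin());
  assert(cdf.back() > 0.0);

  std::uniform_real_distribution<double> uniform(0.0, cdf.back());
  std::vector<std::size_t> picks;
  picks.reserve(num_picks);
  for (std::size_t k = 0; k < num_picks; ++k) {
    const double u = uniform(rng);
    // First position whose cumulative weight is strictly greater than u.
    const auto it = std::upper_bound(cdf.begin(), cdf.end(), u);
    // Clamp to the last index to guard against floating-point edge cases.
    const std::size_t idx = std::min(
        static_cast<std::size_t>(it - cdf.begin()), weights.size() - 1);
    picks.push_back(idx);
  }
  return picks;
}
```

Because each draw is independent, the same index can appear several times. A GPU version would typically compute the prefix sum per row in parallel, which is presumably what the block-wise prefix sum item refers to.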
  2. 07 Jul, 2022 1 commit
  3. 28 Jun, 2022 1 commit
  4. 27 Jun, 2022 1 commit
    • [Dist] enable USE_EPOLL by default (#4167) · 9d425315
      Rhett Ying authored
      * [Dist] enable USE_EPOLL by default
      
      * fix build issue on windows
      
      * fix build issue on windows
      
      * fix build issue on windows
      
      * fix build issue on windows
      
      * fix build issue on windows
      
      * fix build issue
  5. 08 Jun, 2022 1 commit
  6. 11 May, 2022 1 commit
    • Make USE_AVX flag default value OFF (#3983) · 1a6806e2
      Vikram Sharma authored
      
      
      With the emergence of new ISAs (such as ARM and RISC-V), keeping USE_AVX ON by default makes the default build instructions fail on those platforms. DGL does not fundamentally require AVX to function; AVX is needed only to enable optimizations. The proposal is therefore to turn it off by default, so that users with AVX-capable hardware can enable it at build time with
      `cmake .. -DUSE_AVX=ON`
      Co-authored-by: Zihao Ye <expye@outlook.com>
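A minimal sketch of how a build-time switch like this is commonly consumed on the C++ side. It assumes the CMake option results in a `USE_AVX` preprocessor macro being defined, which the commit message does not confirm. `AddArrays` is a hypothetical function, not part of DGL.

```cpp
#include <cassert>
#include <cstddef>
#ifdef USE_AVX
// Only pulled in when the (assumed) USE_AVX macro is defined; building this
// path also needs an AVX-enabling compiler flag such as -mavx.
#include <immintrin.h>
#endif

// Hypothetical kernel: out[i] = a[i] + b[i] for i in [0, n).
void AddArrays(const float* a, const float* b, float* out, std::size_t n) {
  std::size_t i = 0;
#ifdef USE_AVX
  // Vectorized path: process 8 floats per iteration with 256-bit registers.
  for (; i + 8 <= n; i += 8) {
    const __m256 va = _mm256_loadu_ps(a + i);
    const __m256 vb = _mm256_loadu_ps(b + i);
    _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
  }
#endif
  // Scalar path: handles the tail, or the whole array when AVX is disabled.
  for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

With the macro undefined, the function still compiles and produces the same results on any architecture, which is the property the commit relies on when it makes OFF the default.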
  7. 07 Feb, 2022 1 commit
  8. 11 Jan, 2022 1 commit
  9. 06 Dec, 2021 1 commit
    • [RPC] Use tensorpipe for rpc communication (#3335) · a3ce780d
      Jinjing Zhou authored
      * not sure whether this works
      
      * add change
      
      * fix
      
      * fix
      
      * fix
      
      * remove
      
      * revert
      
      * lint
      
      * lint
      
      * fix
      
      * revert
      
      * lint
      
      * fix
      
      * only build rpc on linux
      
      * lint
      
      * lint
      
      * fix build on windows
      
      * fix windows
      
      * remove old test
      
      * fix cmake
      
      * Revert "remove old test"
      
      This reverts commit f1ea75c777c34cdc1f08c0589676ba6aee1feb29.
      
      * fix windows
      
      * fix
      
      * fix
      
      * fix indent
      
      * fix indent
      
      * address comment
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * lint
      
      * fix indent
      
      * fix lint
      
      * add introduction
      
      * fix
      
      * lint
      
      * lint
      
      * add more logs
      
      * fix
      
      * update xbyak for C++14 with gcc5
      
      * Remove channels
      
      * fix
      
      * add test script
      
      * fix
      
      * remove unused file
      
      * fix lint
      
      * add timeout
  10. 03 Dec, 2021 1 commit
  11. 02 Dec, 2021 1 commit
  12. 29 Nov, 2021 1 commit
  13. 14 Oct, 2021 1 commit
    • [Bugfix] three bugs related to using DGL as a subdirectory (third_party) of another project. (#3379) · 18863069
      zexi yuan authored
      * [Bugfix] fix a compile error for the Debug build type on Windows
      
      When using CMakeLists.txt to build the "Debug" build type on Windows, there are three compile errors (C4716) in the file "dgl\src\runtime\shared_mem.cc":
      
      'dgl::runtime::SharedMemory::CreateNew': must return a value
      'dgl::runtime::SharedMemory::Open': must return a value
      'dgl::runtime::SharedMemory::Exist': must return a value
      
      * [Bugfix] cmake error "cannot find load file" when DGL is used as a subdirectory on Linux
      
      When DGL is used as a subdirectory in a CMake project, "CMAKE_SOURCE_DIR" here returns the source directory of the top-level (parent) project, which is not the expected directory.
      It is probably better to use "CMAKE_CURRENT_SOURCE_DIR" to set "GKLIB_PATH".
      
      * [Bugfix] cmd cmake error when DGL is used as a subdirectory
      
      When DGL is a subdirectory of another project, the WORKING_DIRECTORY of "add_custom_command" at line 255 of "CMakeLists.txt" is incorrect, which causes a cmake "setlocal" error.
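The C4716 errors in the first item come from non-void functions in which no path returns a value. The sketch below shows the general shape of the problem and one way to satisfy the compiler. It is not DGL's `SharedMemory` class: `SharedRegionStub` and `ReportUnsupported` are hypothetical, and only the three method names are borrowed from the error messages.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a class whose operations are unsupported on the
// current platform. Each method reports the problem and then returns a
// placeholder value so that every control path returns something.
class SharedRegionStub {
 public:
  void* CreateNew(std::size_t /*size*/) {
    ReportUnsupported("CreateNew");
    return nullptr;  // Omitting this return is what MSVC flags as C4716.
  }

  void* Open(std::size_t /*size*/) {
    ReportUnsupported("Open");
    return nullptr;
  }

  bool Exist(const std::string& /*name*/) {
    ReportUnsupported("Exist");
    return false;
  }

  const std::string& last_error() const { return last_error_; }

 private:
  // Stand-in for a logging call; it records the message instead of aborting.
  void ReportUnsupported(const std::string& what) {
    last_error_ = what + ": shared memory is not supported on this platform";
  }

  std::string last_error_;
};
```

Other compilers often accept a missing return with only a warning, which would explain why the problem showed up only on Windows.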
  14. 28 Sep, 2021 1 commit
  15. 06 Sep, 2021 1 commit
  16. 13 Jul, 2021 2 commits
    • Remove march=native flag (#3134) · 7c3e1f94
      Quan (Andy) Gan authored
    • [CPU][Kernel] Single socket spmm (#3024) · fac75e16
      sanchit-misra authored
      
      
      * optimizations of spmm for CPU
      
      * Added names of contributors
      
      * Minor code cleanup
      
      * Moved the spmm optimization code to a new header file
      
      * Moved to DGL's logging method
      
      * removed duplicate code between SpMMSumCsr and SpMMCmpCsr
      
      * Changes made to follow Google coding style
      
      * Fixed lint errors in spmm.h
      
      * Fixed some lint errors from spmm_blocking_libxsmm.h
      
      * Fixed lint errors from spmm_blocking_libxsmm.h
      
      * Added comments to SpMMCreateLibxsmmKernel
      
      * to enable building of tests, and other cosmetic changes
      
      * disabling libxsmm on windows
      
      * Put a condition to avoid opt impl for FP64 as libxsmm does not have FP64 support yet
      
      * cosmetic changes and documentation
      
      * cosmetic changes
      
      * to pass lint tests
      
      * replaced multiple allocations for buffers of indices and edges with a single allocation
      Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
  17. 27 Jun, 2021 1 commit
    • [Build] Make nccl optional (#3056) · 9664cdff
      Jinjing Zhou authored
      * fix
      
      * remove nvidiasmi
      
      * fix
      
      * fix docs
      
      * fix
      
      * fix
      
      * 1
      
      * fix
      
      * remove
      
      * skip deprecated kernel
      
      * fix
      
      * Revert "skip deprecated kernel"
      
      This reverts commit c5ceb7f60dbbaf065b81cc3680757fd611d90ad3.
      
      * fix
  18. 03 Jun, 2021 1 commit
  19. 25 May, 2021 1 commit
  20. 20 May, 2021 1 commit
    • [Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825) · ae8dbe6d
      nv-dlasalle authored
      
      * Split NCCL wrapper from sparse optimizer and sparse embedding
      
      * Add more unit tests for single node nccl
      
      * Fix unit test for tf
      
      * Switch to device histogram
      
      * Fix histogram issues
      
      * Finish migration to histogram
      
      * Handle cases with zero send/receive data
      
      * Start on partition object
      
      * Get compiling
      
      * Updates
      
      * Add unit tests
      
      * Switch to partition object
      
      * Fix linting issues
      
      * Rename partition file
      
      * Add python doc
      
      * Fix python assert and finish doxygen comments
      
      * Remove stubs for range based partition to satisfy pylint
      
      * Wrap unit test in GPU only
      
      * Wrap explicit cuda call in ifdef
      
      * Merge with partition.py
      
      * update docstrings
      
      * Cleanup partition_op
      
      * Add Workspace object
      
      * Switch to using workspace object
      
      * Move last remainder based function out of nccl_api
      
      * Add error messages
      
      * Update docs with examples
      
      * Fix linting errors
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
  21. 09 Apr, 2021 1 commit
  22. 24 Mar, 2021 1 commit
    • [Feature] Sparse-sparse matrix multiplication, addition, and masking (#2753) · 929d8634
      Quan (Andy) Gan authored
      * test
      
      * more stuff
      
      * add test
      
      * fixes
      
      * optimize algo
      
      * replace unordered_map with arrays
      
      * lint
      
      * lint x2
      
      * oops
      
      * disable gpu csrmm tests
      
      * remove gpu invocation
      
      * optimize with openmp
      
      * remove python functions
      
      * add back with docstrings
      
      * lint
      
      * lint
      
      * update python interface
      
      * functionize
      
      * functionize
      
      * lint
      
      * lint
  23. 28 Jan, 2021 1 commit
  24. 31 Dec, 2020 1 commit
  25. 25 Dec, 2020 1 commit
    • [Performance] Use allocator from PyTorch if possible (#2328) · 9a7235fa
      Quan (Andy) Gan authored
      * first commit
      
      * some thoughts
      
      * move around
      
      * more commit
      
      * more fixes
      
      * now it uses torch allocator
      
      * fix symbol export error
      
      * fix
      
      * fixes
      
      * test fix
      
      * add script
      
      * building separate library per version
      
      * fix for vs2019
      
      * more fixes
      
      * fix on windows build
      
      * update jenkinsfile
      
      * auto copy built dlls for windows
      
      * lint and installation guide update
      
      * fix
      
      * specify conda environment
      
      * set environment for ci
      
      * fix
      
      * fix
      
      * fix
      
      * fix again
      
      * revert
      
      * fix cmake
      
      * fix
      
      * switch to using python interpreter path
      
      * remove scripts
      
      * debug
      
      * oops sorry
      
      * Update index.rst
      
      * Update index.rst
      
      * copies automatically, no need for this
      
      * do not print message if library not found
      
      * tiny fixes
      
      * debug on nightly
      
      * replace add_compile_definitions to make CMake 3.5 happy
      
      * fix linking to wrong lib for multiple pytorch envs
      
      * changed building strategy
      
      * fix nightly
      
      * fix windows
      
      * fix windows again
      
      * setup bugfix
      
      * address comments
      
      * change README
  26. 21 Dec, 2020 1 commit
  27. 17 Dec, 2020 1 commit
  28. 27 Nov, 2020 1 commit
  29. 17 Nov, 2020 1 commit
  30. 14 Nov, 2020 1 commit
  31. 13 Nov, 2020 1 commit
  32. 07 Nov, 2020 1 commit
  33. 30 Oct, 2020 1 commit
    • [Dataloading] Add class for copying tensors to/from the GPU on a non-default stream (#2284) · f673fc25
      nv-dlasalle authored
      * Add async transferer class
      
      * Add async ndarray copy interface
      
      * Add python bindings
      
      * Fix comment
      
      * Add python class
      
      * Fix linting issues
      
      * Add python unit test
      
      * Update python interface
      
      * move async_transferer to cuda only directory
      
      * Fix linting issue
      
      * Move out of contrib
      
      * Add doc strings
      
      * Move test compute from backend
      
      * Update comment
      
      * Fix test naming
      
      * Fix argument usage
      
      * Wrap/unwrap backend parameters
      
      * Move to dataloading
      
      * Move to 'dataloading'
      
      * Make GPU/CPU compatible
      
      * Fix unit tests
      
      * Add docs
      
      * Use only backend interface for datamovement in unit test
  34. 26 Aug, 2020 1 commit
  35. 10 Aug, 2020 1 commit
    • Fix the performance issue of graph partitioning in new DGLGraph (#1934) · 729ff2ef
      Da Zheng authored
      
      
      * fix perf.
      
      * fix.
      
      * accelerate metis.
      
      * fix lint.
      
      * use gklib.
      
      * fix perf.
      
      * fix.
      
      * update metis.
      
      * update launch script
      
      * handle synchronized API.
      
      * fix.
      
      * fix example.
      
      * fix dataloader.
      
      * temp fix.
      
      * temp fix omp.
      
      * distinguish roles.
      
      * initialize iterator of DistDataloader correctly.
      
      * check the correctness of launch script.
      
      * move feature copy to sampler.
      
      * measure mem/network copy time.
      
      * remove
      
      * Revert "measure mem/network copy time."
      
      This reverts commit 86cefdc14b7815fcf5aad6496af912dba48e4aa6.
      
      * fix.
      
      * fix
      
      * fix.
      
      * fix cmake.
      
      * disable metis in windows.
      
      * disable metis tests in windows.
      
      * remove test for multigraph.
      
      * fix test.
      
      * fix.
      
      * fix cmake.
      
      * fix.
      
      * revert.
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-115.us-west-2.compute.internal>
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-1.us-west-2.compute.internal>
  36. 09 Jul, 2020 1 commit
  37. 30 Jun, 2020 1 commit
  38. 28 Jun, 2020 1 commit
    • [CUDA][Kernel] More CUDA kernels; Standardize the behavior for sorted COO/CSR (#1704) · 870da747
      Minjie Wang authored
      * add cub; array cumsum
      
      * CSRSliceRows
      
      * fix warning
      
      * operator << for ndarray; CSRSliceRows
      
      * add CSRIsSorted
      
      * add csr_sort
      
      * inplace coosort and outplace csrsort
      
      * WIP: coo is sorted
      
      * mv cuda_utils
      
      * add AllTrue utility
      
      * csr sort
      
      * coo sort
      
      * coo2csr for sorted coo arrays
      
      * CSRToCOO from sorted
      
      * pass tests for the new kernel changes
      
      * cannot use inplace sort
      
      * lint
      
      * try fix msvc error
      
      * Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC
      
      * stash
      
      * revert some hack
      
      * revert some changes
      
      * address comments
      
      * fix
      
      * fix to_block unittest
      
      * add todo note
  39. 21 Jun, 2020 1 commit
    • [Op] Farthest Point Sampler in Cpp and CUDA (#1630) · 3d47693b
      Tong He authored
      * working framework without actual algorithm logic
      
      * rename
      
      * fix
      
      * fps passes compilation
      
      * correct algorithm
      
      * add cuda implementation
      
      * update random start
      
      * before refactor
      
      * pass compilation but cuda not working
      
      * working
      
      * code working, will add docstring
      
      * add mxnet support
      
      * update docstring
      
      * update doc and test
      
      * cpplint
      
      * cpplint
      
      * pylint
      
      * temporary fix
      
      * fix for win64
      
      * fix unittest
      
      * fix
      
      * fix
      
      * remove comment
      
      * move to geometry package
      
      * remove redundant include
      
      * add docstrings and comments
      
      * add proof
      
      * add validity check