  1. 24 Jul, 2023 1 commit
  2. 17 Apr, 2023 1 commit
  3. 11 Apr, 2023 1 commit
  4. 08 Mar, 2023 1 commit
    • [Refactor] Replace third_party/nccl with PyTorch's NCCL backend (#4989) · 8d5d8962
      Xin Yao authored
      * expose GeneratePermutation
      
      * add sparse_all_to_all_push
      
      * add sparse_all_to_all_pull
      
      * add unit test
      
      * handle world_size=1
      
      * remove python nccl wrapper
      
      * remove the nccl dependency
      
      * use pinned memory to speed up D2H copy
      
      * fix lint
      
      * resolve comments
      
      * fix lint
      
      * fix ut
      
      * resolve comments
  5. 09 Dec, 2022 1 commit
  6. 22 Nov, 2022 1 commit
    • [Feature] (La)yer-Neigh(bor) sampling implementation (#4668) · bf264d00
      Muhammed Fatih BALIN authored
      
      
      * adding LABOR sampling
      
      * add ladies and pladies samplers
      
      * fix compile error after rebase
      
      * add reference for ladies sampler
      
      * Improve ladies implementation.
      
      * weighted labor sampling initial implementation draft
      fix indentation and small bug in ladies script
      
      * importance_sampling currently doesn't work with weights
      
      * fix weighted importance sampling
      
      * move labor example into its own folder
      
      * lint fixes
      
      * Improve documentation
      
      * remove examples from the main PR
      
      * fix linting by not using c++17 features
      
      * fix documentation of labor_sampler.py
      
      * update documentation for labor.py
      
      * reformat the labor.py file with black
      
      * fix linting errors
      
      * replace exception use with if
      
      * fix typo in error comment
      
      * fixing win64 build for ci
      
      * fixing weighted implementation, works now.
      
      * fix bug in the weighted case and importance_sampling==0
      
      * address part of the reviews
      
      * remove unused code paths from cuda
      
      * remove unused code path from cpu side
      
      * remove extra features of labor making use of random seed.
      
      * fix exclude_edges bug
      
      * remove pcg and seed logic from cpu implementation, seed logic should still work for cuda.
      
      * minor style change
      
      * refactor CPU implementation, take out the importance_sampling probability computation into a function.
      
      * improve CUDAWorkspaceAllocator
      
      * refactor importance_sampling part out to a function
      
      * minor optimization
      
      * fix linting issue
      
      * Revert "remove pcg and seed logic from cpu implementation, seed logic should still work for cuda."
      
      This reverts commit c250e07ac6d7e13f57e79e8a2c2f098d777378c2.
      
      * Revert "remove extra features of labor making use of random seed."
      
      This reverts commit 7f99034353080308f4783f27d9a08bea343fb796.
      
      * fix the documentation
      
      * disable NIDs
      
      * improve the documentation in the code
      
      * use the stream argument in pcg32 instead of skipping ahead t times, can discard the use of hashmap now since it is faster this way.
      
      * fix linting issue
      
      * address another round of reviews
      
      * further optimize CPU LABOR sampling implementation
      
      * fix linting error
      
      * update the comment
      
      * reformat
      
      * rename and rephrase comment
      
      * fix formatting according to new linting specs
      
      * fix compile error due to renaming, fix linting.
      
      * lint
      
      * rename DGLHeteroGraph to DGLGraph to match master
      
      * replace other occurrences of DGLHeteroGraph to DGLGraph
      Co-authored-by: Muhammed Fatih BALIN <m.f.balin@gmail.com>
      Co-authored-by: Kaan Sancak <kaansnck@gmail.com>
      Co-authored-by: Quan Gan <coin2028@hotmail.com>
  7. 10 Nov, 2022 1 commit
  8. 07 Nov, 2022 2 commits
  9. 06 Nov, 2022 1 commit
  10. 04 Nov, 2022 1 commit
  11. 21 Sep, 2022 1 commit
  12. 19 Sep, 2022 1 commit
    • [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454) · cded5b80
      Xin Yao authored
      * rename `DLContext` to `DGLContext`
      
      * rename `kDLGPU` to `kDLCUDA`
      
      * replace DLTensor with DGLArray
      
      * fix linting
      
      * Unify DGLType and DLDataType to DGLDataType
      
      * Fix FFI
      
      * rename DLDeviceType to DGLDeviceType
      
      * decouple dlpack from the core library
      
      * fix bug
      
      * fix lint
      
      * fix merge
      
      * fix build
      
      * address comments
      
      * rename dl_converter to dlpack_convert
      
      * remove redundant comments
  13. 15 Sep, 2022 1 commit
    • [Feature] Import PyTorch's CUDA stream management (#4503) · 9a00cf19
      Xin Yao authored
      * add set_stream
      
      * add .record_stream for NDArray and HeteroGraph
      
      * refactor dgl stream Python APIs
      
      * test record_stream
      
      * add unit test for record stream
      
      * use pytorch's stream
      
      * fix lint
      
      * fix cpu build
      
      * address comments
      
      * address comments
      
      * add record stream tests for dgl.graph
      
      * record frames and update dataloader
      
      * add docstring
      
      * update frame
      
      * add backend check for record_stream
      
      * remove CUDAThreadEntry::stream
      
      * record stream for newly created formats
      
      * fix bug
      
      * fix cpp test
      
      * fix None c_void_p to c_handle
  14. 06 Sep, 2022 1 commit
    • [Feature] Unify the cuda stream used in core library (#4480) · 1c9d2a03
      Chang Liu authored
      
      
      * Use an internal cuda stream for CopyDataFromTo
      
      * small fix white space
      
      * Fix to compile
      
      * Make stream optional in copydata for compile
      
      * fix lint issue
      
      * Update cub functions to use internal stream
      
      * Lint check
      
      * Update CopyTo/CopyFrom/CopyFromTo to use internal stream
      
      * Address comments
      
      * Fix backward CUDA stream
      
      * Avoid overloading CopyFromTo()
      
      * Minor comment update
      
      * Overload copydatafromto in cuda device api
      Co-authored-by: xiny <xiny@nvidia.com>
  15. 31 Aug, 2022 1 commit
    • [Feature] Make TensorAdapter Stream Aware (#4472) · 2b766740
      Xin Yao authored
      * Allocate tensors in DGL's current stream
      
      * make tensoradaptor stream-aware
      
      * replace TAempty with cpu allocator
      
      * fix typo
      
      * try fix cpu allocation
      
      * clean header
      
      * redirect AllocDataSpace as well
      
      * resolve comments
  16. 15 Aug, 2022 1 commit
  17. 09 Jul, 2022 1 commit
  18. 07 Jul, 2022 1 commit
  19. 29 Jun, 2022 1 commit
  20. 11 Jun, 2022 1 commit
  21. 06 Jun, 2022 1 commit
  22. 12 May, 2022 1 commit
  23. 21 Feb, 2022 1 commit
    • [Bugfix] Bug fixes in new dataloader (#3727) · 3f138eba
      Quan (Andy) Gan authored
      
      
      * fixes
      
      * fix
      
      * more fixes
      
      * update
      
      * oops
      
      * lint?
      
      * temporarily revert - will fix in another PR
      
      * more fixes
      
      * skipping mxnet test
      
      * address comments
      
      * fix DDP
      
      * fix edge dataloader exclusion problems
      
      * stupid bug
      
      * fix
      
      * use_uvm option
      
      * fix
      
      * fixes
      
      * fixes
      
      * fixes
      
      * fixes
      
      * add evaluation for cluster gcn and ddp
      
      * stupid bug again
      
      * fixes
      
      * move sanity checks to only support DGLGraphs
      
      * pytorch lightning compatibility fixes
      
      * remove
      
      * poke
      
      * more fixes
      
      * fix
      
      * fix
      
      * disable test
      
      * docstrings
      
      * why is it getting a memory leak?
      
      * fix
      
      * update
      
      * updates and temporarily disable forkingpickler
      
      * update
      
      * fix?
      
      * fix?
      
      * oops
      
      * oops
      
      * fix
      
      * lint
      
      * huh
      
      * uh
      
      * update
      
      * fix
      
      * made it memory efficient
      
      * refine exclude interface
      
      * fix tutorial
      
      * fix tutorial
      
      * fix graph duplication in CPU dataloader workers
      
      * lint
      
      * lint
      
      * Revert "lint"
      
      This reverts commit 805484dd553695111b5fb37f2125214a6b7276e9.
      
      * Revert "lint"
      
      This reverts commit 0bce411b2b415c2ab770343949404498436dc8b2.
      
      * Revert "fix graph duplication in CPU dataloader workers"
      
      This reverts commit 9e3a8cf34c175d3093c773f6bb023b155f2bd27f.
      Co-authored-by: xiny <xiny@nvidia.com>
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
  24. 21 Jan, 2022 1 commit
    • [Feature] Pin dgl.graph to the page-locked memory (#3616) · 40b44a43
      Xin Yao authored
      
      
      * implement pin_memory/unpin_memory/is_pinned for dgl.graph
      
      * update python docstring
      
      * update c++ docstring
      
      * add test
      
      * fix the broken UnifiedTensor
      
      * eliminate extra context parameter for pin/unpin
      
      * fix linting
      
      * fix typo
      
      * disable new format materialization for pinned graphs
      
      * update python doc for pin_memory_
      
      * fix unit test
      
      * update doc
      
      * change unitgraph and heterograph's PinMemory to in-place
      
      * update comments for NDArray's PinMemory_ and PinData
      
      * update doc
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
  25. 18 Oct, 2021 1 commit
  26. 15 Oct, 2021 1 commit
  27. 29 Sep, 2021 1 commit
  28. 06 Sep, 2021 1 commit
  29. 19 Aug, 2021 1 commit
  30. 16 Jul, 2021 1 commit
  31. 27 Jun, 2021 1 commit
    • [Build] Make nccl optional (#3056) · 9664cdff
      Jinjing Zhou authored
      * fix
      
      * remove nvidiasmi
      
      * fix
      
      * fix docs
      
      * fix
      
      * fix
      
      * 1
      
      * fix
      
      * remove
      
      * skip deprecated kernel
      
      * fix
      
      * Revert "skip deprecated kernel"
      
      This reverts commit c5ceb7f60dbbaf065b81cc3680757fd611d90ad3.
      
      * fix
  32. 23 Jun, 2021 1 commit
  33. 11 Jun, 2021 1 commit
    • [Feature] Allow using NCCL for communication in dgl.NodeEmbedding and dgl.SparseOptimizer (#2824) · 17d604b5
      nv-dlasalle authored
      
      
      * Split from NCCL PR
      
      * Fix type in comment
      
      * Expand documentation for sparse_all_to_all_push
      
      * Restore previous behavior in example
      
      * Re-work optimizer to use NCCL based on gradient location
      
      * Allow for running with embedding on CPU but using NCCL for gradient exchange
      
      * Optimize single partition case
      
      * Fix pylint errors
      
      * Add missing include
      
      * fix gradient indexing
      
      * Fix line continuation
      
      * Migrate 'first_step'
      
      * Skip tests without enough GPUs to run NCCL
      
      * Improve empty tensor handling for pytorch 1.5
      
      * Fix indentation
      
      * Allow multiple NCCL communicators to coexist
      
      * Improve handling of empty message
      
      * Update python/dgl/nn/pytorch/sparse_emb.py
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      
      * Update python/dgl/nn/pytorch/sparse_emb.py
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      
      * Keep empty tensor dimensionless
      
      * th.empty -> th.tensor
      
      * Preserve shape for empty non-zero dimension tensors
      
      * Use shared state, when embedding is shared
      
      * Add support for gathering an embedding
      
      * Fix typo
      
      * Fix more typos
      
      * Fix backend call
      
      * Use NodeDataLoader to take advantage of ddp
      
      * Update training script to share memory
      
      * Only squeeze last dimension
      
      * Better handle empty message
      
      * Keep embedding on the target device GPU if dgl_sparse is false in RGCN example
      
      * Fix typo in comment
      
      * Add asserts
      
      * Improve documentation in example
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
  34. 20 May, 2021 1 commit
    • [Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825) · ae8dbe6d
      nv-dlasalle authored
      
      * Split NCCL wrapper from sparse optimizer and sparse embedding
      
      * Add more unit tests for single node nccl
      
      * Fix unit test for tf
      
      * Switch to device histogram
      
      * Fix histogram issues
      
      * Finish migration to histogram
      
      * Handle cases with zero send/receive data
      
      * Start on partition object
      
      * Get compiling
      
      * Updates
      
      * Add unit tests
      
      * Switch to partition object
      
      * Fix linting issues
      
      * Rename partition file
      
      * Add python doc
      
      * Fix python assert and finish doxygen comments
      
      * Remove stubs for range based partition to satisfy pylint
      
      * Wrap unit test in GPU only
      
      * Wrap explicit cuda call in ifdef
      
      * Merge with partition.py
      
      * update docstrings
      
      * Cleanup partition_op
      
      * Add Workspace object
      
      * Switch to using workspace object
      
      * Move last remainder based function out of nccl_api
      
      * Add error messages
      
      * Update docs with examples
      
      * Fix linting errors
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
  35. 22 Mar, 2021 1 commit
  36. 09 Mar, 2021 1 commit
  37. 08 Feb, 2021 1 commit
    • [Sampling] Implement `dgl.to_block()` for the GPU (#2339) · bc3a532f
      nv-dlasalle authored
      
      
      * Add start of to_block gpu implementation
      
      * Pull in more changes from 0.4.2 cuda_to_block
      
      * Move more code to IdArray
      
      * Refactor DeviceNodeMapMaker
      
      * Updates
      
      * get compiling
      
      * Integrate to_block
      
      * Fix ID allocation
      
      * Minor fixes
      
      * Cleanup cuda calls to use cuda_common
      
      * Reduce kernel calls
      
      * Lint cleanup
      
      * Expand documentation
      
      * Remove unused function
      
      * Rename variables for consistency
      
      * Add doxygen comments
      
      * Fix file extension
      
      * Remove raw asynccopy for deviceapi
      
      * Remove unused function
      
      * Fix block/tile configuration
      
      * Add cuda_device_common.cuh
      
      * Add basic hashtable
      
      * Migrate part of hashtable
      
      * Refactor to use external hashtable
      
      * Make functions members
      
      * Format hash table functions
      
      * Migrate duplicate filling
      
      * Move last function over
      
      * Refactor with cu file
      
      * lint c++ code
      
      * Move context check to C++ code
      
      * Use macro switch
      
      * Add missing files
      
      * Update docstring
      
      * update docs
      
      * Move atomic functions
      
      * Refactor hashtable
      
      * Fix linting
      
      * Expand docs
      
      * Fix mismatched argument names
      
      * Switch doxygen comments from using @param to \param
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
      Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
  38. 29 Jan, 2021 1 commit
  39. 28 Jan, 2021 1 commit