1. 29 Feb, 2024 1 commit
  2. 17 May, 2023 1 commit
  3. 09 Dec, 2022 1 commit
  4. 07 Nov, 2022 1 commit
  5. 06 Nov, 2022 1 commit
  6. 03 Nov, 2022 2 commits
  7. 28 Oct, 2022 1 commit
  8. 19 Sep, 2022 1 commit
    • Xin Yao's avatar
      [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454) · cded5b80
      Xin Yao authored
      * rename `DLContext` to `DGLContext`
      
      * rename `kDLGPU` to `kDLCUDA`
      
      * replace DLTensor with DGLArray
      
      * fix linting
      
      * Unify DGLType and DLDataType to DGLDataType
      
      * Fix FFI
      
      * rename DLDeviceType to DGLDeviceType
      
      * decouple dlpack from the core library
      
      * fix bug
      
      * fix lint
      
      * fix merge
      
      * fix build
      
      * address comments
      
      * rename dl_converter to dlpack_convert
      
      * remove redundant comments
      cded5b80
  9. 15 Sep, 2022 1 commit
    • Xin Yao's avatar
      [Feature] Import PyTorch's CUDA stream management (#4503) · 9a00cf19
      Xin Yao authored
      * add set_stream
      
      * add .record_stream for NDArray and HeteroGraph
      
      * refactor dgl stream Python APIs
      
      * test record_stream
      
      * add unit test for record stream
      
      * use pytorch's stream
      
      * fix lint
      
      * fix cpu build
      
      * address comments
      
      * address comments
      
      * add record stream tests for dgl.graph
      
      * record frames and update dataloder
      
      * add docstring
      
      * update frame
      
      * add backend check for record_stream
      
      * remove CUDAThreadEntry::stream
      
      * record stream for newly created formats
      
      * fix bug
      
      * fix cpp test
      
      * fix None c_void_p to c_handle
      9a00cf19
  10. 06 Sep, 2022 1 commit
    • Chang Liu's avatar
      [Feature] Unify the cuda stream used in core library (#4480) · 1c9d2a03
      Chang Liu authored
      
      
      * Use an internal cuda stream for CopyDataFromTo
      
      * small fix white space
      
      * Fix to compile
      
      * Make stream optional in copydata for compile
      
      * fix lint issue
      
      * Update cub functions to use internal stream
      
      * Lint check
      
      * Update CopyTo/CopyFrom/CopyFromTo to use internal stream
      
      * Address comments
      
      * Fix backward CUDA stream
      
      * Avoid overloading CopyFromTo()
      
      * Minor comment update
      
      * Overload copydatafromto in cuda device api
      Co-authored-by: default avatarxiny <xiny@nvidia.com>
      1c9d2a03
  11. 29 Jul, 2022 1 commit
    • Xin Yao's avatar
      [Feature] Add CUDA Weighted Neighborhood Sampling (#4064) · 86c81b4e
      Xin Yao authored
      
      
      * add weighted sampling without replacement (A-Chao)
      
      * improve Algorithm A-Chao with block-wise prefix sum
      
      * correctly fill out_idxs
      
      * implement weighted sampling with replacement
      
      * small fix
      
      * merge host-side code of weighted/uniform sampling
      
      * enable unit tests for cuda weighted sampling
      
      * move thrust/cub wrapper to the cmake file
      
      * update docs accordingly
      
      * fix linting
      
      * fix linting
      
      * fix unit test
      
      * Bump external CUB/Thrust versions
      
      * Fix code style and update description of algorithm design
      
      * [Feature] GPU support weighted graph neighbor sampling
      commit by pengqirong(OPPO)
      
      * merge pengqirong's implementation
      
      * revert the change to cub and thrust
      
      * fix linting
      
      * use DeviceSegmentedSort for better performance
      
      * add more comments
      
      * add necessary notes
      
      * add necessary notes
      
      * resolve some comments
      
      * define THRUST_CUB_WRAPPED_NAMESPACE
      
      * fix doc
      Co-authored-by: default avatar彭齐荣 <657017034@qq.com>
      86c81b4e
  12. 17 May, 2022 1 commit
  13. 10 Mar, 2022 1 commit
  14. 09 Feb, 2022 1 commit
    • Xin Yao's avatar
      [Feature] CUDA UVA sampling for MultiLayerNeighborSampler (#3674) · 738e8318
      Xin Yao authored
      
      
      * implement pin_memory/unpin_memory/is_pinned for dgl.graph
      
      * update python docstring
      
      * update c++ docstring
      
      * add test
      
      * fix the broken UnifiedTensor
      
      * XPU_SWITCH for kDLCPUPinned
      
      * a rough version ready for testing
      
      * eliminate extra context parameter for pin/unpin
      
      * update train_sampling
      
      * fix linting
      
      * fix typo
      
      * multi-gpu uva sampling case
      
      * disable new format materialization for pinned graphs
      
      * update python doc for pin_memory_
      
      * fix unit test
      
      * UVA sampling for link prediction
      
      * dispatch most csr ops
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update doc
      
      * update examples
      
      * change unitgraph and heterograph's PinMemory to in-place
      
      * update examples for multi-gpu uva sampling
      
      * update doc
      
      * fix linting
      
      * fix cpu build
      
      * fix is_pinned for DistGraph
      
      * fix is_pinned for DistGraph
      
      * update graphsage unsupervised example
      
      * update doc for gpu sampling
      
      * update some check for sampling device switching
      
      * fix linting
      
      * adapt for new dataloader
      
      * fix linting
      
      * fix
      
      * fix some name issue
      
      * adjust device check
      
      * add unit test for uva sampling & fix some zero_copy bug
      
      * fix linting
      
      * update num_threads in graphsage examples
      Co-authored-by: default avatarQuan (Andy) Gan <coin2028@hotmail.com>
      Co-authored-by: default avatarJinjing Zhou <VoVAllen@users.noreply.github.com>
      738e8318
  15. 06 Sep, 2021 1 commit
  16. 02 Aug, 2021 1 commit
  17. 08 Jul, 2021 1 commit
  18. 15 Apr, 2021 1 commit
    • nv-dlasalle's avatar
      [Performance][GPU] Enable GPU uniform edge sampling (#2716) · e70138bb
      nv-dlasalle authored
      
      
      * Start on uniform GPU sampling
      
      * Save more work
      
      * Get cu file compiling
      
      * Update sampling
      
      * More changes
      
      * Get GPU sampling for uniform probabilities solved
      
      * Fix batch tensor migration
      
      * Fix
      
      * update kernels
      
      * expand blocking
      
      * Undo testing change
      
      * Cut down on sampling overhead
      
      * Fix replacement
      
      * Update unit tests
      
      * Add option to gpu sample in graphsage
      
      * Copy only csc to gpu
      
      * Add ogbn support
      
      * Fix linting
      
      * Remove nvtx from sample
      
      * Improve documentation and error checking
      
      * Expand documentation
      
      * Update assert checking
      
      * delete extra space
      
      * Use standard dataloader when dataset is a dictionary
      
      * ogb -> ogbn
      
      * Fix edge selection determinism
      
      * Fix typos
      
      * Remove nvtx
      
      * Add comment for self.fanout_arrays and assert
      
      * Fix linting
      
      * Migrate to scalarbatcher
      
      * Fix indentation
      
      * Fix batcher
      
      * Fix indexing
      
      * Only use databatcher for GPU
      
      * Convert to DGL NDArray to PyTorch Tensor
      
      * Add optimization for PyTorch's F.tensor() for list of GPU tensors
      Co-authored-by: default avatarDa Zheng <zhengda1936@gmail.com>
      e70138bb