1. 15 Sep, 2022 1 commit
    • Xin Yao's avatar
      [Feature] Import PyTorch's CUDA stream management (#4503) · 9a00cf19
      Xin Yao authored
      * add set_stream
      
      * add .record_stream for NDArray and HeteroGraph
      
      * refactor dgl stream Python APIs
      
      * test record_stream
      
      * add unit test for record stream
      
      * use pytorch's stream
      
      * fix lint
      
      * fix cpu build
      
      * address comments
      
      * address comments
      
      * add record stream tests for dgl.graph
      
      * record frames and update dataloder
      
      * add docstring
      
      * update frame
      
      * add backend check for record_stream
      
      * remove CUDAThreadEntry::stream
      
      * record stream for newly created formats
      
      * fix bug
      
      * fix cpp test
      
      * fix None c_void_p to c_handle
      9a00cf19
  2. 06 Sep, 2022 1 commit
    • Chang Liu's avatar
      [Feature] Unify the cuda stream used in core library (#4480) · 1c9d2a03
      Chang Liu authored
      
      
      * Use an internal cuda stream for CopyDataFromTo
      
      * small fix white space
      
      * Fix to compile
      
      * Make stream optional in copydata for compile
      
      * fix lint issue
      
      * Update cub functions to use internal stream
      
      * Lint check
      
      * Update CopyTo/CopyFrom/CopyFromTo to use internal stream
      
      * Address comments
      
      * Fix backward CUDA stream
      
      * Avoid overloading CopyFromTo()
      
      * Minor comment update
      
      * Overload copydatafromto in cuda device api
      Co-authored-by: default avatarxiny <xiny@nvidia.com>
      1c9d2a03
  3. 27 Jun, 2022 1 commit
    • ndickson-nvidia's avatar
      [Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c
      ndickson-nvidia authored
      * * Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
      * Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
      * Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas
      
      * * Added more specific error messages for unimplemented FP16 specializations of Xgeam, CSRGEMM, and CSRGEAM
      
      * * Added missing instantiation of DLDataTypeTraits<__half>::dtype
      
      * * Fixed linter error
      * Added clearer comment explaining why the cast to long long is necessary
      
      * * Worked around a compile error in some particular setup, where __half can't be constructed on the host side
      
      * * Fixed linter formatting errors
      
      * * Changes to comments as recommended
      
      * * Made recommended changes to logging errors in FP16 specializations
      * Also changed the existing Xgeam function for unsupported data types from LOG(INFO) to LOG(FATAL)
      a5d8460c
  4. 04 Nov, 2021 1 commit
  5. 27 Apr, 2021 1 commit
  6. 10 Sep, 2020 1 commit
  7. 30 Jul, 2020 1 commit
  8. 28 Jun, 2020 1 commit
    • Minjie Wang's avatar
      [CUDA][Kernel] More CUDA kernels; Standardize the behavior for sorted COO/CSR (#1704) · 870da747
      Minjie Wang authored
      * add cub; array cumsum
      
      * CSRSliceRows
      
      * fix warning
      
      * operator << for ndarray; CSRSliceRows
      
      * add CSRIsSorted
      
      * add csr_sort
      
      * inplace coosort and outplace csrsort
      
      * WIP: coo is sorted
      
      * mv cuda_utils
      
      * add AllTrue utility
      
      * csr sort
      
      * coo sort
      
      * coo2csr for sorted coo arrays
      
      * CSRToCOO from sorted
      
      * pass tests for the new kernel changes
      
      * cannot use inplace sort
      
      * lint
      
      * try fix msvc error
      
      * Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC
      
      * stash
      
      * revert some hack
      
      * revert some changes
      
      * address comments
      
      * fix
      
      * fix to_block unittest
      
      * add todo note
      870da747
  9. 19 Jun, 2020 1 commit
    • Minjie Wang's avatar
      [CUDA] Many CUDA operators; Prepare for DGLGraph on CUDA (#1660) · f1b19a6b
      Minjie Wang authored
      * add cuda utils; change g.to; add g.device
      
      * split array.h into several headers
      
      * cuda index select
      
      * file
      
      * three cuda kernels
      
      * add cuda elementwise arith and several others
      
      * cuda CSRIsNonZero
      
      * fix lint
      
      * lint
      
      * lint
      
      * fix bug in changing ctx to property
      
      * address comments
      
      * remove unused codes
      
      * address comments
      f1b19a6b
  10. 15 Jun, 2020 1 commit