1. 06 Nov, 2022 1 commit
    • [Feature] Add bfloat16 (bf16) support (#4648) · 96297fb8
      Xin Yao authored
      * add bf16 specializations
      
      * remove SWITCH_BITS
      
      * enable amp for bf16
      
      * remove SWITCH_BITS for cpu kernels
      
      * enable bf16 based on CUDART
      
      * fix compiling for sm<80
      
      * fix cpu build
      
      * enable unit tests
      
      * update doc
      
      * disable test for CUDA < 11.0
      
      * address comments
      
      * address comments
  2. 03 Nov, 2022 1 commit
  3. 19 Sep, 2022 1 commit
    • [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454) · cded5b80
      Xin Yao authored
      * rename `DLContext` to `DGLContext`
      
      * rename `kDLGPU` to `kDLCUDA`
      
      * replace DLTensor with DGLArray
      
      * fix linting
      
      * Unify DGLType and DLDataType to DGLDataType
      
      * Fix FFI
      
      * rename DLDeviceType to DGLDeviceType
      
      * decouple dlpack from the core library
      
      * fix bug
      
      * fix lint
      
      * fix merge
      
      * fix build
      
      * address comments
      
      * rename dl_converter to dlpack_convert
      
      * remove redundant comments
  4. 15 Sep, 2022 1 commit
    • [Feature] Import PyTorch's CUDA stream management (#4503) · 9a00cf19
      Xin Yao authored
      * add set_stream
      
      * add .record_stream for NDArray and HeteroGraph
      
      * refactor dgl stream Python APIs
      
      * test record_stream
      
      * add unit test for record stream
      
      * use pytorch's stream
      
      * fix lint
      
      * fix cpu build
      
      * address comments
      
      * address comments
      
      * add record stream tests for dgl.graph
      
      * record frames and update dataloader
      
      * add docstring
      
      * update frame
      
      * add backend check for record_stream
      
      * remove CUDAThreadEntry::stream
      
      * record stream for newly created formats
      
      * fix bug
      
      * fix cpp test
      
      * fix None c_void_p to c_handle
  5. 27 Jun, 2022 1 commit
    • [Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c
      ndickson-nvidia authored
      * Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
      * Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
      * Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas
      
      * Added more specific error messages for unimplemented FP16 specializations of `Xgeam`, `CSRGEMM`, and `CSRGEAM`
      
      * Added missing instantiation of `DLDataTypeTraits<__half>::dtype`
      
      * Fixed linter error
      * Added clearer comment explaining why the cast to `long long` is necessary
      
      * Worked around a compile error in some particular setup, where `__half` can't be constructed on the host side
      
      * Fixed linter formatting errors
      
      * Changes to comments as recommended
      
      * Made recommended changes to logging errors in FP16 specializations
      * Also changed the existing `Xgeam` function for unsupported data types from `LOG(INFO)` to `LOG(FATAL)`
  6. 17 May, 2021 1 commit
  7. 27 Apr, 2021 1 commit