  1. 06 Nov, 2022 1 commit
    • [Feature] Add bfloat16 (bf16) support (#4648) · 96297fb8
      Xin Yao authored
      * add bf16 specializations
      
      * remove SWITCH_BITS
      
      * enable amp for bf16
      
      * remove SWITCH_BITS for cpu kernels
      
      * enable bf16 based on CUDART
      
      * fix compiling for sm<80
      
      * fix cpu build
      
      * enable unit tests
      
      * update doc
      
      * disable test for CUDA < 11.0
      
      * address comments
      
      * address comments
  2. 19 Sep, 2022 1 commit
    • [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454) · cded5b80
      Xin Yao authored
      * rename `DLContext` to `DGLContext`
      
      * rename `kDLGPU` to `kDLCUDA`
      
      * replace DLTensor with DGLArray
      
      * fix linting
      
      * Unify DGLType and DLDataType to DGLDataType
      
      * Fix FFI
      
      * rename DLDeviceType to DGLDeviceType
      
      * decouple dlpack from the core library
      
      * fix bug
      
      * fix lint
      
      * fix merge
      
      * fix build
      
      * address comments
      
      * rename dl_converter to dlpack_convert
      
      * remove redundant comments
  3. 23 Feb, 2022 1 commit
    • [NN] Rework RelGraphConv and HGTConv (#3742) · 0227ddfb
      Minjie Wang authored
      * WIP: TypedLinear and new RelGraphConv
      
      * wip
      
      * further simplify RGCN
      
      * a bunch of tweaks for performance; add basic cpu support
      
      * update on segmm
      
      * wip: segment.cu
      
      * new backward kernel works
      
      * fix a bunch of bugs in kernel; leave idx_a for future
      
      * add nn test for typed_linear
      
      * rgcn nn test
      
      * bugfix in corner case; update RGCN README
      
      * doc
      
      * fix cpp lint
      
      * fix lint
      
      * fix ut
      
      * wip: hgtconv; presorted flag for rgcn
      
      * hgt code and ut; WIP: some fix on reorder graph
      
      * better typed linear init
      
      * fix ut
      
      * fix lint; add docstring
  4. 15 Feb, 2022 1 commit
    • [Feature] Gather mm (#3641) · b3d3a2c4
      Israt Nisa authored
      * init
      
      * init
      
      * working cublasGemm
      
      * benchmark high-mem/low-mem, err gather_mm output
      
      * cuda kernel for bmm like kernel
      
      * removed cpu copy for E_per_Rel
      
      * benchmark code from Minjie
      
      * fixed cublas results in gathermm sorted
      
      * use GPU shared mem in unsorted gather mm
      
      * minor
      
      * Added an optimal version of gather_mm_unsorted
      
      * lint
      
      * init gather_mm_scatter
      
      * cublas transpose added
      
      * fixed h_offset for multiple rel
      
      * backward unittest
      
      * cublas support to transpose W
      
      * adding missed file
      
      * forgot to add header file
      
      * lint
      
      * lint
      
      * cleanup
      
      * lint
      
      * docstring
      
      * lint
      
      * added unittest
      
      * lint
      
      * lint
      
      * unittest
      
      * changed err type
      
      * skip cpu test
      
      * skip CPU code
      
      * move in-len loop inside
      
      * lint
      
      * added check different dim length for B
      
      * w_per_len is optional now
      
      * moved gather_mm to pytorch/backend with backward support
      
      * removed a_/b_trans support
      
      * transpose op inside GEMM call
      
      * removed out alloc from API, changed W 2D to 3D
      
      * Added se_gather_mm, Separate API for sortedE
      
      * Fixed gather_mm (unsorted) user interface
      
      * unsorted gmm backward + separate CAPI for un/sorted A
      
      * typecast to float to support atomicAdd
      
      * lint typecast
      
      * lint
      
      * added gather_mm_scatter
      
      * minor
      
      * const
      
      * design changes
      
      * Added idx_a, idx_b support gmm_scatter
      
      * dgl doc
      
      * lint
      
      * adding gather_mm in ops
      
      * lint
      
      * lint
      
      * minor
      
      * removed benchmark files
      
      * minor
      
      * empty commit
      Co-authored-by: Israt Nisa <nisisrat@amazon.com>