1. 09 Aug, 2022 1 commit
  2. 29 Jul, 2022 1 commit
    • [Feature] Add CUDA Weighted Neighborhood Sampling (#4064) · 86c81b4e
      Xin Yao authored
      
      
      * add weighted sampling without replacement (A-Chao)
      
      * improve Algorithm A-Chao with block-wise prefix sum
      
      * correctly fill out_idxs
      
      * implement weighted sampling with replacement
      
      * small fix
      
      * merge host-side code of weighted/uniform sampling
      
      * enable unit tests for cuda weighted sampling
      
      * move thrust/cub wrapper to the cmake file
      
      * update docs accordingly
      
      * fix linting
      
      * fix linting
      
      * fix unit test
      
      * Bump external CUB/Thrust versions
      
      * Fix code style and update description of algorithm design
      
      * [Feature] GPU support weighted graph neighbor sampling
      commit by pengqirong(OPPO)
      
      * merge pengqirong's implementation
      
      * revert the change to cub and thrust
      
      * fix linting
      
      * use DeviceSegmentedSort for better performance
      
      * add more comments
      
      * add necessary notes
      
      * add necessary notes
      
      * resolve some comments
      
      * define THRUST_CUB_WRAPPED_NAMESPACE
      
      * fix doc
      Co-authored-by: 彭齐荣 <657017034@qq.com>
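For readers unfamiliar with the Algorithm A-Chao named in the messages above, here is a minimal Python sketch of its basic form (weighted reservoir sampling without replacement). It is an illustration only, not the CUDA kernel from this commit: the function name `a_chao_sample` and its parameters are hypothetical, the sketch skips the usual correction for "overweight" items whose inclusion probability would exceed 1, and the first `k` items fill the reservoir regardless of weight. The running total it keeps is a prefix sum over the weights, which is presumably the quantity the block-wise prefix sum step computes in parallel.

```python
import random


def a_chao_sample(weights, k, rng=None):
    """Pick up to k distinct indices, favoring larger weights (basic A-Chao).

    Hypothetical helper for illustration; not part of any library API.
    """
    rng = rng or random.Random()
    if k <= 0:
        return []
    reservoir = []  # indices currently selected
    total = 0.0     # running (prefix) sum of the weights seen so far
    for i, w in enumerate(weights):
        if w < 0:
            raise ValueError("weights must be non-negative")
        total += w
        if len(reservoir) < k:
            # The first k items always enter the reservoir.
            reservoir.append(i)
        elif total > 0:
            # Item i enters with probability k * w_i / W_i, evicting a
            # uniformly chosen current member of the reservoir.
            if rng.random() < k * w / total:
                reservoir[rng.randrange(k)] = i
    return reservoir


if __name__ == "__main__":
    print(a_chao_sample([0.1, 2.0, 0.5, 3.0, 1.0, 0.2], 3, random.Random(0)))
```

Each index is visited once and inserted at most once, so the result never contains duplicates. When fewer than `k` items are available, all of them are returned.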
  3. 15 Jul, 2022 1 commit
  4. 27 Jun, 2022 1 commit
    • [Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c
      ndickson-nvidia authored
      * Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
      * Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
      * Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cuBLAS

      * Added more specific error messages for unimplemented FP16 specializations of `Xgeam`, `CSRGEMM`, and `CSRGEAM`

      * Added missing instantiation of `DLDataTypeTraits<__half>::dtype`

      * Fixed linter error
      * Added clearer comment explaining why the cast to `long long` is necessary

      * Worked around a compile error in some particular setup, where `__half` can't be constructed on the host side

      * Fixed linter formatting errors

      * Changes to comments as recommended

      * Made recommended changes to logging errors in FP16 specializations
      * Also changed the existing `Xgeam` function for unsupported data types from `LOG(INFO)` to `LOG(FATAL)`
  5. 24 Jun, 2022 1 commit
    • [Performance][Optimizer] Enable using UVA and FP16 with SparseAdam Optimizer (#3885) · 020f0249
      nv-dlasalle authored
      
      
      * Add uva by default to embedding
      
      * More updates
      
      * Update optimizer
      
      * Add new uva functions
      
      * Expose new pinned memory function
      
      * Add unit tests
      
      * Update formatting
      
      * Fix unit test
      
      * Handle auto UVA case when training is on CPU
      
      * Allow per-embedding decisions for whether to use UVA
      
      * Address sparse_optim.py comments
      
      * Remove unused templates
      
      * Update unit test
      
      * Use dgl allocate memory for pinning
      
      * allow automatic unpinning
      
      * workaround for d2h copy with a different dtype
      
      * fix linting
      
      * update error message
      
      * update copyright
      Co-authored-by: Xin Yao <xiny@nvidia.com>
      Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
  6. 23 Jun, 2022 1 commit
    • [Fix] Fix compiler warnings - part 1 (#4051) · 1ad65879
      Triston authored
      
      
      * Fix a cub compile error for CUDA 11.5
      
      * Fix comparison of integer expressions of different signedness in coo_sort.cu file
      
      * Fix comparison of integer expressions of different signedness in cuda_compact_graph.cu file
      
      * Remove never referenced variable in spmm.cu
      
      * Fix comparison of integer expressions of different signedness in rowwise_pick.h file
      
      * Fix comparison of integer expressions of different signedness in choice.cc file
      
      * Remove never referenced variable col_data in spmat_op_impl_coo.cc
      
      * Remove never referenced variable allowed in global_uniform.cc
      
      * Fix comparison of integer expressions of different signedness in graph.cc
      
      * Fix comparison of integer expressions of different signedness in graph_apis.cc
      
      * Fix the unused ctx variable in ndarray_partition.cc file for CPU-only build
      
      * Fix comparison of integer expressions of different signedness in libra_partition.cc
      
      * Fix comparison of integer expressions of different signedness in graph_op.cc
      Co-authored-by: Triston Cao <tristonc@nvidia.com>
      Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
  7. 14 Jun, 2022 1 commit
  8. 11 Jun, 2022 1 commit
  9. 07 Jun, 2022 1 commit
  10. 06 Jun, 2022 2 commits
  11. 26 May, 2022 1 commit
  12. 17 May, 2022 1 commit
  13. 16 May, 2022 1 commit
  14. 26 Apr, 2022 1 commit
  15. 10 Mar, 2022 1 commit
  16. 28 Feb, 2022 1 commit
  17. 23 Feb, 2022 1 commit
    • [NN] Rework RelGraphConv and HGTConv (#3742) · 0227ddfb
      Minjie Wang authored
      * WIP: TypedLinear and new RelGraphConv
      
      * wip
      
      * further simplify RGCN
      
      * a bunch of tweaks for performance; add basic cpu support
      
      * update on segmm
      
      * wip: segment.cu
      
      * new backward kernel works
      
      * fix a bunch of bugs in kernel; leave idx_a for future
      
      * add nn test for typed_linear
      
      * rgcn nn test
      
      * bugfix in corner case; update RGCN README
      
      * doc
      
      * fix cpp lint
      
      * fix lint
      
      * fix ut
      
      * wip: hgtconv; presorted flag for rgcn
      
      * hgt code and ut; WIP: some fix on reorder graph
      
      * better typed linear init
      
      * fix ut
      
      * fix lint; add docstring
  18. 21 Feb, 2022 1 commit
    • [Bugfix] Bug fixes in new dataloader (#3727) · 3f138eba
      Quan (Andy) Gan authored
      
      
      * fixes
      
      * fix
      
      * more fixes
      
      * update
      
      * oops
      
      * lint?
      
      * temporarily revert - will fix in another PR
      
      * more fixes
      
      * skipping mxnet test
      
      * address comments
      
      * fix DDP
      
      * fix edge dataloader exclusion problems
      
      * stupid bug
      
      * fix
      
      * use_uvm option
      
      * fix
      
      * fixes
      
      * fixes
      
      * fixes
      
      * fixes
      
      * add evaluation for cluster gcn and ddp
      
      * stupid bug again
      
      * fixes
      
      * move sanity checks to only support DGLGraphs
      
      * pytorch lightning compatibility fixes
      
      * remove
      
      * poke
      
      * more fixes
      
      * fix
      
      * fix
      
      * disable test
      
      * docstrings
      
      * why is it getting a memory leak?
      
      * fix
      
      * update
      
      * updates and temporarily disable forkingpickler
      
      * update
      
      * fix?
      
      * fix?
      
      * oops
      
      * oops
      
      * fix
      
      * lint
      
      * huh
      
      * uh
      
      * update
      
      * fix
      
      * made it memory efficient
      
      * refine exclude interface
      
      * fix tutorial
      
      * fix tutorial
      
      * fix graph duplication in CPU dataloader workers
      
      * lint
      
      * lint
      
      * Revert "lint"
      
      This reverts commit 805484dd553695111b5fb37f2125214a6b7276e9.
      
      * Revert "lint"
      
      This reverts commit 0bce411b2b415c2ab770343949404498436dc8b2.
      
      * Revert "fix graph duplication in CPU dataloader workers"
      
      This reverts commit 9e3a8cf34c175d3093c773f6bb023b155f2bd27f.
      Co-authored-by: xiny <xiny@nvidia.com>
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
  19. 18 Feb, 2022 1 commit
  20. 15 Feb, 2022 1 commit
    • [Feature] Gather mm (#3641) · b3d3a2c4
      Israt Nisa authored
      
      
      * init
      
      * init
      
      * working cublasGemm
      
      * benchmark high-mem/low-mem, err gather_mm output
      
      * cuda kernel for bmm like kernel
      
      * removed cpu copy for E_per_Rel
      
      * benchmark code from Minjie
      
      * fixed cublas results in gathermm sorted
      
      * use GPU shared mem in unsorted gather mm
      
      * minor
      
      * Added an optimal version of gather_mm_unsorted
      
      * lint
      
      * init gather_mm_scatter
      
      * cublas transpose added
      
      * fixed h_offset for multiple rel
      
      * backward unittest
      
      * cublas support to transpose W
      
      * adding missed file
      
      * forgot to add header file
      
      * lint
      
      * lint
      
      * cleanup
      
      * lint
      
      * docstring
      
      * lint
      
      * added unittest
      
      * lint
      
      * lint
      
      * unittest
      
      * changed err type
      
      * skip cpu test
      
      * skip CPU code
      
      * move in-len loop inside
      
      * lint
      
      * added check different dim length for B
      
      * w_per_len is optional now
      
      * moved gather_mm to pytorch/backend with backward support
      
      * removed a_/b_trans support
      
      * transpose op inside GEMM call
      
      * removed out alloc from API, changed W 2D to 3D
      
      * Added se_gather_mm, Separate API for sortedE
      
      * Fixed gather_mm (unsorted) user interface
      
      * unsorted gmm backward + separate CAPI for un/sorted A
      
      * typecast to float to support atomicAdd
      
      * lint typecast
      
      * lint
      
      * added gather_mm_scatter
      
      * minor
      
      * const
      
      * design changes
      
      * Added idx_a, idx_b support gmm_scatter
      
      * dgl doc
      
      * lint
      
      * adding gather_mm in ops
      
      * lint
      
      * lint
      
      * minor
      
      * removed benchmark files
      
      * minor
      
      * empty commit
      Co-authored-by: Israt Nisa <nisisrat@amazon.com>
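The messages above describe a gather-then-multiply operator with a 3D weight tensor and an index array `idx_b`. As a rough reference for what such an operator computes, the pure-Python sketch below assumes the semantics `out[i] = a[i] @ b[idx_b[i]]`, where `a` is an N x D1 matrix and `b` is a stack of R matrices of shape D1 x D2. The helper names `matvec_row` and `gather_mm_reference` are hypothetical, and this is not the cuBLAS or CUDA code path the commit adds.

```python
def matvec_row(row, mat):
    """Multiply a length-D1 row vector by a D1 x D2 matrix (lists of lists)."""
    if len(row) != len(mat):
        raise ValueError("inner dimensions do not match")
    d2 = len(mat[0])
    return [sum(row[k] * mat[k][j] for k in range(len(row))) for j in range(d2)]


def gather_mm_reference(a, b, idx_b):
    """Assumed semantics: out[i] = a[i] @ b[idx_b[i]].

    a:     N rows, each of length D1
    b:     R weight matrices, each D1 x D2
    idx_b: N integers in [0, R), selecting the weight matrix for each row
    """
    if len(a) != len(idx_b):
        raise ValueError("a and idx_b must have the same length")
    return [matvec_row(a[i], b[idx_b[i]]) for i in range(len(a))]


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6]]
    b = [
        [[1, 0], [0, 1]],  # weight 0: identity
        [[2, 0], [0, 2]],  # weight 1: scale by 2
    ]
    print(gather_mm_reference(a, b, [0, 1, 0]))
```

When rows that share a weight index are stored contiguously (the "sorted" case in the messages), the same result can be obtained with one dense matrix multiply per segment, which is likely why the log distinguishes sorted from unsorted variants.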
  21. 09 Feb, 2022 1 commit
    • [Feature] CUDA UVA sampling for MultiLayerNeighborSampler (#3674) · 738e8318
      Xin Yao authored
      
      
      * implement pin_memory/unpin_memory/is_pinned for dgl.graph
      
      * update python docstring
      
      * update c++ docstring
      
      * add test
      
      * fix the broken UnifiedTensor
      
      * XPU_SWITCH for kDLCPUPinned
      
      * a rough version ready for testing
      
      * eliminate extra context parameter for pin/unpin
      
      * update train_sampling
      
      * fix linting
      
      * fix typo
      
      * multi-gpu uva sampling case
      
      * disable new format materialization for pinned graphs
      
      * update python doc for pin_memory_
      
      * fix unit test
      
      * UVA sampling for link prediction
      
      * dispatch most csr ops
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update doc
      
      * update examples
      
      * change unitgraph and heterograph's PinMemory to in-place
      
      * update examples for multi-gpu uva sampling
      
      * update doc
      
      * fix linting
      
      * fix cpu build
      
      * fix is_pinned for DistGraph
      
      * fix is_pinned for DistGraph
      
      * update graphsage unsupervised example
      
      * update doc for gpu sampling
      
      * update some checks for sampling device switching
      
      * fix linting
      
      * adapt for new dataloader
      
      * fix linting
      
      * fix
      
      * fix some naming issues
      
      * adjust device check
      
      * add unit test for uva sampling & fix some zero_copy bugs
      
      * fix linting
      
      * update num_threads in graphsage examples
      Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
  22. 21 Jan, 2022 1 commit
    • [Feature] Pin dgl.graph to the page-locked memory (#3616) · 40b44a43
      Xin Yao authored
      
      
      * implement pin_memory/unpin_memory/is_pinned for dgl.graph
      
      * update python docstring
      
      * update c++ docstring
      
      * add test
      
      * fix the broken UnifiedTensor
      
      * eliminate extra context parameter for pin/unpin
      
      * fix linting
      
      * fix typo
      
      * disable new format materialization for pinned graphs
      
      * update python doc for pin_memory_
      
      * fix unit test
      
      * update doc
      
      * change unitgraph and heterograph's PinMemory to in-place
      
      * update comments for NDArray's PinMemory_ and PinData
      
      * update doc
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
  23. 17 Jan, 2022 2 commits
  24. 10 Jan, 2022 1 commit
  25. 07 Jan, 2022 1 commit
    • [Feature] Negative sampling (#3599) · 90f10b31
      Quan (Andy) Gan authored
      * first commit
      
      * a bunch of fixes
      
      * add unique
      
      * lint
      
      * lint
      
      * lint
      
      * address comments
      
      * Update negative_sampler.py
      
      * fix
      
      * description
      
      * address comments and fix
      
      * fix
      
      * replace unique with replace
      
      * test pylint
      
      * Update negative_sampler.py
  26. 16 Dec, 2021 1 commit
  27. 03 Dec, 2021 1 commit
    • [Feature] Add Min/max reducer in heterogeneous API for unary message functions (#3514) · cb0e1103
      Israt Nisa authored
      
      
      * min/max support for forward CPU heterograph
      
      * Added etype with each argU values
      
      * scatter_add needs fix
      
      * added scatter_add_hetero. Grads don't match for max reducer
      
      * storing ntype in argX
      
      * fixing scatter_add_hetero
      
      * hetero matches with torch's scatter add
      
      * works copy_e forward+cpu
      
      * added backward for copy_rhs
      
      * Computes gradient for all node types in one kernel
      
      * bug fix
      
      * unittest for max/min on CPU
      
      * renamed scatter_add_hetero to update_grad_minmax_hetero
      
      * lint check and comment out cuda call for max. Code is for CPU only
      
      * lint check
      
      * replace inf with zero
      
      * minor
      
      * lint check
      
      * removed LIBXSMM code from hetero code
      
      * fixing backward operator of UpdateGradMinMaxHetero
      
      * removed backward from update_grad_minmax_hetero
      
      * docstring
      
      * improved docstring and coding style
      
      * Added pass by pointer for output
      
      * typos and pass by references
      
      * Support for copy_rhs
      
      * Added header <string>
      
      * fix bug in copy_u_max
      
      * Added comments and dimension check of all etypes
      
      * skip mxnet check
      
      * pass by pointer output arrays
      
      * updated docstring
      Co-authored-by: Israt Nisa <nisisrat@amazon.com>
      Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
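Several messages above concern the backward pass of a max reducer: recording which source produced each maximum, scattering gradients back, and replacing infinities with zero. The sketch below illustrates that general idea on a single edge type with scalar features. It is a simplified assumption about the technique, not the heterogeneous kernel from this commit, and the names `copy_u_max` and `copy_u_max_backward` are used here only as hypothetical Python helpers.

```python
def copy_u_max(src_feat, edges, num_dst):
    """Max-reduce source features over incoming edges.

    edges is a list of (src, dst) pairs. Returns (out, arg), where arg[v]
    is the source node that supplied the maximum for destination v, or -1
    if v has no incoming edge.
    """
    out = [float("-inf")] * num_dst
    arg = [-1] * num_dst
    for u, v in edges:
        if src_feat[u] > out[v]:
            out[v] = src_feat[u]
            arg[v] = u
    # Destinations with no incoming edge would stay at -inf; use 0 instead.
    out = [0.0 if a == -1 else x for x, a in zip(out, arg)]
    return out, arg


def copy_u_max_backward(grad_out, arg, num_src):
    """Route each destination's gradient to the source that won the max."""
    grad_src = [0.0] * num_src
    for v, u in enumerate(arg):
        if u >= 0:
            grad_src[u] += grad_out[v]  # scatter-add along the recorded argmax
    return grad_src


if __name__ == "__main__":
    out, arg = copy_u_max([1.0, 5.0, 3.0], [(0, 0), (1, 0), (2, 1)], 3)
    print(out, arg)
    print(copy_u_max_backward([10.0, 20.0, 30.0], arg, 3))
```

In a heterogeneous graph the recorded argument would also need to identify the edge or node type it came from, which is presumably what the "etype with each argU" and "storing ntype in argX" messages refer to.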
  28. 30 Nov, 2021 1 commit
  29. 17 Nov, 2021 1 commit
  30. 06 Nov, 2021 1 commit
  31. 04 Nov, 2021 1 commit
  32. 03 Nov, 2021 1 commit
  33. 15 Oct, 2021 1 commit
  34. 07 Sep, 2021 1 commit
  35. 06 Sep, 2021 1 commit
  36. 24 Aug, 2021 1 commit
  37. 19 Aug, 2021 1 commit
  38. 18 Aug, 2021 1 commit