Commits · 70a499e388d77292e859533dad3968525f1a6cdb · OpenDAS / dgl

16 Dec, 2021 1 commit

[Feature] Add CUDA support for `min` and `max` reducer in heterogeneous API for unary message functions (#3566) · 70a499e3

Israt Nisa authored Dec 16, 2021

* CUDA support max/min reducer on forward pass

* docstring

* made UpdateGradMinMax_hetero more concise

* reorganized UpdateGradMinMax_hetero

* CUDA kernels for max/min reducer

* variable name

* lint check

* changed CUDA 2D thread mapping to 1D

* removed legacy cusparse for min/max reducer

* git CI issue

* restarting git CI

* adding namespace std
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

70a499e3

15 Dec, 2021 2 commits

[PinSAGESampler] support PinSAGE sampler on GPU (#3567) · dd762a1e

lixiaobai authored Dec 15, 2021



* Feat: support API "randomwalk_topk" in library

* Feat: use the new API "randomwalk_topk" for PinSAGESampler

* Minor

* Minor

* Refactor: modified codes as checker required

* Minor

* Minor

* Minor

* Minor

* Fix: checking errors in RandomWalkTopk

* Refactor: modified the docstring for randomwalk_topk

* change randomwalk_topk to internal

* fix

* rename

* Minor for pinsage.py

* Feat: support randomwalk and SelectPinSageNeighbors on GPU

Port RandomWalk algorithm on GPU,
and port SelectPinSageNeighbors on GPU.

* Feat: support GPU on python APIs

* Feat: remove perf print information in FrequenchHashmap

* Fix: modified the code format

Modified the code format as task_lint.sh suggested

* Feat: let test script support PinSAGESampler on GPU

Let test script support PinSAGESampler on GPU,
minor of "restart_prob".

* Minor

* Minor

* Minor

* Refactor: use the atomic operations from the array module

* Minor: change the long lines

* Refactor: modified the get_node_types for gpu

* Feat: update the contributor date

* Perf: remove unnecessary stream sync

* Feat: support other random walk

But the non-uniform choice is still not supported.

* Fix: add CUDA switch for random walk
Co-authored-by: Quan Gan <coin2028@hotmail.com>

dd762a1e

[DistGNN, Graph partitioning] Libra partition (#3376) · 78e0dae6

Vasimuddin Md authored Dec 15, 2021



* added distgnn plus libra codebase

* Dist application codes

* added comments in partition code. changed the interface of partitioning call.

* updated readme

* create libra partitioning branch for the PR

* removed distgnn files for first PR

* updated kernel.cc

* added libra_partition.cc and moved libra code from kernel.cc to libra_partition.cc

* fixed lint error; merged libra2dgl.py and main_Libra.py to libra_partition.py; added graphsage/distgnn folder and partition script.

* removed libra2dgl.py

* fixed the lint error and cleaned the code.

* revisions due to PR comments. added distgnn/tools, which contains partition routines

* update 2 PR revision I

* fixed errors; also improved the runtime by 10x.

* fixed minor lint error

* fixed some more lints

* PR revision II changed the interface of libra partition function

* rewrite docstring
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

78e0dae6

08 Dec, 2021 1 commit

[Bugfix] Fix SetDevice issue for NeighborMatching (#3341) · d798280f

Tianqi Zhang (张天启) authored Dec 08, 2021

* fix setdevice issue

* change to curand device API
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

d798280f

06 Dec, 2021 2 commits

[RPC] Use tensorpipe for rpc communication (#3335) · a3ce780d

Jinjing Zhou authored Dec 06, 2021

* not sure whether this works

* add change

* fix

* fix

* fix

* remove

* revert

* lint

* lint

* fix

* revert

* lint

* fix

* only build rpc on linux

* lint

* lint

* fix build on windows

* fix windows

* remove old test

* fix cmake

* Revert "remove old test"

This reverts commit f1ea75c777c34cdc1f08c0589676ba6aee1feb29.

* fix windows

* fix

* fix

* fix indent

* fix indent

* address comment

* fix

* fix

* fix

* fix

* fix

* lint

* fix indent

* fix lint

* add introduction

* fix

* lint

* lint

* add more logs

* fix

* update xbyak for C++14 with gcc5

* Remove channels

* fix

* add test script

* fix

* remove unused file

* fix lint

* add timeout

a3ce780d

[Distributed] Edge-type-specific fanouts for heterogeneous graphs (#3558) · eb08ef38

Quan (Andy) Gan authored Dec 06, 2021

* first commit

* second commit

* spaghetti unit tests

* rewrite test

eb08ef38

03 Dec, 2021 1 commit

[Feature] Add Min/max reducer in heterogeneous API for unary message functions (#3514) · cb0e1103

Israt Nisa authored Dec 03, 2021



* min/max support for forward CPU heterograph

* Added etype with each argU values

* scatter_add needs fix

* added scatter_add_hetero. Grads don't match for max reducer

* storing ntype in argX

* fixing scatter_add_hetero

* hetero matches with torch's scatter add

* works copy_e forward+cpu

* added backward for copy_rhs

* Computes gradient for all node types in one kernel

* bug fix

* unit test for max/min on CPU

* renamed scatter_add_hetero to update_grad_minmax_hetero

* lint check and comment out cuda call for max. Code is for CPU only

* lint check

* replace inf with zero

* minor

* lint check

* removed LIBXSMM code from hetero code

* fixing backward operator of UpdateGradMinMaxHetero

* removed backward from update_grad_minmax_hetero

* docstring

* improved docstring and coding style

* Added pass by pointer for output

* typos and pass by references

* Support for copy_rhs

* Added header <string>

* fix bug in copy_u_max

* Added comments and dimension check of all etypes

* skip mxnet check

* pass by pointer output arrays

* updated docstring
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

cb0e1103
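
The bullets above mention storing argument indices (`argU`), a scatter-add style gradient update, and replacing `inf` with zero. A minimal pure-Python sketch of that general idea for a `copy_u` message with a `max` reducer on a single edge type follows. The function names, the edge-list layout, and the scalar features are assumptions made for illustration; this is not the DGL kernel or its API.

```python
def copy_u_max(edges, feat, num_dst):
    """Forward pass: each destination takes the max over its source features.

    edges is a list of (src, dst) pairs; feat holds one scalar per source node.
    Returns the reduced values and, per destination, the winning source (arg_u).
    """
    out = [float("-inf")] * num_dst
    arg_u = [-1] * num_dst  # -1 marks "no incoming edge"
    for src, dst in edges:
        if feat[src] > out[dst]:
            out[dst] = feat[src]
            arg_u[dst] = src
    # Destinations without incoming edges would stay at -inf; use zero instead.
    out = [0.0 if a < 0 else v for v, a in zip(out, arg_u)]
    return out, arg_u


def copy_u_max_backward(grad_out, arg_u, num_src):
    """Backward pass: route each output gradient to the source that won the max."""
    grad_feat = [0.0] * num_src
    for dst, src in enumerate(arg_u):
        if src >= 0:
            grad_feat[src] += grad_out[dst]  # scatter-add
    return grad_feat


edges = [(0, 0), (1, 0), (2, 1)]
feat = [1.0, 3.0, 2.0]
out, arg_u = copy_u_max(edges, feat, num_dst=3)
grad_feat = copy_u_max_backward([1.0, 1.0, 1.0], arg_u, num_src=3)
```

In a heterogeneous graph the recorded argument would also need to identify the node or edge type it came from, which is what the "storing ntype in argX" and "Added etype with each argU values" bullets appear to address.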

30 Nov, 2021 1 commit

[Performance][GPU] Improve csr2coo.cu:_RepeatKernel() for more robust GPU usage (#3537) · 66a54555

ayasar70 authored Nov 30, 2021



* Based on issue #3436. Improving _SegmentCopyKernel's GPU utilization by switching to nonzero-based thread assignment

* fixing lint issues

* Update cub for cuda 11.5 compatibility (#3468)

* fixing type mismatch

* tx guaranteed to be smaller than nnz. Hence removing last check

* minor: updating comment

* adding three unit tests for csr slice method to cover some corner cases

* working on repeat

* updating repeat kernel

* removing unnecessary parameter

* cleaning commented line

* cleaning time measures

* cleaning time measurement lines
Co-authored-by: Abdurrahman Yasar <ayasar@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

66a54555

29 Nov, 2021 1 commit

[PinSAGE sampler] Adjust the APIs for PinSAGESampler (#3529) · 44f0b5fe

lixiaobai authored Nov 30, 2021



* Feat: support API "randomwalk_topk" in library

* Feat: use the new API "randomwalk_topk" for PinSAGESampler

* Minor

* Minor

* Refactor: modified codes as checker required

* Minor

* Minor

* Minor

* Minor

* Fix: checking errors in RandomWalkTopk

* Refactor: modified the docstring for randomwalk_topk

* change randomwalk_topk to internal

* fix

* rename

* Minor for pinsage.py
Co-authored-by: Quan Gan <coin2028@hotmail.com>

44f0b5fe

17 Nov, 2021 1 commit

[Feature] Added heterograph support to SDDMM_COO and clean up SpMM and SDDMM hetero kernels (#3449) · 2150fcaf

Israt Nisa authored Nov 17, 2021



* Added SDDMMCOO_hetero support

* removed redundant CUDA kernels

* added benchmark for regression test

* fix

* fixed bug for single src node type
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

2150fcaf

15 Nov, 2021 1 commit

[Randomwalk] Fix off-by-one bug in GenericRandomWalk() (#3500) · 2e8b56a3

Eric Kim authored Nov 15, 2021

2e8b56a3

10 Nov, 2021 1 commit

[BugFix] fix in_degree/out_degree computation logic (#3477) · ea8b93f9

Rhett Ying authored Nov 10, 2021

* [BugFix] fix in/out degree computation

* add unit tests

ea8b93f9

06 Nov, 2021 1 commit

[Performance][GPU] Improve _SegmentCopyKernel() (#3470) · 96cd2ee6

ayasar70 authored Nov 06, 2021



* Based on issue #3436. Improving _SegmentCopyKernel's GPU utilization by switching to nonzero-based thread assignment

* fixing lint issues

* Update cub for cuda 11.5 compatibility (#3468)

* fixing type mismatch

* tx guaranteed to be smaller than nnz. Hence removing last check

* minor: updating comment

* adding three unit tests for csr slice method to cover some corner cases
Co-authored-by: Abdurrahman Yasar <ayasar@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

96cd2ee6
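
"Nonzero-based thread assignment" can be read as giving each output element its own thread, instead of giving each row a thread that loops over the row. The sketch below imitates that on the CPU, with one loop iteration standing in for one GPU thread. The upper-bound search used to recover the row is one common way to do this and is an assumption here, as are the function name and argument layout; the actual kernel may differ.

```python
from bisect import bisect_right


def segment_copy(indptr, data, rows):
    """Copy the CSR segments of the selected rows into one contiguous output."""
    # Output offsets: prefix sum over the lengths of the selected rows.
    out_indptr = [0]
    for r in rows:
        out_indptr.append(out_indptr[-1] + indptr[r + 1] - indptr[r])
    nnz = out_indptr[-1]
    out = [None] * nnz

    # One iteration per output nonzero; tx is always smaller than nnz.
    for tx in range(nnz):
        # Find the output segment that contains position tx.
        seg = bisect_right(out_indptr, tx) - 1
        offset = tx - out_indptr[seg]
        out[tx] = data[indptr[rows[seg]] + offset]
    return out_indptr, out


indptr = [0, 2, 2, 5]                # row 1 is empty
data = ["a", "b", "c", "d", "e"]
out_indptr, out = segment_copy(indptr, data, rows=[2, 1, 0])
```

With this mapping the work per thread no longer depends on the row length, so a few very long rows do not leave most threads idle.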

04 Nov, 2021 2 commits

[BugFix] Fix bugs in GPU sampling and enable unit tests for dataloaders on the GPU (#3474) · b717c8bf

Xin Yao authored Nov 05, 2021



* enable unit tests for dataloader on the GPU

* fix compatibility

* copyright

* fix linting
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>

b717c8bf

[Feature] aten::Relabel_() for the GPU (#3445) · d3ae7544

Xin Yao authored Nov 04, 2021



* relabel gpu

* unittest for Relabel_ on the GPU

* finish Relabel_ for the GPU

* copyright

* re-enable the unittest for edge_subgraph on the GPU

* fix unittest for tensorflow

* use a fixed number of threads
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

d3ae7544

03 Nov, 2021 1 commit

Update cub for cuda 11.5 compatibility (#3468) · f5102145

nv-dlasalle authored Nov 02, 2021

f5102145

21 Oct, 2021 1 commit

[Sampling] Implement dgl.compact_graphs() for the GPU (#3423) · a8c81018

Xin Yao authored Oct 21, 2021

* gpu compact graph template

* cuda compact graph draft

* fix typo

* compact graphs

* pass unit test but fail in training

* example using EdgeDataLoader on the GPU

* refactor cuda_compact_graph and cuda_to_block

* update training scripts

* fix linting

* fix linting

* fix exclude_edges for the GPU

* add --data-cpu & fix copyright

a8c81018

18 Oct, 2021 2 commits

[Fix] Split nccl sparse push into two groups (#3404) · c560040f

nv-dlasalle authored Oct 18, 2021

c560040f

[Performance] Parallelize CSRSliceRows() (#3409) · aa11aaa4

David Min authored Oct 18, 2021



* parallelize CSRRowSlice()

* use parallel_for for the second loop
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

aa11aaa4

15 Oct, 2021 1 commit

[Bugfix] Add UVM specialized IndexSelect kernels which perform boundary checks (#3293) · 4f5c3aa2

David Min authored Oct 15, 2021



* Add pytorch-direct version

* remove

* add documentation for UnifiedTensor

* Revert "add documentation for UnifiedTensor"

This reverts commit 63ba42644d4aba197c1cb4ea4b85fa1bc43b8849.

* add boundary check for UVM IndexSelect

* relocate boundary check index kernels to cuda

* fix function name

* fix indexkernel in nccl api

* fix argument ordering

* simplify code

* Add a comment for the uvm version
Co-authored-by: shhssdm <shhssdm@gmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

4f5c3aa2

14 Oct, 2021 1 commit

[Bugfix] three bugs related to using DGL as a subdirectory(third_party) of another project. (#3379) · 18863069

zexi yuan authored Oct 14, 2021

* [Bugfix] fix a compile error for Debug-BuildType on Windows Platform

When building the "Debug" build type on Windows with CMakeLists.txt, three compile errors (C4716) occur in the file "dgl\src\runtime\shared_mem.cc":

'dgl::runtime::SharedMemory::CreateNew': must return a value
'dgl::runtime::SharedMemory::Open': must return a value
'dgl::runtime::SharedMemory::Exist': must return a value

* [Bugfix] cmake error "cannot find load file" when DGL is a sub_directory on Linux

When DGL is used as a subdirectory in a CMake project, "CMAKE_SOURCE_DIR" returns the parent CMake scope directory, which is not the expected directory.
It may be better to use "CMAKE_CURRENT_SOURCE_DIR" to set "GKLIB_PATH".

* [Bugfix] cmd cmake error when DGL is a subdirectory

When DGL is a subdirectory of another project, the WORKING_DIRECTORY of "add_custom_command" at line 255 of "CMakeLists.txt" is incorrect, which causes a cmake "setlocal" error.

18863069

12 Oct, 2021 1 commit

[Bug] check dtype before convert to gk (#3414) · 2d88db5a

Rhett Ying authored Oct 12, 2021

2d88db5a

29 Sep, 2021 1 commit

[Feature] enable create/set/free cuda stream for internal use (#3334) · e234fcfa

Rhett Ying authored Sep 29, 2021

* [Feature] enable create/set/free cuda stream for internal use

* add unit test

* fix unit test failure on mxnet and tf

* refactor stream wrapper

* fix lint error

* fix lint error

e234fcfa

28 Sep, 2021 1 commit

[Feature] Implement one thread multiple socket (#3200) · 5cf48fc6

Jingcheng Yu authored Sep 28, 2021

Co-authored-by: JingchengYu94 <jingchengyu94@gmail.com>

5cf48fc6

22 Sep, 2021 1 commit

[Feature] Graceful handling of exceptions thrown within OpenMP blocks (#3353) · a04a8d06

Quan (Andy) Gan authored Sep 22, 2021

* graceful c++ exception in OpenMP

* credits

* add test
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

a04a8d06

21 Sep, 2021 1 commit

[Feature] Exclude edges in sample_neighbors (#2971) · bc14829f

mszarma authored Sep 21, 2021



* [Feature] Exclude edges in sample_neighbors

Extending sample_neighbors and sample_frontier
API to support exclude_edges parameter.

exclude_edges supports tensor and dict data.
The feature enables excluding certain edges
during neighborhood sampling.
exclude_edges contains the EIDs of the edges
that will be excluded
during neighbor picking for seed nodes.

Added test cases for heterographs and homographs.
RFC issue id: 2944

* compatibility

* fix

* fix
Co-authored-by: Quan Gan <coin2028@hotmail.com>

bc14829f
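
The behaviour described in the commit message above, where excluded edge IDs are never picked while sampling neighbors for seed nodes, can be illustrated with a small stand-alone sketch. The function, its parameters, and the adjacency layout below are hypothetical and chosen only for illustration; this is not the DGL `sample_neighbors` implementation or signature.

```python
import random


def sample_in_edges(in_edges, seeds, fanout, exclude_eids=(), rng=None):
    """Pick up to `fanout` incoming edges per seed, skipping excluded edge IDs.

    in_edges maps a destination node to a list of (src, eid) pairs.
    """
    rng = rng or random.Random(0)
    excluded = set(exclude_eids)
    picked = {}
    for seed in seeds:
        # Filter before sampling so an excluded edge can never be drawn.
        candidates = [
            (src, eid) for src, eid in in_edges.get(seed, []) if eid not in excluded
        ]
        picked[seed] = rng.sample(candidates, min(fanout, len(candidates)))
    return picked


in_edges = {0: [(1, 10), (2, 11), (3, 12)], 1: [(0, 13)]}
result = sample_in_edges(in_edges, seeds=[0, 1], fanout=2, exclude_eids=[11, 13])
```

Filtering before the draw, rather than sampling first and dropping excluded edges afterwards, keeps the number of returned neighbors as close to the fanout as the remaining edges allow.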

17 Sep, 2021 1 commit

[BugFix] initialize data if null when converting from row sorted coo to csr (#3360) · bacc9047

Rhett Ying authored Sep 17, 2021

bacc9047

16 Sep, 2021 1 commit

[Performance][Feature] Add `src_nodes` parameter to `to_block()` to avoid the cost of running unique() when available. (#2973) · 2647afc9

nv-dlasalle authored Sep 15, 2021

* Add lhs_nodes as parameter to to_block

* Update unit test

* Switch to simplified node conversion

* Switch lhs_nodes to be in/out parameter

* Update docs
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

2647afc9

14 Sep, 2021 1 commit

[Performance] improve coo2csr space complexity when row is not sorted (#3326) · f4c79f7f

Rhett Ying authored Sep 14, 2021



* [Performance] improve coo2csr space complexity when row is not sorted

* [Perf] replace std::vector<> by NDArray

* keep both impl of unsorted coo to csr and choose according to graph density dynamically

* refine criteria to choose btw Unsorted algos
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-27.us-west-2.compute.internal>

f4c79f7f
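
For readers unfamiliar with the conversion this commit optimizes, here is a generic counting-sort style COO to CSR conversion for unsorted rows. It is a textbook sketch using plain Python lists. It is not either of the two implementations the commit keeps, and the density-based selection between them is not modelled here.

```python
def coo_to_csr(num_rows, row, col):
    """Convert unsorted COO (row, col) arrays to CSR (indptr, indices, eids)."""
    # Count the entries per row, then prefix-sum the counts into row offsets.
    indptr = [0] * (num_rows + 1)
    for r in row:
        indptr[r + 1] += 1
    for i in range(num_rows):
        indptr[i + 1] += indptr[i]

    # fill[r] is the next free slot inside row r's segment.
    fill = list(indptr[:-1])
    indices = [0] * len(col)
    eids = [0] * len(col)  # original COO position of each CSR entry
    for e, (r, c) in enumerate(zip(row, col)):
        pos = fill[r]
        indices[pos] = c
        eids[pos] = e
        fill[r] += 1
    return indptr, indices, eids


indptr, indices, eids = coo_to_csr(3, row=[2, 0, 2, 1], col=[0, 1, 1, 2])
```

Besides the output arrays, this variant needs only one extra array of length `num_rows`, which is the kind of space trade-off the commit title refers to.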

13 Sep, 2021 2 commits

Fixes bug #3312 (#3345) · 983a4fdd

sanchit-misra authored Sep 13, 2021

* Fixes bug #3312

* Fixing lint errors
Co-authored-by: Mufei Li <mufeili1996@gmail.com>

983a4fdd

Fix openmp header (#3325) · e7ea0f53

Quan (Andy) Gan authored Sep 13, 2021

e7ea0f53

10 Sep, 2021 1 commit

[Bugfix] Fix bugs of `farthest_point_sampler` (#3327) · 6454c791

esang authored Sep 10, 2021

* fix start_idx

* fix the bug when cuda > 0
Co-authored-by: Tong He <hetong007@gmail.com>

6454c791

07 Sep, 2021 1 commit

[Feature] Support builtin binary message function for heterogeneous graph (#3273) · 298e4fa6

Israt Nisa authored Sep 07, 2021



* Added binary builtinMsgFunc forward() for heterograph

* Added backward for u_op_v

* Supports all binary builtin forward

* Supports binary message funcs with reduce func sum

* lint check

* removed import torch from unittest

* enabled GPU test

* lint check

* Fixed docstrings

* rename func get_hs_id

* edited comment
Co-authored-by: Israt Nisa <nisisrat@amazon.com>

298e4fa6

06 Sep, 2021 1 commit

Remove deprecated kernels (#3316) · c81efdf2

Jinjing Zhou authored Sep 06, 2021

* remove

* remove

* fix

* remove

* remove

c81efdf2

02 Sep, 2021 1 commit

[Performance, CPU] Rewriting OpenMP pragmas into parallel_for (#3171) · f5183820

Tomasz Patejko authored Sep 02, 2021

* [CPU, Parallel] Rewriting omp pragmas with parallel_for

* [CPU, Parallel] Decrease number of calls to task function

* [CPU, Parallel] Modify calls to new interface of parallel_for

f5183820

01 Sep, 2021 2 commits

[Feature] enable to specify stream in UnitGraph::CopyTo() which could lead to async copy (#3297) · 5a245104

Rhett Ying authored Sep 01, 2021

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

5a245104

[Feature] Add a HINT for the per edge type sampler of heterogeneous DistGraph indicating that the etypes are already sorted. (#3260) · f4fe518f

xiang song(charlie.song) authored Sep 01, 2021

* pass cpp test

* distgraph use sorted edge flag.

* lint

* trigger

* update test
Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-66.ec2.internal>

f4fe518f

31 Aug, 2021 1 commit

[CPU][Sampling][Performance] Improve sampling on the CPU. (#3274) · 8e525dad

nv-dlasalle authored Aug 31, 2021



* Optimize sampling

* Stop initialization of array

* Fix includes for linting

* Move comment

* Fix replace
Co-authored-by: Da Zheng <zhengda1936@gmail.com>

8e525dad

24 Aug, 2021 1 commit

fix (#3286) · 85b8fe52

Quan (Andy) Gan authored Aug 24, 2021

85b8fe52

20 Aug, 2021 1 commit

[Feature][DistDGL] Add NCCL support for range based partitions (#3213) · 7f927939

nv-dlasalle authored Aug 19, 2021

* Implement range based NDArrayPartition

* Finish implementing range-based partition support

* Add unit test

* Fix whitespace

* Add Kernel suffix

* Fix argument passing

* Add doxygen docs and improve variable naming

* Add unit test

* Add function for converting a partition book

* Add example to partition_op docs

* Fix dtype conversion for mxnet and tensorflow

7f927939