- 14 Sep, 2021 1 commit

Rhett Ying authored
* [Performance] improve coo2csr space complexity when row is not sorted
* [Perf] replace std::vector<> by NDArray
* keep both impls of unsorted COO-to-CSR and choose between them dynamically according to graph density
* refine criteria for choosing between the unsorted algos
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-27.us-west-2.compute.internal>
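For reference, a minimal NumPy sketch of a counting-sort style unsorted-COO-to-CSR conversion of the kind this entry describes (illustrative only; the names are made up and this is not DGL's C++ implementation):

```python
import numpy as np

def coo_to_csr_unsorted(row, col, num_rows):
    """Convert an unsorted COO pair (row, col) to CSR without sorting the edges.

    A counting pass over `row` takes O(nnz + num_rows) time and only
    O(num_rows) extra space, instead of materializing a sorted copy.
    """
    nnz = len(row)
    indptr = np.zeros(num_rows + 1, dtype=np.int64)
    np.add.at(indptr, row + 1, 1)          # count entries per row
    np.cumsum(indptr, out=indptr)          # prefix sum -> row offsets
    indices = np.empty(nnz, dtype=np.int64)
    fill = indptr[:-1].copy()              # next free slot in each row segment
    for i in range(nnz):
        r = row[i]
        indices[fill[r]] = col[i]
        fill[r] += 1
    return indptr, indices

indptr, indices = coo_to_csr_unsorted(
    np.array([2, 0, 1, 0]), np.array([1, 3, 2, 0]), num_rows=3)
print(indptr, indices)   # [0 2 3 4] [3 0 2 1]
```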
- 13 Sep, 2021 2 commits

sanchit-misra authored
* Fixes bug #3312
* Fixing lint errors
Co-authored-by: Mufei Li <mufeili1996@gmail.com>

Quan (Andy) Gan authored
- 10 Sep, 2021 1 commit

esang authored
* fix start_idx
* fix the bug when cuda > 0
Co-authored-by: Tong He <hetong007@gmail.com>
- 07 Sep, 2021 1 commit

Israt Nisa authored
* Added binary builtinMsgFunc forward() for heterograph
* Added backward for u_op_v
* Supports all binary builtin forward
* Supports binary message funcs with reduce func sum
* lint check
* removed import torch from unittest
* enabled GPU test
* lint check
* Fixed docstrings
* rename func get_hs_id
* edited comment
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
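For context on what these builtin binary message functions do at the Python level, a small hedged sketch using the public heterograph API (the toy graph, feature names, and shapes below are invented for illustration):

```python
import torch
import dgl
import dgl.function as fn

g = dgl.heterograph({
    ('user', 'follows', 'user'): ([0, 1], [1, 2]),
    ('user', 'plays', 'game'): ([0, 1, 2], [0, 0, 1]),
})
g.nodes['user'].data['h'] = torch.randn(3, 4)
g.edges['plays'].data['w'] = torch.randn(3, 4)

# Binary builtin message function (u_op_e) with a sum reduce on one relation:
# message = src feature * edge feature, aggregated onto 'game' nodes.
g.update_all(fn.u_mul_e('h', 'w', 'm'), fn.sum('m', 'agg'), etype='plays')
print(g.nodes['game'].data['agg'].shape)   # torch.Size([2, 4])
```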
- 06 Sep, 2021 1 commit

Jinjing Zhou authored
* remove
* remove
* fix
* remove
* remove
- 02 Sep, 2021 1 commit

Tomasz Patejko authored
* [CPU, Parallel] Rewriting omp pragmas with parallel_for
* [CPU, Parallel] Decrease number of calls to task function
* [CPU, Parallel] Modify calls to new interface of parallel_for
- 01 Sep, 2021 2 commits

Rhett Ying authored
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

xiang song(charlie.song) authored
[Feature] Add a HINT for the per-edge-type sampler of heterogeneous DistGraph, indicating that the etypes are already sorted. (#3260)
* pass cpp test
* distgraph use sorted edge flag.
* lint
* trigger
* update test
Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-66.ec2.internal>
- 31 Aug, 2021 1 commit

nv-dlasalle authored
* Optimize sampling
* Stop initialization of array
* Fix includes for linting
* Move comment
* Fix replace
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
- 24 Aug, 2021 1 commit

Quan (Andy) Gan authored
- 20 Aug, 2021 1 commit

nv-dlasalle authored
* Implement range based NDArrayPartition
* Finish implementing range based partition support
* Add unit test
* Fix whitespace
* Add Kernel suffix
* Fix argument passing
* Add doxygen docs and improve variable naming
* Add unit test
* Add function for converting a partition book
* Add example to partition_op docs
* Fix dtype conversion for mxnet and tensorflow
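To illustrate the idea behind a range-based NDArrayPartition, a generic PyTorch sketch with hypothetical range boundaries (not DGL's implementation):

```python
import torch

# Partition p owns the contiguous global ID range [starts[p], starts[p + 1]).
starts = torch.tensor([0, 1000, 2500, 4000])   # 3 partitions, hypothetical sizes

def map_to_partition(global_ids):
    """Map global IDs to (partition ID, local ID) under a range partition."""
    part = torch.searchsorted(starts, global_ids, right=True) - 1
    local = global_ids - starts[part]
    return part, local

part, local = map_to_partition(torch.tensor([5, 1200, 3999]))
print(part, local)   # tensor([0, 1, 2]) tensor([   5,  200, 1499])
```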
- 19 Aug, 2021 1 commit

nv-dlasalle authored
* Update filter code
* Add unit tests
* Fixes
* Switch to indices
* Rename functions
* Fix linting
* Fix whitespace
* Add doc
* Fix heterograph
* Change workspace allocation
* Fix linting
* Fix docs in filter.py
* Add todo
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
- 18 Aug, 2021 1 commit

Quan (Andy) Gan authored
- 17 Aug, 2021 1 commit

David Min authored
* Add pytorch-direct version
* remove
* add documentation for UnifiedTensor
* Revert "add documentation for UnifiedTensor". This reverts commit 63ba42644d4aba197c1cb4ea4b85fa1bc43b8849.
* alignment fix for UnifiedTensor access
* fix linting issue
Co-authored-by: shhssdm <shhssdm@gmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
- 08 Aug, 2021 1 commit

nv-dlasalle authored
* Only link tensordispatcher against pytorch
* Only modify libraries when not using MSVC
- 02 Aug, 2021 1 commit

nv-dlasalle authored
* Split out separate generators for each thread
* Amortize cost of curand_init
* Improve readability
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
- 28 Jul, 2021 1 commit

xiang song(charlie.song) authored
* fix.
* fix.
* fix.
* fix.
* Fix test
* Deprecate old DistEmbedding impl, use synchronized embedding impl
* Basic impl of heterogeneous-on-homogeneous sampling
* make pass
* Pass C++ test
* Add python test code
* lint
* lint
* Add MultiLayerEtypeNeighborSampler
* Add unit test for single machine dataloader
* Add dist dataloader test for edge type sampler
* Fix lint
* fix
* support for per etype sample
* Fix some bug and enable distributed training with per edge sample
* fix
* Now distributed training works
* turn off some mxnet
* turn off mxnet for some dist test
* fix
* upd
* upd according to the comments
* Fix
* Fix test and now distributed works.
* upd
* upd
* Fix
* Fix bug
* remove dead code.
* upd
* Fix
* upd
* Fix
Co-authored-by: Ubuntu <ubuntu@ip-172-31-71-112.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-66.ec2.internal>
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
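For orientation, the single-machine analogue of per-edge-type sampling is a fanout dict per layer; a hedged sketch follows (the toy graph, etype names, and fanouts are assumptions, and the distributed path in this commit differs in detail):

```python
import torch
import dgl

g = dgl.heterograph({
    ('user', 'follows', 'user'): ([0, 1, 2], [1, 2, 0]),
    ('user', 'plays', 'game'): ([0, 1, 2], [0, 0, 1]),
})

# One fanout dict per layer: sample up to 2 'follows' and 1 'plays' in-neighbors.
sampler = dgl.dataloading.MultiLayerNeighborSampler([
    {'follows': 2, 'plays': 1},
    {'follows': 2, 'plays': 1},
])
dataloader = dgl.dataloading.NodeDataLoader(
    g, {'game': torch.tensor([0, 1])}, sampler, batch_size=2, shuffle=False)

for input_nodes, output_nodes, blocks in dataloader:
    print([b.num_edges() for b in blocks])
```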
- 25 Jul, 2021 1 commit

Jingcheng Yu authored
Co-authored-by: JingchengYu94 <jingchengyu94@gmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
- 21 Jul, 2021 1 commit

Jinjing Zhou authored
* remove redundant fill
* trigger ci
- 20 Jul, 2021 1 commit

Quan (Andy) Gan authored
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
- 16 Jul, 2021 1 commit

David Min authored
[Feature][Performance][GPU] Introducing UnifiedTensor for efficient zero-copy host memory access from GPU (#3086)
* Add pytorch-direct version
* Initial commit of unified tensor
* Merge branch 'master' of https://github.com/davidmin7/dgl
* Remove unnecessary things
* Fix error message
* Fix/Add descriptions
* whitespace fix
* add unpin
* disable IndexSelectCPUFromGPU with no CUDA
* add a newline for unified_tensor.py
* Apply changes based on feedback
* add 'os' module
* skip unified tensor unit test for cpu only
* Update tests/pytorch/test_unified_tensor.py (co-authored by xiang song(charlie.song) <classicxsong@gmail.com>)
* reflect feedback
Co-authored-by: shhssdm <shhssdm@gmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
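A usage sketch of UnifiedTensor as I recall the 0.7-era API; the dgl.contrib module path, constructor signature, and indexing behavior are assumptions that should be verified against your DGL version:

```python
import torch
from dgl.contrib import UnifiedTensor   # module path per the 0.7-era docs (assumption)

# Large node-feature matrix kept in host memory.
feats = torch.randn(1_000_000, 128)

# Wrap it so GPU kernels can gather rows over the PCIe/NVLink bus without
# first copying the whole tensor into device memory (zero-copy access).
unified = UnifiedTensor(feats, device=torch.device('cuda:0'))

# A GPU-resident index gathers rows directly from host memory.
idx = torch.randint(0, feats.shape[0], (1024,), device='cuda:0')
gathered = unified[idx]
print(gathered.shape, gathered.device)   # torch.Size([1024, 128]) cuda:0
```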
- 13 Jul, 2021 1 commit

sanchit-misra authored
* optimizations of spmm for CPU
* Added names of contributors
* Minor code cleanup
* Moved the spmm optimization code to a new header file
* Moved to DGL's logging method
* removed duplicate code between SpMMSumCsr and SpMMCmpCsr
* Changes made to follow Google coding style
* Fixed lint errors in spmm.h
* Fixed some lint errors from spmm_blocking_libxsmm.h
* Fixed lint errors from spmm_blocking_libxsmm.h
* Added comments to SpMMCreateLibxsmmKernel
* to enable building of tests, and other cosmetic changes
* disabling libxsmm on windows
* Put a condition to avoid opt impl for FP64 as libxsmm does not have FP64 support yet
* cosmetic changes and documentation
* cosmetic changes
* to pass lint tests
* replaced multiple allocations for buffers of indices and edges with a single allocation
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
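For orientation, the kernel being optimized here is CSR SpMM with a sum reduce; below is a naive NumPy reference of that computation (not the blocked/libxsmm code in this commit):

```python
import numpy as np

def spmm_sum_csr(indptr, indices, edge_feat, node_feat):
    """Reference CSR SpMM: out[v] = sum over in-edges e=(u, v) of edge_feat[e] * node_feat[u]."""
    num_rows = len(indptr) - 1
    out = np.zeros((num_rows, node_feat.shape[1]), dtype=node_feat.dtype)
    for v in range(num_rows):
        for e in range(indptr[v], indptr[v + 1]):
            out[v] += edge_feat[e] * node_feat[indices[e]]
    return out

indptr = np.array([0, 2, 3, 4])
indices = np.array([3, 0, 2, 1])
print(spmm_sum_csr(indptr, indices,
                   np.ones(4), np.arange(8.0).reshape(4, 2)))
```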
- 08 Jul, 2021 2 commits

Jinjing Zhou authored

Quan (Andy) Gan authored
- 06 Jul, 2021 1 commit

Israt Nisa authored
[Feature] Add Heterograph support on Python for builtin unary msg functions (copy_u, copy_e) (#2989)
* heterograph for binary func
* Added SDDMM support
* Added unittest
* added binary test cases
* unary mfuncs works
* Fixed lint err
* lint check and others
* lint check
* fixed import *_hetero issue
* lint check
* replace torch with dgl backend
* lint check
* removed torch from test
* skip mxnet unittest
* skip gpu test
* Remove unused/duplicated code
* minor
* changed data structure of ndata and edata
* lint check
* reorganized
* minor lint
* minor lint
* raise error for udf func
* lint check
* fix for CUDA 10.1
* add a note for future cross-type max/min reducing
* Add support CUDA < 11
* lint check
* tidied C code
* remove dummy GSDDMM_hetero backward implementation
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
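A short usage sketch of the unary builtins (copy_u) on a heterograph via multi_update_all; the toy graph and field names are invented:

```python
import torch
import dgl
import dgl.function as fn

g = dgl.heterograph({
    ('user', 'follows', 'user'): ([0, 1], [1, 2]),
    ('user', 'plays', 'game'): ([0, 1, 2], [0, 0, 1]),
})
g.nodes['user'].data['h'] = torch.randn(3, 4)

# copy_u copies the source node feature as the message; each relation is
# sum-reduced, and results are summed across relation types per node type.
g.multi_update_all(
    {'follows': (fn.copy_u('h', 'm'), fn.sum('m', 'h_agg')),
     'plays':   (fn.copy_u('h', 'm'), fn.sum('m', 'h_agg'))},
    'sum')
print(g.nodes['game'].data['h_agg'].shape)   # torch.Size([2, 4])
```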
- 02 Jul, 2021 2 commits

Quan (Andy) Gan authored
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

nv-dlasalle authored
* Add dgl.utils.is_sorted_srcdst
* Fix linting issues
* delete blank line
* Specify datatype to index tensor in test
* Force integer conversion
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
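A plain PyTorch sketch of the kind of check a helper like dgl.utils.is_sorted_srcdst performs (not the actual implementation; the real utility's exact return convention may differ):

```python
import torch

def is_sorted_srcdst(src, dst):
    """True if edges are sorted by src and, within equal src, by dst."""
    if (src[1:] < src[:-1]).any():
        return False
    same_src = src[1:] == src[:-1]
    return not bool((same_src & (dst[1:] < dst[:-1])).any())

print(is_sorted_srcdst(torch.tensor([0, 0, 1, 2]),
                       torch.tensor([1, 3, 0, 2])))   # True
```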
- 27 Jun, 2021 1 commit

Jinjing Zhou authored
* fix
* remove nvidiasmi
* fix
* fix docs
* fix
* fix
* 1
* fix
* remove
* skip deprecated kernel
* fix
* Revert "skip deprecated kernel". This reverts commit c5ceb7f60dbbaf065b81cc3680757fd611d90ad3.
* fix
- 25 Jun, 2021 1 commit

Quan (Andy) Gan authored
* csr and csc creation
* fix
* fix
* fixes to adj transpose
* fine
* raise error if indptr did not match number of nodes
* fix
* huh?
* oh
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
- 23 Jun, 2021 3 commits

Qidong Su authored
* update
* update
* update
* update
* lint
* lint
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* lint
* update
* clone
* update
* update
* update
* update
* replace idarray with ndarray
* refactor cpp part
* refactor python part
* debug
* refactor interface
* test and doc
* lint and test
* lint
* fix
* fix
* fix
* const
* doc
* fix
* fix
* fix
* fix
* fix & doc
* fix
* fix
* update
* update
* update
* merge
* doc
* doc
* lint
* fix
* more tests
* doc
* fix
* fix
* update
* update
* update
* fix
* fix
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

nv-dlasalle authored
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

Quan (Andy) Gan authored
* add seal example
* 1. add paper information in examples/README; 2. adjust codes; 3. option test
* use latest `to_simple` to replace coalesce graph function
* remove outdated codes
* remove useless comment
* Node2vec: 1. implement node2vec random walk c++ op; 2. implement node2vec model; 3. implement node2vec example
* modify CMakeLists file
* refine c++ codes
* refine c++ codes
* add missing whitespace
* refine python codes
* add codes
* add node2vec_impl.h
* fix codes
* fix code style problem
* fixes
* remove
* lots of changes
* add benchmark
* fixes
Co-authored-by: smilexuhc <smile.xuhc@gmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
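A usage sketch of the node2vec random walk added here, assuming it landed as dgl.sampling.node2vec_random_walk (check the exact name and signature in the docs):

```python
import torch
import dgl

g = dgl.graph(([0, 1, 2, 3], [1, 2, 3, 0]))   # a directed 4-cycle

# Second-order (p, q)-biased random walks as used by node2vec.
walks = dgl.sampling.node2vec_random_walk(
    g, torch.tensor([0, 1]), 0.5, 2.0, walk_length=4)
print(walks)   # shape (2, 5): each row is a walk of walk_length + 1 node IDs
```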
- 22 Jun, 2021 1 commit

Israt Nisa authored
* Added heterograph support SpMM, SDDMM
* bug fix cuda stream
* add cudaStrm destroy and fix whitespace
* Added heterograph support SpMM, SDDMM
* bug fix cuda stream
* add cudaStrm destroy and fix whitespace
* changed max stream = 1
* Fixed ctx
* using default stream
* Added heterograph support SpMM, SDDMM
* bug fix cuda stream
* add cudaStrm destroy and fix whitespace
* changed max stream = 1
* Fixed ctx
* using default stream
* fix bug in copy_rhs
* changed by mistake
* minor datatype change
* added datatype check
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
- 21 Jun, 2021 1 commit

Mufei Li authored
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix
* Update
* Fix subgraph tests
* Capture stdout for distributed test
* Capture stdout for distributed test
* Update
* Update
* Update
* Update subgraph.cc
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-17.us-west-2.compute.internal>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
- 16 Jun, 2021 1 commit

Da Zheng authored
* add.
* fix.
* fix.
* fix.
* fix.
* add tests.
* support node split and edge split.
* support 1 partition.
* add tests.
* fix.
* fix test.
* use hierarchical partition.
* add check.
Co-authored-by: Zheng <dzzhen@3c22fba32af5.ant.amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-22-57.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-71-112.ec2.internal>
- 15 Jun, 2021 1 commit

Tianqi Zhang (张天启) authored
* add bruteforce impl
* add nn descent implementation
* change doc-string
* remove redundant func
* use local rng for cuda
* fix lint
* fix lint
* fix bug
* fix bug
* wrap nndescent_knn_graph into knn
* fix lint
* change function names
* add comment for dist funcs
* let the compiler do the unrolling
* use better blocksize setting
* remove redundant line
* check the return of the cub calls
Co-authored-by: Tong He <hetong007@gmail.com>
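A hedged sketch of building a k-NN graph with the NN-descent path added here; the algorithm keyword value is my recollection of the dgl.knn_graph API and should be verified:

```python
import torch
import dgl

x = torch.randn(1000, 16)                       # 1000 points in 16-D

# Approximate k-NN graph via NN-descent (the exact keyword value is an assumption).
g = dgl.knn_graph(x, k=8, algorithm='nn-descent')
print(g.num_nodes(), g.num_edges())             # 1000 8000
```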
- 13 Jun, 2021 1 commit

nv-dlasalle authored
[Performance] Perform to_block on the GPU when the dataloader is created with a GPU `device`. (#3016)
* add output device for dataloading
* Update dataloader
* Get sampler device from dataloader
* Fix line length
* Update examples
* Fix to_block GPU for empty relation types
* Handle the case where the DistGraph has None for the underlying graph
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
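A sketch of creating the node dataloader with a GPU device so that block construction (to_block) runs on that device, per this change; the graph and batching parameters below are arbitrary:

```python
import torch
import dgl

g = dgl.rand_graph(10_000, 100_000)
train_nids = torch.arange(1_000)

sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 10])
dataloader = dgl.dataloading.NodeDataLoader(
    g, train_nids, sampler,
    device='cuda:0',      # sampled blocks are produced on this device
    batch_size=256, shuffle=True, drop_last=False, num_workers=0)

for input_nodes, output_nodes, blocks in dataloader:
    assert blocks[0].device == torch.device('cuda:0')
```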
- 11 Jun, 2021 2 commits

Tomasz Patejko authored

nv-dlasalle authored
* Split from NCCL PR
* Fix type in comment
* Expand documentation for sparse_all_to_all_push
* Restore previous behavior in example
* Re-work optimizer to use NCCL based on gradient location
* Allow for running with embedding on CPU but using NCCL for gradient exchange
* Optimize single partition case
* Fix pylint errors
* Add missing include
* fix gradient indexing
* Fix line continuation
* Migrate 'first_step'
* Skip tests without enough GPUs to run NCCL
* Improve empty tensor handling for pytorch 1.5
* Fix indentation
* Allow multiple NCCL communicators to coexist
* Improve handling of empty message
* Update python/dgl/nn/pytorch/sparse_emb.py (co-authored by xiang song(charlie.song) <classicxsong@gmail.com>)
* Update python/dgl/nn/pytorch/sparse_emb.py (co-authored by xiang song(charlie.song) <classicxsong@gmail.com>)
* Keep empty tensor dimensionless
* th.empty -> th.tensor
* Preserve shape for empty non-zero dimension tensors
* Use shared state, when embedding is shared
* Add support for gathering an embedding
* Fix typo
* Fix more typos
* Fix backend call
* Use NodeDataLoader to take advantage of ddp
* Update training script to share memory
* Only squeeze last dimension
* Better handle empty message
* Keep embedding on the target GPU device if dgl_sparse is false in the RGCN example
* Fix typo in comment
* Add asserts
* Improve documentation in example
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
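Finally, a hedged sketch of the sparse node-embedding workflow that sparse_emb.py implements (dgl.nn.NodeEmbedding with dgl.optim.SparseAdam); module paths and call signatures here are from memory of the 0.7-era API and should be checked against the docs:

```python
import torch
import dgl

num_nodes, dim = 1000, 16

# Learnable embeddings stored outside the module parameters; gradients are
# exchanged sparsely (over NCCL when training on multiple GPUs, per this change).
emb = dgl.nn.NodeEmbedding(num_nodes, dim, name='node_emb')
optimizer = dgl.optim.SparseAdam(params=[emb], lr=0.01)

nids = torch.arange(64)
feat = emb(nids)                 # gather rows for a mini-batch
loss = feat.pow(2).mean()        # stand-in for a real training loss
loss.backward()
optimizer.step()                 # sparse update of only the touched rows
```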