Commits · 7415eaa5faf665d1d57e8c00d1648a2d4fc4531e · OpenDAS / dgl

23 Jun, 2021 2 commits

[Bugfix] Handle case where process has no elements to update, in NCCL communicator (#3035) · 7415eaa5
nv-dlasalle authored Jun 23, 2021

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

7415eaa5

Quan (Andy) Gan authored Jun 23, 2021



* add seal example

* 1. add paper information in examples/README
2. adjust codes
3. option test

* use latest `to_simple` to replace coalesce graph function

* remove outdated codes

* remove useless comment

* Node2vec
1. implement node2vec random walk c++ op
2. implement node2vec model
3. implement node2vec example

* add CMakeLists file modify

* refine c++ codes

* refine c++ codes

* add missing whitespace

* refine python codes

* add codes

* add node2vec_impl.h

* fix codes

* fix code style problem

* fixes

* remove

* lots of changes

* add benchmark

* fixes
Co-authored-by: smilexuhc <smile.xuhc@gmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

e667545d

22 Jun, 2021 1 commit

[Kernel] Add heterograph support in CUDA kernels (SpMM, SDDMM) (#2925) · 1113f674

Israt Nisa authored Jun 21, 2021



* Added heterograph support SpMM, SDDMM

* bug fix cuda stream

* add cudaStrm destroy and fix whitespace

* Added heterograph support SpMM, SDDMM

* bug fix cuda stream

* add cudaStrm destroy and fix whitespace

* changed max stream = 1

* Fixed ctx

* using default stream

* Added heterograph support SpMM, SDDMM

* bug fix cuda stream

* add cudaStrm destroy and fix whitespace

* changed max stream = 1

* Fixed ctx

* using default stream

* fix bug in copy_rhs

* changed by mistake

* minor datatype change

* added datatype check
Co-authored-by: Israt Nisa <nisisrat@amazon.com>

1113f674

21 Jun, 2021 1 commit

[API] Standardize Subgraph APIs (#2929) · ff519f98

Mufei Li authored Jun 21, 2021



* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Fix

* Update

* Fix subgraph tests

* Capture stdout for distributed test

* Capture stdout for distributed test

* Update

* Update

* Update

* Update subgraph.cc
Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-17.us-west-2.compute.internal>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

ff519f98

16 Jun, 2021 1 commit

[Distributed] Support hierarchical partitioning (#3000) · aaec3d8a

Da Zheng authored Jun 16, 2021



* add.

* fix.

* fix.

* fix.

* fix.

* add tests.

* support node split and edge split.

* support 1 partition.

* add tests.

* fix.

* fix test.

* use hierarchical partition.

* add check.
Co-authored-by: Zheng <dzzhen@3c22fba32af5.ant.amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-22-57.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-71-112.ec2.internal>

aaec3d8a

15 Jun, 2021 1 commit

[Feature] Add NN-descent support for the KNN graph function in dgl (#2941) · 64d0f3f3

Tianqi Zhang (张天启) authored Jun 15, 2021



* add bruteforce impl

* add nn descent implementation

* change doc-string

* remove redundant func

* use local rng for cuda

* fix lint

* fix lint

* fix bug

* fix bug

* wrap nndescent_knn_graph into knn

* fix lint

* change function names

* add comment for dist funcs

* let the compiler do the unrolling

* use better blocksize setting

* remove redundant line

* check the return of the cub calls
Co-authored-by: Tong He <hetong007@gmail.com>

64d0f3f3

13 Jun, 2021 1 commit

[Performance] Perform to_block on the GPU when the dataloader is created with... · 8b64ae59

nv-dlasalle authored Jun 13, 2021


[Performance] Perform to_block on the GPU when the dataloader is created with a GPU `device`. (#3016)

* add output device for dataloading

* Update dataloader

* Get sampler device from dataloader

* Fix line length

* Update examples

* Fix to_block GPU for empty relation types

* Handle the case where the DistGraph has None for the underlying graph
Co-authored-by: Da Zheng <zhengda1936@gmail.com>

8b64ae59

11 Jun, 2021 2 commits

[CPU, Parallel] parallel_for with default grain size (#3004) · 411bef54
Tomasz Patejko authored Jun 11, 2021

411bef54

[Feature] Allow using NCCL for communication in dgl.NodeEmbedding and dgl.SparseOptimizer (#2824) · 17d604b5

nv-dlasalle authored Jun 10, 2021



* Split from NCCL PR

* Fix type in comment

* Expand documentation for sparse_all_to_all_push

* Restore previous behavior in example

* Re-work optimizer to use NCCL based on gradient location

* Allow for running with embedding on CPU but using NCCL for gradient exchange

* Optimize single partition case

* Fix pylint errors

* Add missing include

* fix gradient indexing

* Fix line continuation

* Migrate 'first_step'

* Skip tests without enough GPUs to run NCCL

* Improve empty tensor handling for pytorch 1.5

* Fix indentation

* Allow multiple NCCL communicators to coexist

* Improve handling of empty message

* Update python/dgl/nn/pytorch/sparse_emb.py
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

* Update python/dgl/nn/pytorch/sparse_emb.py
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

* Keep empty tensor dimensionless

* th.empty -> th.tensor

* Preserve shape for empty non-zero dimension tensors

* Use shared state, when embedding is shared

* Add support for gathering an embedding

* Fix typo

* Fix more typos

* Fix backend call

* Use NodeDataLoader to take advantage of ddp

* Update training script to share memory

* Only squeeze last dimension

* Better handle empty message

* Keep embedding on the target device GPU if dgl_sparse is false in RGCN example

* Fix typo in comment

* Add asserts

* Improve documentation in example
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

17d604b5

10 Jun, 2021 2 commits

[Kernel] Slicing Batched Graphs (#2965) · 5be937a7

Mufei Li authored Jun 10, 2021



* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Update

* Update

* Add files via upload

* Add files via upload

* Update

* Lint

* Add files via upload

* Lint

* Update

* Update

* Update

* Update

* Update

* Lint Fix

* Lint
Co-authored-by: Ubuntu <ubuntu@ip-172-31-12-161.us-west-2.compute.internal>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

5be937a7

[Performance][Kernel] Eliminating ctor&dtor and IsNullArray overhead in random walks (#2990) · ba154924

Ajay Brahmakshatriya authored Jun 09, 2021



* Added a special implementation for MetapathBasedRandomWalkStep for the Uniform randomwalk case

* Fixed all linting issues

* add comment
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: Da Zheng <zhengda1936@gmail.com>

ba154924

03 Jun, 2021 1 commit

Add heterograph support in C kernels (#2882) · 75ec5826

Israt Nisa authored Jun 03, 2021



* SpMM for heterograph

* C APIs SDDMM heterograph

* passes initial result

* renamed eid with nid

* aggregation on same ntype for multiple etypes

* fix lint check failure

* lint check part 2

* lint check part 3

* Fixed SpMMCmpCsr Min op

* added mem references

* fixed fill(Max/Min), added const

* removed newline

* brought back docstring
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Da Zheng <zhengda1936@gmail.com>

75ec5826

01 Jun, 2021 1 commit

[Feature][Sampler] Sort CSR by tag (#1664) · b8fe2b48

Qidong Su authored Jun 01, 2021



* update

* update

* update

* update

* lint

* lint

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* lint

* update

* clone

* update

* update

* update

* update

* replace idarray with ndarray

* refactor cpp part

* refactor python part

* debug

* refactor interface

* test and doc

* lint and test

* lint

* fix

* fix

* fix

* const

* doc

* fix

* fix

* fix

* fix

* fix & doc

* fix

* fix

* fix

* fix

* fix

* fix

* update
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

b8fe2b48

28 May, 2021 1 commit

Speed up random number generation. (#2953) · 9a0e13ac

Da Zheng authored May 28, 2021



* speed up random number generation.

* fix lint.

* Fix

* fix.
Co-authored-by: Zheng <dzzhen@3c22fba32af5.ant.amazon.com>

9a0e13ac

25 May, 2021 1 commit

Move pointer dereferencing in CDFSampler::draw() and AliasSampler::draw() to... · 1db4ad4f

nv-dlasalle authored May 24, 2021


Move pointer dereferencing in CDFSampler::draw() and AliasSampler::draw() to inside of conditional (#2943)
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

1db4ad4f

20 May, 2021 1 commit

[Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings... · ae8dbe6d

nv-dlasalle authored May 20, 2021


[Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825)

* Split NCCL wrapper from sparse optimizer and sparse embedding

* Add more unit tests for single node nccl

* Fix unit test for tf

* Switch to device histogram

* Fix histogram issues

* Finish migration to histogram

* Handle cases with zero send/receive data

* Start on partition object

* Get compiling

* Updates

* Add unit tests

* Switch to partition object

* Fix linting issues

* Rename partition file

* Add python doc

* Fix python assert and finish doxygen comments

* Remove stubs for range based partition to satisfy pylint

* Wrap unit test in GPU only

* Wrap explicit cuda call in ifdef

* Merge with partition.py

* update docstrings

* Cleanup partition_op

* Add Workspace object

* Switch to using workspace object

* Move last remainder based function out of nccl_api

* Add error messages

* Update docs with examples

* Fix linting errors
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

ae8dbe6d

19 May, 2021 1 commit

[Feature] Add bruteforce implementation for KNN with O(Nk) space complexity (#2892) · 5d7e80f4

Tianqi Zhang (张天启) authored May 19, 2021



* add bruteforce impl

* add support for bruteforce-sharemem

* modify python API

* add tests

* change file path

* change python API

* fix lint

* fix test

* also check worst_dist in the last few dim

* use heap and early-stop on CPU

* fix lint

* fix lint

* add device check

* use cuda function to determine max shared mem

* use cuda to determine block info

* add memory free for tmp var

* update doc-string and add dist option

* fix lint

* add more tests
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

5d7e80f4

18 May, 2021 1 commit

[Distributed] add distributed in-degree and out-degree. (#2918) · 6e7f19f2

Da Zheng authored May 18, 2021



* add distributed in-degree and out-degree.

* update comments.

* fix a bug.

* add tests.

* add tests.

* fix a bug.

* fix docstring.

* update doc.

* fix

* fix.
Co-authored-by: Zheng <dzzhen@3c22fba32af5.ant.amazon.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

6e7f19f2

17 May, 2021 1 commit

[Feature] Python interface for adjacency matrix summation and multiplication (#2893) · 657c220d

Quan (Andy) Gan authored May 17, 2021

* test commit

* fixes

* oops

* add docs

* lint

* why does it say I have a trailing whitespace

* oh ok

* fixes

* why there's an invalid argument error

* address comments

* fix

* address comments

657c220d

28 Apr, 2021 1 commit

Fix cu11 compile (#2879) · 703d4b93

xiang song(charlie.song) authored Apr 28, 2021


Co-authored-by: Ubuntu <ubuntu@ip-172-31-1-191.ec2.internal>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

703d4b93

27 Apr, 2021 1 commit

[Feature] Add cuda support for Sparse Matrix multiplication, summation and masking (#2782) · ab2bd1f1

Israt Nisa authored Apr 27, 2021



* init cuda support

* cuSPARSE err

* passed unittest for csr_mm/SpGEMM. int64 not supported

* Debugging cuSPARSE error 3

* csrgeam only supports int32?

* disabling int64 for cuda

* refactor and add CSRMask

* lint

* oops

* remove todo

* rewrite CSRMask with CSRGetData

* lint

* fix test

* address comments

* lint

* fix

* addresses comments and rename BUG_ON
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-30-71.ec2.internal>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

ab2bd1f1

25 Apr, 2021 1 commit

[Serialization] Better error message when idx_list out of bound (#2848) · 66182b28

Jinjing Zhou authored Apr 25, 2021

Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

66182b28

22 Apr, 2021 1 commit

[Sampler] BiasedChoice sampler (#1665) · 6b022d2f

Qidong Su authored Apr 22, 2021



* update

* update

* update

* update

* update

* update

* update

* fix

* fix

* update

* doc

* doc

* fix

* fix
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

6b022d2f

16 Apr, 2021 1 commit

[Performance] Track sorted status of COO from creation (#2645) · bbebde46

nv-dlasalle authored Apr 15, 2021



* Add row/col sorted flags

* improve sorting paths

* Remove print statement

* Keep track of sorted matrices

* Remove sort check in to_block

* Improve CPU sorted COO->CSR

* Handle the zero edge case

* Remove omp default clause to work with MSVC

* Update comments on sorted COO->CSR cpu implementation

* Expose sorted to python interface

* Make check_sorted default to false for dgl.graph()

* remove check sorted; add utests

* remove check_sorted flag
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

bbebde46

15 Apr, 2021 1 commit

[Performance][GPU] Enable GPU uniform edge sampling (#2716) · e70138bb

nv-dlasalle authored Apr 14, 2021



* Start on uniform GPU sampling

* Save more work

* Get cu file compiling

* Update sampling

* More changes

* Get GPU sampling for uniform probabilities solved

* Fix batch tensor migration

* Fix

* update kernels

* expand blocking

* Undo testing change

* Cut down on sampling overhead

* Fix replacement

* Update unit tests

* Add option to gpu sample in graphsage

* Copy only csc to gpu

* Add ogbn support

* Fix linting

* Remove nvtx from sample

* Improve documentation and error checking

* Expand documentation

* Update assert checking

* delete extra space

* Use standard dataloader when dataset is a dictionary

* ogb -> ogbn

* Fix edge selection determinism

* Fix typos

* Remove nvtx

* Add comment for self.fanout_arrays and assert

* Fix linting

* Migrate to scalarbatcher

* Fix indentation

* Fix batcher

* Fix indexing

* Only use databatcher for GPU

* Convert DGL NDArray to PyTorch Tensor

* Add optimization for PyTorch's F.tensor() for list of GPU tensors
Co-authored-by: Da Zheng <zhengda1936@gmail.com>

e70138bb

09 Apr, 2021 1 commit

[Feature] Add kd-tree implementation (CPU) for kNN (#2767) · e83d0a80

Tianqi Zhang (张天启) authored Apr 09, 2021



* add submodule nanoflann

* finish python API for knn

* finish ndarray adaptor

* finish cpu-kdtree version of knn

* use openmp

* add endline

* upt

* upt

* fix format and code style

* upt

* add warning for gpu-cpu copy

* avoid contiguous copy
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: Tong He <hetong007@gmail.com>

e83d0a80

05 Apr, 2021 1 commit

[Performance] Prefer parallelized conversion to CSC from COO instead of transposing CSR (#2793) · 05c53ca3

Quan (Andy) Gan authored Apr 05, 2021

* fix coo2csr speed

* add comments

05c53ca3

01 Apr, 2021 1 commit

[Performance] Linear UniformChoice optimization (#2710) · b2e35e6a

pawelpiotrowicz authored Apr 01, 2021


Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Zihao Ye <expye@outlook.com>

b2e35e6a

25 Mar, 2021 1 commit

[Bug] Disable cpu fp16 (#2783) · 0b57ce18

Quan (Andy) Gan authored Mar 25, 2021

* disable cpu fp16

* spell mistakes

0b57ce18

24 Mar, 2021 1 commit

[Feature] Sparse-sparse matrix multiplication, addition, and masking (#2753) · 929d8634

Quan (Andy) Gan authored Mar 24, 2021

* test

* more stuff

* add test

* fixes

* optimize algo

* replace unordered_map with arrays

* lint

* lint x2

* oops

* disable gpu csrmm tests

* remove gpu invocation

* optimize with openmp

* remove python functions

* add back with docstrings

* lint

* lint

* update python interface

* functionize

* functionize

* lint

* lint

929d8634

22 Mar, 2021 2 commits

[Bugfix] Wrap cub with CUB_NS_PREFIX and remove dependency on Thrust to... · 0ff7127a

nv-dlasalle authored Mar 22, 2021


[Bugfix] Wrap cub with CUB_NS_PREFIX and remove dependency on Thrust to fix linking issues with Torch 1.8 (#2758)

* Wrap cub with prefixes and remove thrust

* Using counting iterator
Co-authored-by: Zihao Ye <expye@outlook.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

0ff7127a

Print error messages when using TCP socket (#2763) · 74c38a1f
Da Zheng authored Mar 21, 2021

* print error messages.

* fix.

74c38a1f

09 Mar, 2021 1 commit

[Feature] Add edge coarsening for homogeneous undirected graphs (#2691) · c88fca50

Tianqi Zhang (张天启) authored Mar 09, 2021



* finish graph matching gpu version

* use C++ shuffle

* finish graph matching

* fix bug

* fix bug

* change name and use swap

* upt

* fix format problem

* fix format problem

* stronger test

* upt

* upt

* change python api

* upt

* upt

* format check

* upt

* upt

* fix bug
Co-authored-by: Tong He <hetong007@gmail.com>

c88fca50

05 Mar, 2021 1 commit

fix doc typo (#2721) · 62dd1c86

maqy1995 authored Mar 05, 2021

62dd1c86

21 Feb, 2021 1 commit

[Feature] Support aggregate multiple edge features in to_simple. (#2623) · e6bf54cd

Zihao Ye authored Feb 21, 2021

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* fix

* refactor

* upd test

* large feat_len or n in segment reduce

* lint

e6bf54cd

08 Feb, 2021 1 commit

[Sampling] Implement `dgl.to_block()` for the GPU (#2339) · bc3a532f

nv-dlasalle authored Feb 07, 2021



* Add start of to_block gpu implementation

* Pull in more changes from 0.4.2 cuda_to_block

* Move more code to IdArray

* Refactor DeviceNodeMapMaker

* Updates

* get compiling

* Integrate to_block

* Fix ID allocation

* Minor fixes

* Cleanup cuda calls to use cuda_common

* Reduce kernel calls

* Lint cleanup

* Expand documentation

* Remove unused function

* Rename variables for consistency

* Add doxygen comments

* Fix file extension

* Remove raw asynccopy for deviceapi

* Remove unused function

* Fix block/tile configuration

* Add cuda_device_common.cuh

* Add basic hashtable

* Migrate part of hashtable

* Refactor to use external hashtable

* Make functions members

* Format hash table functions

* Migrate duplicate filling

* Move last function over

* Refactor with cu file

* lint c++ code

* Move context check to C++ code

* Use macro switch

* Add missing files

* Update docstring

* update docs

* Move atomic functions

* Refactor hashtable

* Fix linting

* Expand docs

* Fix mismatched argument names

* Switch doxygen comments from using @param to \param
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

bc3a532f

29 Jan, 2021 1 commit

fix build problems (#2594) · 460bb42d

Quan (Andy) Gan authored Jan 29, 2021

460bb42d

28 Jan, 2021 1 commit

[feature] Supporting half precision floating data type (fp16). (#2552) · 7bab1365

Zihao Ye authored Jan 28, 2021



* add tvm as submodule

* compilation is ok but calling fails

* can call now

* pack multiple modules, change names

* upd

* upd

* upd

* fix cmake

* upd

* upd

* upd

* upd

* fix

* relative path

* upd

* upd

* upd

* singleton

* upd

* trigger

* fix

* upd

* count reducible

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* only keep related files

* upd

* upd

* upd

* upd

* lint

* lint

* lint

* lint

* pylint

* upd

* upd

* compilation

* fix

* upd

* upd

* upd

* upd

* upd

* upd

* upd doc

* refactor

* fix

* upd number
Co-authored-by: Zhi Lin <linzhilynn@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-78.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-21-156.us-east-2.compute.internal>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

7bab1365

27 Jan, 2021 2 commits

[Feature] Add support for sparse embedding (#2451) · a7e941c3

xiang song(charlie.song) authored Jan 28, 2021



* Add sparse embedding for dgl and update rgcn example

* upd

* Fix

* Revert "Fix"

This reverts commit 4da87cdfb8b8c3506b7fc7376cd2385ba8045c2a.

* Fix

* upd

* upd

* Fix

* Add unit test and update impl

* fix

* Clean up rgcn example code

* upd

* upd

* update

* Fix

* update score

* sparse for sage

* remove model sparse

* upd

* upd

* remove global norm

* revert delete model_sparse.py

* update according to comments

* Fix doc

* upd

* Fix test

* upd

* lint

* lint

* lint

* upd

* upd

* clean up
Co-authored-by: Ubuntu <ubuntu@ip-172-31-56-220.ec2.internal>

a7e941c3

[Performance] Improve COO to CSR, and sort columns of CSR only when necessary. (#2391) · 2576647c

nv-dlasalle authored Jan 26, 2021

* Remove double-checking sorted

* Remove sorting of CSR by default

* Update unit test to use unsorted matrix

* delete whitespace

* Expand unit tests

* Replace cusparse sort

* Fix row column sorting

* Explicitly don't sort columns

* Fix linting errors

* Fix bit-width calculation

* Fix sorting assertion and unit test

* Fix linting

* Improve CPU COO2CSR

* Remove references

* Rename and add documentation to edge encoding/decoding functions

* Fix sorting keys as 64 bit

* Revert cosmetic changes to unit tests

* Update documentation

* Update complexity documentation for coo to csr conversion

* Remove COOIsSorted check in CPU implementation too

2576647c