Commits · 65b0b9e8c3161605b77841200a87d1a0ac4abefc · OpenDAS / dgl

16 May, 2022 1 commit
- [Performance] Remove unnecessary induced vertices in EdgeSubgraph (#3978) · 03024f95
  Xin Yao authored May 16, 2022
```
* remove unnecessary induced vertices in EdgeSubgraph

* add unit test
```
12 May, 2022 1 commit
- Fix launch parameters index select kernel in sparse push (#3524) · 4177f729
  nv-dlasalle authored May 12, 2022
  
11 May, 2022 1 commit

[Dist] Enable maximum try times for socket backend via DGL_DIST_MAX_T… (#3977) · 22e218d3

Rhett Ying authored May 11, 2022

* [Dist] Enable maximum try times for socket backend via DGL_DIST_MAX_TRY_TIMES

* reset env before/after test

* print log for info when trying to connect

* fix

* print log in python instead of cpp
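
A rough sketch of how a connection loop might honor a retry cap read from `DGL_DIST_MAX_TRY_TIMES`. Only the environment variable name comes from the commit; the helper names (`max_try_times`, `connect_with_retries`), the default of 1024, and the retry interval are hypothetical and do not describe DGL's actual socket backend.

```python
import os
import time


def max_try_times(default=1024):
    """Read the retry cap from the environment (the default is an assumption)."""
    return int(os.environ.get("DGL_DIST_MAX_TRY_TIMES", default))


def connect_with_retries(try_connect, interval=0.0):
    """Call try_connect() until it returns True or the cap is reached.

    Returns the number of attempts made; raises ConnectionError on failure.
    """
    limit = max_try_times()
    for attempt in range(1, limit + 1):
        if try_connect():
            return attempt
        # Log from Python so the user can see progress while connecting.
        print(f"connect attempt {attempt}/{limit} failed, retrying")
        time.sleep(interval)
    raise ConnectionError(f"gave up after {limit} attempts")
```

Setting the variable before launching a job (for example `DGL_DIST_MAX_TRY_TIMES=3`) would then bound how long a client keeps retrying in this sketch.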

27 Apr, 2022 1 commit

[Feature] enable socket net_type for rpc (#3951) · 37be02a4

Rhett Ying authored Apr 28, 2022

* [Feature] enable socket net_type for rpc

* fix lint

* fix lint

* fix build issue on windows

* fix test failure on windows

* fix test failure

* fix cpp unit test failure

* net_type blocking max_try_times

* fix other comments

* fix lint

* fix comment

* fix lint

* fix cpp

26 Apr, 2022 1 commit

[Performance][GPU] Improving Disjoint Union kernel for Graph Dataloaders (#3895) · 6e46bbf5

ayasar70 authored Apr 26, 2022



* Based on issue #3436. Improving _SegmentCopyKernel's GPU utilization by switching to nonzero-based thread assignment

* fixing lint issues

* Update cub for cuda 11.5 compatibility (#3468)

* fixing type mismatch

* tx guaranteed to be smaller than nnz. Hence removing last check

* minor: updating comment

* adding three unit tests for csr slice method to cover some corner cases

* timing repeatkernel

* clean

* clean

* clean

* updating _SegmentMaskColKernel

* Working on requests: removing sorted array check and adding comments to utility functions

* fixing lint issue

* Optimizing disjoint union kernel

* Trying to resolve compilation issue on CI

* [EMPTY] Relevant commit message here

* applying revision requests on cpu/disjoint_union.cc

* removing unnecessary casts

* remove extra space
Co-authored-by: Abdurrahman Yasar <ayasar@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
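
The "nonzero based thread assignment" named in the first bullet can be illustrated in plain Python: instead of giving each row one worker (which leaves workers idle when row lengths are skewed), every nonzero gets its own worker, which finds its row by a binary search over the CSR row-pointer array. This is a sketch of the general idea only, not the CUDA kernel from the PR; `row_of_nonzero` and `segment_copy` are hypothetical names.

```python
from bisect import bisect_right


def row_of_nonzero(indptr, tx):
    """Return the row that owns nonzero number tx (0 <= tx < nnz).

    indptr is the CSR row-pointer array; row r owns nonzeros
    indptr[r] .. indptr[r + 1] - 1. Empty rows are skipped naturally.
    """
    return bisect_right(indptr, tx) - 1


def segment_copy(indptr, data):
    """Emulate one 'thread' per nonzero: each copies its own element
    and records which row it belongs to."""
    nnz = indptr[-1]
    out_rows, out_data = [0] * nnz, [0] * nnz
    for tx in range(nnz):  # on a GPU these iterations would run in parallel
        out_rows[tx] = row_of_nonzero(indptr, tx)
        out_data[tx] = data[tx]
    return out_rows, out_data
```

Because `tx` ranges over `0 .. nnz - 1` by construction, no extra bounds check is needed inside the loop, which matches the bullet about `tx` being guaranteed smaller than `nnz`.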

12 Apr, 2022 1 commit
- [Example] Cleaned GraphSAGE node classification example with PyTorch Lightning (#3863) · 0d878ff8
  Quan (Andy) Gan authored Apr 12, 2022
```
* cleaned pl node classification example

* conform to PL's method of updating the dataloader

* update

* lint

* fix test

* fix
```
11 Apr, 2022 1 commit

[Feature] Enable UVA for GPU PinSAGE and RandomWalk (#3857) · 5fcd7f29

Xin Yao authored Apr 11, 2022



* enable uva for pinsage sampler

* unit test

* modify some checks on the python side

* remove legacy random walk code

* update unit test

* update unit test

* fix unit test

* adjust checks

* move some checks to c++

* move max_nodes check to cuda kernel

* fix ci for tf
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>

09 Apr, 2022 1 commit

[BugFix] record/restore pin status when pickle/unpickle (#3914) · adb3a7c1

Rhett Ying authored Apr 09, 2022

* [BugFix] record/restore pin status when pickle/unpickle

* disable test on TF

* set version as expected

* unpin memory in test

05 Apr, 2022 1 commit

[Examples] Update graphsage multi-gpu example to use multiple GPUs for... · 27a6eb56

nv-dlasalle authored Apr 05, 2022


[Examples] Update graphsage multi-gpu example to use multiple GPUs for validation and testing. (#3827)

* Update graphsage multi-gpu example to use multiple GPUs for validation and testing.

* Remove argmax

* Fix rebase error

* Add more documentation to example and simplify

* Switch to name shared memory

* Add comment about how training is distributed

* Restore iteration count

* fix munmap error reporting for better error messages
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

31 Mar, 2022 1 commit
- [Bugfix] Fix UVA sampling with partially specified node types (#3897) · 35e66f42
  Quan (Andy) Gan authored Mar 31, 2022
```
* fix uva with partial node types

* lint

* skip tensorflow unit test
```
27 Mar, 2022 1 commit

[Feature] METIS Partition with Communication Volume Minimization (#3821) · fbbca994

Cheng Wan authored Mar 27, 2022

* upd

* upd

* upd

* upd

* upd

* fix OpenMP compatibility issues

* typo

* partition

* misc

* fix typo

* num_parts=1

* import torch

* long

* print info

* print info

* print info

* upd

* remove debug code

* revert partition.py

* fix cut count

* fix cut count

* Revert "fix cut count"

This reverts commit 10926b4fd48f45c8f1ddb58be7db6c22e653effd.

* Revert "fix cut count"

This reverts commit 76465283bef093a2b4209ad70dd15d2437b2ec8a.

* type of deprecate

* typo in deprecate info

* fix typo

* use cv for partitioning

* CE

* no message

* revert

* typo

* add objtype

* no message

* fix bug

* fix bug

* fix bug

* ?

* semicolon

* drop tensors

* no message

* backward

* backward

* max op

* store X.shape

* th

* test

* Revert "test"

This reverts commit 92b3b2f64a3a1128590098fa03ce429c5466e6ce.

* test

* tolist

* debug

* to cuda

* tuple

* fix bug

* remove X

* no message

* fix bug

* workload balance

* Revert "workload balance"

This reverts commit d7f8e4a16ba2a7eabb4a9bb945523bfe6623e723.

* reverse

* Revert "reverse"

This reverts commit 8a71cf25685aa7d889b9b8881b46f7a16b7d6e6d.

* Revert "Revert "reverse""

This reverts commit 196b143932d5cf9813576ece7c990b63d322d063.

* Revert "Revert "Revert "reverse"""

This reverts commit cf9e89a07013582056e7cde235e51331aca7fa9c.

* no message

* Merge commit '5498cf05'

# Conflicts:
#	python/dgl/distributed/partition.py

* Revert "Merge commit '5498cf05'"

This reverts commit f79be2ad777897c7025b28308454cad81ad6bb27.

* fix bug

* third party

* no message

* try to avoid memory leak

* try to avoid memory leak

* avoid memory leak with no hope

* Revert "avoid memory leak with no hope"

This reverts commit c77befe9479f46758e744642f66dd209b50eef7d.

* no message

* Revert "no message"

This reverts commit 478cb28fe25fb1002b2f1dc202bb9bdaad8b2a56.

* del

* Revert "del"

This reverts commit 1b468e45ce646b400ff3ffa61a0b2da058b3bdfd.

* no message

* no message

* Revert "no message"

This reverts commit 92e4f5561ed42da0606618b2fff9f1ad5ed439d9.

* third party

* document

* Update metis_partition.cc

* Update metis_partition_hetero.cc

* Update metis_partition_hetero.cc

* Update partition.py

* Update partition.py

* Update partition.py
Co-authored-by: yzh119 <expye@outlook.com>
Co-authored-by: chwan-rice <54331508+chwan-rice@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
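
The title contrasts two partitioning objectives. A small sketch of how they differ on an undirected graph, using the usual definitions: edge cut is the number of edges whose endpoints lie in different parts, and communication volume is, for each vertex, the number of distinct other parts among its neighbors, summed over all vertices. The function names are hypothetical and nothing here calls METIS or DGL.

```python
def edge_cut(edges, part):
    """Count undirected edges whose endpoints are in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def communication_volume(edges, part):
    """Sum, over vertices, of the number of distinct foreign parts
    that contain at least one neighbor of the vertex."""
    foreign = {v: set() for v in part}
    for u, v in edges:
        if part[u] != part[v]:
            foreign[u].add(part[v])
            foreign[v].add(part[u])
    return sum(len(s) for s in foreign.values())
```

On a star whose center sits alone in one part, each cut edge counts once toward the edge cut, but the center contributes only one unit of volume however many leaves it has, because all of its neighbors fall in the same foreign part.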

24 Mar, 2022 2 commits
- [Bugfix] Fix multiple bugs and code refactor (#3841) · 223a3da5
  Quan (Andy) Gan authored Mar 24, 2022
```
* fix

* remove setcxx methods

* move pin flag to CSR and COO matrix
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
```
- [BugFix] send rpc messages blockingly in case of congestion (#3867) · e9fd65e9
  Rhett Ying authored Mar 24, 2022
```
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
```
10 Mar, 2022 1 commit

Change the parameter of curand_init (#3794) · eec219ab

paoxiaode authored Mar 10, 2022



* Change the curand_init parameter

* Change the curand_init parameter

* commit

* commit
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>

01 Mar, 2022 1 commit
- [Build] Working around broken name mangling in MSVC 16.5.5 + CUDA 11.3 (#3790) · 396d7180
  Quan (Andy) Gan authored Mar 01, 2022
```
* fix

* explain

* oops
```
28 Feb, 2022 2 commits
- [Build] Split spmm.cu and sddmm.cu for building on Windows (#3789) · 3521fbe9
  Quan (Andy) Gan authored Mar 01, 2022
```
* split files

* fix
```
- [Build or bug?] Fix VS2019 compilation error in randomwalk GPU kernel (#3788) · 6e1c6990
  Quan (Andy) Gan authored Feb 28, 2022
```
* Update randomwalk_gpu.cu

* Update randomwalk_gpu.cu
```
27 Feb, 2022 1 commit

[Doc and bugfix] Add docs and user guide and update tutorial for sampling pipeline (#3774) · d41d07d0

Quan (Andy) Gan authored Feb 28, 2022



* huuuuge update

* remove

* lint

* lint

* fix

* what happened to nccl

* update multi-gpu unsupervised graphsage example

* replace most of the dgl.mp.process with torch.mp.spawn

* update if condition for use_uva case

* update user guide

* address comments

* incorporating suggestions from @jermainewang

* oops

* fix tutorial to pass CI

* oops

* fix again
Co-authored-by: Xin Yao <xiny@nvidia.com>

23 Feb, 2022 2 commits

Fixes the bug when total_nnz is > integer limit (#3766) · e7ad4c9c
sanchit-misra authored Feb 24, 2022
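
A small illustration of the failure mode the title names, assuming the counter was a 32-bit signed integer (the commit itself does not show the code): summing per-row nonzero counts goes wrong once the total passes 2^31 - 1, while a wider accumulator does not. `wrap_int32` is a hypothetical helper that emulates two's-complement wraparound; in C and C++ signed overflow is formally undefined, so wraparound is only the typical outcome.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(x):
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    return (x + 2**31) % 2**32 - 2**31


def total_nnz(counts, use_int32):
    """Accumulate nonzero counts with either a 32-bit or an unbounded total."""
    total = 0
    for c in counts:
        total = wrap_int32(total + c) if use_int32 else total + c
    return total
```

With three chunks of 2^30 nonzeros each, the 32-bit total comes out negative in this sketch, which is the kind of value that then breaks allocation sizes and offsets.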

[NN] Rework RelGraphConv and HGTConv (#3742) · 0227ddfb

Minjie Wang authored Feb 23, 2022

* WIP: TypedLinear and new RelGraphConv

* wip

* further simplify RGCN

* a bunch of tweaks for performance; add basic cpu support

* update on segmm

* wip: segment.cu

* new backward kernel works

* fix a bunch of bugs in kernel; leave idx_a for future

* add nn test for typed_linear

* rgcn nn test

* bugfix in corner case; update RGCN README

* doc

* fix cpp lint

* fix lint

* fix ut

* wip: hgtconv; presorted flag for rgcn

* hgt code and ut; WIP: some fix on reorder graph

* better typed linear init

* fix ut

* fix lint; add docstring

21 Feb, 2022 1 commit

[Bugfix] Bug fixes in new dataloader (#3727) · 3f138eba

Quan (Andy) Gan authored Feb 22, 2022



* fixes

* fix

* more fixes

* update

* oops

* lint?

* temporarily revert - will fix in another PR

* more fixes

* skipping mxnet test

* address comments

* fix DDP

* fix edge dataloader exclusion problems

* stupid bug

* fix

* use_uvm option

* fix

* fixes

* fixes

* fixes

* fixes

* add evaluation for cluster gcn and ddp

* stupid bug again

* fixes

* move sanity checks to only support DGLGraphs

* pytorch lightning compatibility fixes

* remove

* poke

* more fixes

* fix

* fix

* disable test

* docstrings

* why is it getting a memory leak?

* fix

* update

* updates and temporarily disable forkingpickler

* update

* fix?

* fix?

* oops

* oops

* fix

* lint

* huh

* uh

* update

* fix

* made it memory efficient

* refine exclude interface

* fix tutorial

* fix tutorial

* fix graph duplication in CPU dataloader workers

* lint

* lint

* Revert "lint"

This reverts commit 805484dd553695111b5fb37f2125214a6b7276e9.

* Revert "lint"

This reverts commit 0bce411b2b415c2ab770343949404498436dc8b2.

* Revert "fix graph duplication in CPU dataloader workers"

This reverts commit 9e3a8cf34c175d3093c773f6bb023b155f2bd27f.
Co-authored-by: xiny <xiny@nvidia.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

18 Feb, 2022 2 commits

[Performance][GPU] Improving _SegmentMaskColKernel (#3745) · 7b9afbfa

ayasar70 authored Feb 18, 2022



* Based on issue #3436. Improving _SegmentCopyKernel's GPU utilization by switching to nonzero-based thread assignment

* fixing lint issues

* Update cub for cuda 11.5 compatibility (#3468)

* fixing type mismatch

* tx guaranteed to be smaller than nnz. Hence removing last check

* minor: updating comment

* adding three unit tests for csr slice method to cover some corner cases

* timing repeatkernel

* clean

* clean

* clean

* updating _SegmentMaskColKernel

* Working on requests: removing sorted array check and adding comments to utility functions

* fixing lint issue
Co-authored-by: Abdurrahman Yasar <ayasar@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

DGL Enter (#3690) · 539335ce

Jinjing Zhou authored Feb 18, 2022



* add

* fix

* fix

* fix

* fix

* add

* add

* fix

* fix

* fix

* new loader

* fix

* fix

* fix for 3.6

* fix

* add

* add recipes and also some bug fixes

* fix

* fix

* fix

* fix recipes

* allow AsNodeDataset to work on ogb

* add ut

* many fixes for nodepred-ns pipeline

* recipe for nodepred-ns

* Update enter/README.md
Co-authored-by: Zihao Ye <zihaoye.cs@gmail.com>

* fix layers

* fix

* fix

* fix

* fix

* fix multiple issues

* fix for citation2

* fix comment

* fix

* fix

* clean up

* fix
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Minjie Wang <minjie.wang@nyu.edu>
Co-authored-by: Zihao Ye <zihaoye.cs@gmail.com>

15 Feb, 2022 1 commit

[Feature] Gather mm (#3641) · b3d3a2c4

Israt Nisa authored Feb 14, 2022



* init

* init

* working cublasGemm

* benchmark high-mem/low-mem, err gather_mm output

* cuda kernel for bmm like kernel

* removed cpu copy for E_per_Rel

* benchmark code from Minjie

* fixed cublas results in gathermm sorted

* use GPU shared mem in unsorted gather mm

* minor

* Added an optimal version of gather_mm_unsorted

* lint

* init gather_mm_scatter

* cublas transpose added

* fixed h_offset for multiple rel

* backward unittest

* cublas support to transpose W

* adding missed file

* forgot to add header file

* lint

* lint

* cleanup

* lint

* docstring

* lint

* added unittest

* lint

* lint

* unittest

* changed err type

* skip cpu test

* skip CPU code

* move in-len loop inside

* lint

* added check different dim length for B

* w_per_len is optional now

* moved gather_mm to pytorch/backend with backward support

* removed a_/b_trans support

* transpose op inside GEMM call

* removed out alloc from API, changed W 2D to 3D

* Added se_gather_mm, Separate API for sortedE

* Fixed gather_mm (unsorted) user interface

* unsorted gmm backward + separate CAPI for un/sorted A

* typecast to float to support atomicAdd

* lint typecast

* lint

* added gather_mm_scatter

* minor

* const

* design changes

* Added idx_a, idx_b support gmm_scatter

* dgl doc

* lint

* adding gather_mm in ops

* lint

* lint

* minor

* removed benchmark files

* minor

* empty commit
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
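
A reference sketch of the semantics the bullets suggest for the unsorted gather-mm with `idx_b`: row `i` of `A` is multiplied by the weight matrix selected by `idx_b[i]`, that is `C[i] = A[i] @ B[idx_b[i]]`. This is plain Python for illustration under that assumption; `gather_mm_reference` is a hypothetical name and the real operator's signature may differ.

```python
def gather_mm_reference(a, b, idx_b):
    """a: N rows of length D1; b: R matrices of shape D1 x D2;
    idx_b: for each row of a, which matrix in b to use.
    Returns N rows of length D2."""
    out = []
    for row, r in zip(a, idx_b):
        w = b[r]  # weight matrix gathered for this row
        out.append([sum(x * w[k][j] for k, x in enumerate(row))
                    for j in range(len(w[0]))])
    return out
```

When rows are already sorted by relation, the same result can be computed as one dense matrix multiply per segment, which is presumably why the PR keeps a separate API for the sorted case.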

11 Feb, 2022 1 commit

New fused edge_softmax op (#3650) · bc8f8b0b

ranzhejiang authored Feb 11, 2022



* [feature] edge softmax refact.

* delete file

* fix backward and cmake version

* fix backward

* format function

* fix setting

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* add cuda kernel for backward and rename some function

* add benchmark for edge_softmax

* fix format

* remove cuda_backwrd

* fix code format and add comment for op on CPU

* fix lint
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
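
For reference, edge softmax normalizes edge scores over the incoming edges of each destination node. A minimal, unfused sketch in plain Python follows; the function name and argument layout are hypothetical, and the PR's fused CPU and CUDA kernels are not reproduced here.

```python
import math
from collections import defaultdict


def edge_softmax_reference(dst, scores):
    """dst[e] is the destination node of edge e; scores[e] is its logit.
    Returns the softmax of scores taken within each group of edges
    that share a destination."""
    # Per-destination maximum, subtracted for numerical stability.
    max_per_dst = defaultdict(lambda: -math.inf)
    for d, s in zip(dst, scores):
        max_per_dst[d] = max(max_per_dst[d], s)
    exps = [math.exp(s - max_per_dst[d]) for d, s in zip(dst, scores)]
    # Per-destination normalizer.
    total = defaultdict(float)
    for d, e in zip(dst, exps):
        total[d] += e
    return [e / total[d] for d, e in zip(dst, exps)]
```

The sketch makes separate passes for the max, the exponentials, the sum, and the division; a fused operator would presumably combine these to avoid materializing the intermediates.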

09 Feb, 2022 1 commit

[Feature] CUDA UVA sampling for MultiLayerNeighborSampler (#3674) · 738e8318

Xin Yao authored Feb 09, 2022



* implement pin_memory/unpin_memory/is_pinned for dgl.graph

* update python docstring

* update c++ docstring

* add test

* fix the broken UnifiedTensor

* XPU_SWITCH for kDLCPUPinned

* a rough version ready for testing

* eliminate extra context parameter for pin/unpin

* update train_sampling

* fix linting

* fix typo

* multi-gpu uva sampling case

* disable new format materialization for pinned graphs

* update python doc for pin_memory_

* fix unit test

* UVA sampling for link prediction

* dispatch most csr ops

* update graphsage example to combine uva sampling and UnifiedTensor

* update graphsage example to combine uva sampling and UnifiedTensor

* update graphsage example to combine uva sampling and UnifiedTensor

* update doc

* update examples

* change unitgraph and heterograph's PinMemory to in-place

* update examples for multi-gpu uva sampling

* update doc

* fix linting

* fix cpu build

* fix is_pinned for DistGraph

* fix is_pinned for DistGraph

* update graphsage unsupervised example

* update doc for gpu sampling

* update some check for sampling device switching

* fix linting

* adapt for new dataloader

* fix linting

* fix

* fix some name issue

* adjust device check

* add unit test for uva sampling & fix some zero_copy bug

* fix linting

* update num_threads in graphsage examples
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

26 Jan, 2022 1 commit

[Feature] long live server for multiple client groups (#3645) · 02e4cd8b

Rhett Ying authored Jan 26, 2022

* [Feature] long live server for multiple client groups

* generate globally unique name for DistTensor within DGL automatically

21 Jan, 2022 1 commit

[Feature] Pin dgl.graph to the page-locked memory (#3616) · 40b44a43

Xin Yao authored Jan 21, 2022



* implement pin_memory/unpin_memory/is_pinned for dgl.graph

* update python docstring

* update c++ docstring

* add test

* fix the broken UnifiedTensor

* eliminate extra context parameter for pin/unpin

* fix linting

* fix typo

* disable new format materialization for pinned graphs

* update python doc for pin_memory_

* fix unit test

* update doc

* change unitgraph and heterograph's PinMemory to in-place

* update comments for NDArray's PinMemory_ and PinData

* update doc
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

19 Jan, 2022 1 commit

[Fix] reduce error msg, refine fetch logic of available ports (#3658) · e4cb4a37

Rhett Ying authored Jan 19, 2022

* [Fix] reduce error msg, refine fetch logic of available ports

* un-initialize client before sending shutdown request

* fix import error

* print connect failure log only in debug mode

* enable DMLC_LOG_DEBUG=1 in CI

17 Jan, 2022 2 commits
- [Bugfix] Fixes the redundancy parameter being used wrong in global negative sampling (#3657) · 77f4287a
  Quan (Andy) Gan authored Jan 17, 2022
```
* oops

* test
```
- [Bugfix] Fix GPU global negative sampling code (#3653) · 2aad1c0b
  Quan (Andy) Gan authored Jan 17, 2022
```
* fix GPU global negative sampling code

* Update negative_sampling.cu
```
11 Jan, 2022 2 commits

Pass the std::min argument's type, to avoid the compilation error. (#3637) · b002f8f9

MaoYuan Xian authored Jan 11, 2022



* Pass the std::min argument's type, to avoid the compilation error.

* Update parallel_for.h

* Update negative_sampling.cc
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

[Feature][Dist] change TP::Receiver/TP::Sender for multiple connections (#3574) · 37467e25

Rhett Ying authored Jan 11, 2022



* [Feature] enable TP::Receiver wait for any number of senders

* fix random unit test failure

* avoid endless future wait

* fix unit test failure

* fix seg fault when finalize wait in receiver

* [Feature] refactor sender connect logic and remove unnecessary sleeps in unit tests

* fix lint

* release RPCContext resources before process exits

* [Debug] TPReceiver wait start log

* [Debug] add log in get port

* [Debug] add log

* [ReDebug] revert time sleep in unit tests

* [Debug] remove sleep for test_distri,test_mp

* [debug] add more log

* [debug] add listen_booted_ flag

* [debug] restore commented code for queue

* [debug] sleep more in rpc_client

* restore change in tests

* Revert "restore change in tests"

This reverts commit 41a18926d181ec2517069389bfc41de2cc949280.

* Revert "[debug] sleep more in rpc_client"

This reverts commit a908e758eabca0a6ce62eb2e59baea02a840ac67.

* Revert "[debug] restore commented code for queue"

This reverts commit d3f993b3746e6bb6e2cc2f90204dd7e9461c6301.

* Revert "[debug] add listen_booted_ flag"

This reverts commit 244b2167d94942ff2a0acec8823b974975e52580.

* Revert "[debug] add more log"

This reverts commit 4b78447b0a575a824821dc7e25cca2246e6e30e2.

* Revert "[Debug] remove sleep for test_distri,test_mp"

This reverts commit e1df1aadcc8b1c2a0013ed77322ac391a8807612.

* remove debug code

* revert unnecessary change

* revert unnecessary changes

* always reset RPCContext when get started and reset all data

* remove time.sleep in dist tests

* fix lint

* reset envs before each dist test

* reset env properly

* add time sleep when start each server

* sleep for a while when boot server

* replace wait_thread with callback

* fix lint

* add dglconnect handshake check
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

10 Jan, 2022 1 commit
- disabling cuda11 apis (#3635) · c04b5bc7
  Quan (Andy) Gan authored Jan 10, 2022
  
07 Jan, 2022 1 commit

[Feature] Negative sampling (#3599) · 90f10b31

Quan (Andy) Gan authored Jan 07, 2022

* first commit

* a bunch of fixes

* add unique

* lint

* lint

* lint

* address comments

* Update negative_sampler.py

* fix

* description

* address comments and fix

* fix

* replace unique with replace

* test pylint

* Update negative_sampler.py

04 Jan, 2022 1 commit
- [Windows] Support NDArray in shared memory on Windows (#3615) · b226fe01
  Quan (Andy) Gan authored Jan 04, 2022
```
* support shared memory on windows

* Update shared_mem.cc
```
19 Dec, 2021 1 commit

Fix CopyVectorToNDArray in src/c_api_common.h (#3597) · 25538ba4

hirayaku authored Dec 19, 2021



* fix CopyVectorToNDArray

* Fix lint
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

16 Dec, 2021 1 commit

[Feature] Add CUDA support for `min` and `max` reducer in heterogeneous API... · 70a499e3

Israt Nisa authored Dec 16, 2021


[Feature] Add CUDA support for `min` and `max` reducer in heterogeneous API for unary message functions (#3566)

* CUDA support max/min reducer on forward pass

* docstring

* concised UpdateGradMinMax_hetero

* reorganized UpdateGradMinMax_hetero

* CUDA kernels for max/min reducer

* variable name

* lint check

* changed CUDA 2D thread mapping to 1D

* removed legacy cusparse for min/max reducer

* git CI issue

* restarting git CI

* adding namespace std
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

15 Dec, 2021 2 commits

[PinSAGESampler] support PinSAGE sampler on GPU (#3567) · dd762a1e

lixiaobai authored Dec 15, 2021



* Feat: support API "randomwalk_topk" in library

* Feat: use the new API "randomwalk_topk" for PinSAGESampler

* Minor

* Minor

* Refactor: modified codes as checker required

* Minor

* Minor

* Minor

* Minor

* Fix: checking errors in RandomWalkTopk

* Refactor: modified the docstring for randomwalk_topk

* change randomwalk_topk to internal

* fix

* rename

* Minor for pinsage.py

* Feat: support randomwalk and SelectPinSageNeighbors on GPU

Port RandomWalk algorithm on GPU,
and port SelectPinSageNeighbors on GPU.

* Feat: support GPU on python APIs

* Feat: remove perf print information in FrequencyHashmap

* Fix: modified the code format

Modified the code format as task_lint.sh suggested

* Feat: let test script support PinSAGESampler on GPU

Let test script support PinSAGESampler on GPU,
minor of "restart_prob".

* Minor

* Minor

* Minor

* Refactor: use the atomic operations from the array module

* Minor: change the long lines

* Refactor: modified the get_node_types for gpu

* Feat: update the contributor date

* Perf: remove unnecessary stream sync

* Feat: support other random walk

But the non-uniform choice is still not supported.

* Fix: add CUDA switch for random walk
Co-authored-by: Quan Gan <coin2028@hotmail.com>

[DistGNN, Graph partitioning] Libra partition (#3376) · 78e0dae6

Vasimuddin Md authored Dec 15, 2021



* added distgnn plus libra codebase

* Dist application codes

* added comments in partition code. changed the interface of partitioning call.

* updated readme

* create libra partitioning branch for the PR

* removed distgnn files for first PR

* updated kernel.cc

* added libra_partition.cc and moved libra code from kernel.cc to libra_partition.cc

* fixed lint error; merged libra2dgl.py and main_Libra.py to libra_partition.py; added graphsage/distgnn folder and partition script.

* removed libra2dgl.py

* fixed the lint error and cleaned the code.

* revisions due to PR comments. added distgnn/tools contains partitions routines

* update 2 PR revision I

* fixed errors; also improved the runtime by 10x.

* fixed minor lint error

* fixed some more lints

* PR revision II changed the interface of libra partition function

* rewrite docstring
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
