Commits · 15365d7855ac6bb8ceba153d30dc0d9144a69629 · OpenDAS / dgl

12 Dec, 2022 1 commit
- [Sparse] Add SpMM and SDDMM. (#4999) · 15365d78
  czkkkkkk authored Dec 12, 2022
```
* [Sparse] Add SpMM and SDDMM

* Update

* Add CSR and CSC SpMM tests
```
09 Dec, 2022 1 commit

[Bugfix] Fix empty tensors possibly being treated as pinned (#5005) · aad3bd04

Xin Yao authored Dec 09, 2022

* fix empty tensor is treated as pinned

* avoid calling cudaHostGetDevicePointer on nullptr

* update empty array

* add a comment


06 Dec, 2022 1 commit

Add support for next cusparse release (#4974) · fb223d47

Chang Liu authored Dec 05, 2022

* Add support for next cusparse release

* Fix lint

* Add switch and tune the performance

* Fix lint issue

* Fine tune the heuristics

* Fix lint issue

* Address comments

* Minor fix

* Address comments


01 Dec, 2022 1 commit

[Feature] replace dgl PRNG with pcg32 (#4807) · b1e2695f

Muhammed Fatih BALIN authored Nov 30, 2022



* replace dgl PRNG with pcg32

* remove pcg submodule, add a simple implementation

* replace pcg32 with std::mt19937_64

* fix include order

* change RandomEngine to pcg32

* Remove custom pcg32 implementation, use the submodule provided by the original author.

* minor bug

* move include for linting

* include pcg for tests too
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>


24 Nov, 2022 1 commit
- [Cleanup] Remove duplicated _IndexSelect (#4874) · c59000ac
  Xin Yao authored Nov 24, 2022
  
22 Nov, 2022 2 commits

[Performance] Leverage hashmap to accelerate CSRSliceMatrix<kDGLCUDA, IdType> (#4924) · aa419895

Ping Gong authored Nov 22, 2022



* Leverage hashmap to accelerate CSRSliceMatrix

* fix lint check

* use `min` in cuda_runtime.ch

* fix hash func

* add some comments and adjust the <grid,block> of the _SegmentMaskColKernel kernel

* set device and stream for thrust::for_each

* use thrust::cuda::par_nosync
Co-authored-by: Xin Yao <xiny@nvidia.com>


[Feature] (La)yer-Neigh(bor) sampling implementation (#4668) · bf264d00

Muhammed Fatih BALIN authored Nov 21, 2022



* adding LABOR sampling

* add ladies and pladies samplers

* fix compile error after rebase

* add reference for ladies sampler

* Improve ladies implementation.

* weighted labor sampling initial implementation draft
fix indentation and small bug in ladies script

* importance_sampling currently doesn't work with weights

* fix weighted importance sampling

* move labor example into its own folder

* lint fixes

* Improve documentation

* remove examples from the main PR

* fix linting by not using c++17 features

* fix documentation of labor_sampler.py

* update documentation for labor.py

* reformat the labor.py file with black

* fix linting errors

* replace exception use with if

* fix typo in error comment

* fixing win64 build for ci

* fixing weighted implementation, works now.

* fix bug in the weighted case and importance_sampling==0

* address part of the reviews

* remove unused code paths from cuda

* remove unused code path from cpu side

* remove extra features of labor making use of random seed.

* fix exclude_edges bug

* remove pcg and seed logic from cpu implementation, seed logic should still work for cuda.

* minor style change

* refactor CPU implementation, take out the importance_sampling probability computation into a function.

* improve CUDAWorkspaceAllocator

* refactor importance_sampling part out to a function

* minor optimization

* fix linting issue

* Revert "remove pcg and seed logic from cpu implementation, seed logic should still work for cuda."

This reverts commit c250e07ac6d7e13f57e79e8a2c2f098d777378c2.

* Revert "remove extra features of labor making use of random seed."

This reverts commit 7f99034353080308f4783f27d9a08bea343fb796.

* fix the documentation

* disable NIDs

* improve the documentation in the code

* use the stream argument in pcg32 instead of skipping ahead t times, can discard the use of hashmap now since it is faster this way.

* fix linting issue

* address another round of reviews

* further optimize CPU LABOR sampling implementation

* fix linting error

* update the comment

* reformat

* rename and rephrase comment

* fix formatting according to new linting specs

* fix compile error due to renaming, fix linting.

* lint

* rename DGLHeteroGraph to DGLGraph to match master

* replace other occurrences of DGLHeteroGraph to DGLGraph
Co-authored-by: Muhammed Fatih BALIN <m.f.balin@gmail.com>
Co-authored-by: Kaan Sancak <kaansnck@gmail.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>


15 Nov, 2022 4 commits
- Revert "[Kernel] Parallel find edges (#4878)" (#4899) · ca144886
  Quan (Andy) Gan authored Nov 15, 2022
```
This reverts commit 00c27cb2.
```
- Revert "[Performance] Make IdHashMap parallel (#4881)" (#4898) · 5b193f9b
  Quan (Andy) Gan authored Nov 15, 2022
```
This reverts commit 56962858.
```
- [Performance] Make IdHashMap parallel (#4881) · 56962858
  Quan (Andy) Gan authored Nov 15, 2022
```
* make IdHashMap parallel

* fix

* Update array_utils.h
```
- [Kernel] Parallel find edges (#4878) · 00c27cb2
  Quan (Andy) Gan authored Nov 15, 2022
```
* use runtime parallel_for

* grain size

* Update array_index_select.cc
```
10 Nov, 2022 1 commit

[Bugfix] Fix that half-precision SpMM produces incorrect results (#4842) · a8f9d5ef

Xin Yao authored Nov 10, 2022

* update accumulator

* rename half to __half

* add bfloat16

* simplify code

* fix another case

* add unit test

* disable half-precision SpMMCoo

* fix lint


08 Nov, 2022 2 commits

[Misc] Minor code style fix. (#4843) · cb5e3489

Hongzhi (Steve), Chen authored Nov 08, 2022



* [Misc] Change the max line length for cpp to 80 in lint.

* blabla

* blabla

* blabla

* ablabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


[Misc] Add // NOLINT for the very long code. (#4834) · 0d687968

Hongzhi (Steve), Chen authored Nov 08, 2022



* alternative

* fix

* remove_todo

* blabl

* ablabl
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


07 Nov, 2022 4 commits

[Misc] clang-format auto fix. (#4831) · 889798fe

Hongzhi (Steve), Chen authored Nov 07, 2022



* [Misc] clang-format auto fix.

* blabla

* nolint

* blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


[Misc] Minor code style fix. (#4825) · df089424

Hongzhi (Steve), Chen authored Nov 07, 2022



* blabla

* more

* blabla

* blabla

* ablabla

* blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


[Misc] clang-format auto fix. (#4824) · 8ac27dad

Hongzhi (Steve), Chen authored Nov 07, 2022



* [Misc] clang-format auto fix.

* blabla

* ablabla

* blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


[Misc] Replace /*! with /**. (#4823) · bcd37684

Hongzhi (Steve), Chen authored Nov 07, 2022



* replace

* blabla

* balbla

* blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


06 Nov, 2022 2 commits

[Misc] Replace \xxx with @XXX in structured comment. (#4822) · 619d735d

Hongzhi (Steve), Chen authored Nov 07, 2022



* param

* brief

* note

* return

* tparam

* brief2

* file

* return2

* return

* blabla

* all
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


[Feature] Add bfloat16 (bf16) support (#4648) · 96297fb8

Xin Yao authored Nov 06, 2022

* add bf16 specializations

* remove SWITCH_BITS

* enable amp for bf16

* remove SWITCH_BITS for cpu kernels

* enable bf16 based on CUDART

* fix compiling for sm<80

* fix cpu build

* enable unit tests

* update doc

* disable test for CUDA < 11.0

* address comments

* address comments


03 Nov, 2022 2 commits

[Misc] clang-format auto fix. (#4804) · 8ae50c42

Hongzhi (Steve), Chen authored Nov 03, 2022



* [Misc] clang-format auto fix.

* manual

* manual

* manual

* manual

* todo

* fix
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


[Bugfix] Fix that UVA cannot work on old GPUs (#4781) · 16e771c0
Xin Yao authored Nov 03, 2022
```
* get device pointers

* change if condition to IsPinned
```

02 Nov, 2022 1 commit

[Misc] clang-format auto fix. (#4803) · b2d38ca8

Hongzhi (Steve), Chen authored Nov 02, 2022



* [Misc] clang-format auto fix.

* manual
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


29 Oct, 2022 1 commit

[Sampling] Enable sampling with edge masks in sample_etype_neighbors (#4749) · 2bca4759

Quan (Andy) Gan authored Oct 29, 2022

* sample neighbors with masks

* oops

* refactor again

* remove

* remove debug code

* rename macro

* address comments

* more stuff

* remove

* fix

* try fix unit test

* oops

* fix test

* oops

* change name

* rename a lot of stuff

* oops

* ugh

* misc fixes

* lint

* address a lot of comments

* lint

* lint

* fix

* that was silly

* fix

* fix

* fix

* oops


28 Oct, 2022 1 commit

[Sampling] Enable sampling with edge masks on homogeneous graph (#4748) · 72781efb

Quan (Andy) Gan authored Oct 28, 2022

* sample neighbors with masks

* oops

* refactor again

* remove

* remove debug code

* rename macro

* address comments

* address comment

* address comments

* rename a lot of stuff

* oops


13 Oct, 2022 2 commits

[Sampling] handle fanout=-1 differently from fanout>0 in sample_etype_neighbors() (#4716) · a5d21c2b
Rhett Ying authored Oct 13, 2022


[Deprecation] Dataset Attributes (#4666) · e452179c

Mufei Li authored Oct 13, 2022



* Update from master (#4584)

* [Example][Refactor] Refactor graphsage multigpu and full-graph example (#4430)

* Add refactors for multi-gpu and full-graph example

* Fix format

* Update

* Update

* Update

* [Cleanup] Remove async_transferer (#4505)

* Remove async_transferer

* remove test

* Remove AsyncTransferer
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>

* [Cleanup] Remove duplicate entries of CUB submodule   (issue# 4395) (#4499)

* remove third_party/cub

* remove from third_party
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>

* [Bug] Enable turn on/off libxsmm at runtime (#4455)

* enable turn on/off libxsmm at runtime by adding a global config and related API
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

* [Feature] Unify the cuda stream used in core library (#4480)

* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api
Co-authored-by: xiny <xiny@nvidia.com>

* [Feature] Added exclude_self and output_batch to knn graph construction (Issues #4323 #4316) (#4389)

* * Added "exclude_self" and "output_batch" options to knn_graph and segmented_knn_graph
* Updated out-of-date comments on remove_edges and remove_self_loop, since they now preserve batch information

* * Changed defaults on new knn_graph and segmented_knn_graph function parameters, for compatibility; pytorch/test_geometry.py was failing

* * Added test to ensure dgl.remove_self_loop function correctly updates batch information

* * Added new knn_graph and segmented_knn_graph parameters to dgl.nn.KNNGraph and dgl.nn.SegmentedKNNGraph

* * Formatting

* * Oops, I missed the one in segmented_knn_graph when I fixed the similar thing in knn_graph

* * Fixed edge case handling when invalid k specified, since it still needs to be handled consistently for tests to pass
* Fixed context of batch info, since it must match the context of the input position data for remove_self_loop to succeed

* * Fixed batch info resulting from knn_graph when output_batch is true, for case of 3D input tensor, representing multiple segments

* * Added testing of new exclude_self and output_batch parameters on knn_graph and segmented_knn_graph, and their wrappers, KNNGraph and SegmentedKNNGraph, into the test_knn_cuda test

* * Added doc comments for new parameters

* * Added correct handling for uncommon case of k or more coincident points when excluding self edges in knn_graph and segmented_knn_graph
* Added test cases for more than k coincident points

* * Updated doc comments for output_batch parameters for clarity

* * Linter formatting fixes

* * Extracted out common function for test_knn_cpu and test_knn_cuda, to add the new test cases to test_knn_cpu

* * Rewording in doc comments

* * Removed output_batch parameter from knn_graph and segmented_knn_graph, in favour of always setting the batch information, except in knn_graph if x is a 2D tensor
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [CI] only known devs are authorized to trigger CI (#4518)

* [CI] only known devs are authorized to trigger CI

* fix if author is null

* add comments

* [Readability] Auto fix setup.py and update-version.py (#4446)

* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review
Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAempty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* Move mock version of dgl_sparse library to DGL main repo (#4524)

* init

* Add api doc for sparse library

* support op between matrices with different sparsity

* Fixed docstring

* addresses comments

* lint check

* change keyword format to fmt
Co-authored-by: Israt Nisa <nisisrat@amazon.com>

* [DistPart] expose timeout config for process group (#4532)

* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Feature] Import PyTorch's CUDA stream management (#4503)

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

* [examples] reduce memory consumption (#4558)

* [examples] reduce memory consumption

* refine help message

* refine

* [Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)

* Added cugraph nightly scripts

* Removed nvcr.io//nvidia/pytorch:22.04-py3 reference
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* Revert "[Feature][REVIEW] Enable DGL cugraph nightly CI (#4525)" (#4563)

This reverts commit ec171c64.

* [Misc] Add flake8 lint workflow. (#4566)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

* Add flake8 annotation in workflow.

* remove

* add

* clean up
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Try use official pylint workflow. (#4568)

* polish update_version

* update pylint workflow.

* add

* revert.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [CI] refine stage logic (#4565)

* [CI] refine stage logic

* refine

* refine

* remove (#4570)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Add Pylint workflow for flake8. (#4571)

* remove

* Add pylint.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Update the python version in Pylint workflow for flake8. (#4572)

* remove

* Add pylint.

* Change the python version for pylint.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4574)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Use another workflow. (#4575)

* Update pylint.

* Use another workflow.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4576)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint.yml

* Update pylint.yml

* Delete pylint.yml

* [Misc]Add pyproject.toml for autopep8 & black. (#4543)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454)

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>

* [Deprecation] Dataset Attributes (#4546)

* Update

* CI

* CI

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* [Example] Bug Fix (#4665)

* Update

* CI

* CI

* Update

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* Update
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>


11 Oct, 2022 1 commit

[Misc] ClangFormat auto fix. (#4685) · bd3fe59e

Hongzhi (Steve), Chen authored Oct 11, 2022



* Auto fix c++.

* reformat
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>


21 Sep, 2022 1 commit
- [Fix] Enable lint check for cuh files and fix compiler warnings (#4585) · 880b3b1f
  Xin Yao authored Sep 21, 2022
```
* disable warning for tensorpipe

* fix warning

* enable lint check for cuh files

* resolve comments
```
19 Sep, 2022 1 commit

[Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454) · cded5b80

Xin Yao authored Sep 19, 2022

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments


15 Sep, 2022 1 commit

[Feature] Import PyTorch's CUDA stream management (#4503) · 9a00cf19

Xin Yao authored Sep 15, 2022

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle


06 Sep, 2022 1 commit

[Feature] Unify the cuda stream used in core library (#4480) · 1c9d2a03

Chang Liu authored Sep 05, 2022



* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api
Co-authored-by: xiny <xiny@nvidia.com>


05 Sep, 2022 1 commit

[Bug] Enable turn on/off libxsmm at runtime (#4455) · 62af41c2

peizhou001 authored Sep 05, 2022



* enable turn on/off libxsmm at runtime by adding a global config and related API
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>


12 Aug, 2022 1 commit
- [Performance] Improve the performance of SpMMCsr by reconfiguration (#4363) · 2523bc7a
  Xin Yao authored Aug 12, 2022
```
* Change CUDA_MAX_NUM_THREADS to 256

* change the configuration of grid
```
09 Aug, 2022 1 commit
- [Bug] Fix broken static_assert (#4342) · 182e1ad5
  Xin Yao authored Aug 09, 2022
  
01 Aug, 2022 1 commit

[Feature] Enable UVA for Weighted Samplers (#4314) · 44b68641

Xin Yao authored Aug 01, 2022

* enable UVA for weighted neighbor sampler and biased random walk

* add unit tests

* fix for mxnet/tf

* fix typo


29 Jul, 2022 1 commit

[Feature] Add CUDA Weighted Neighborhood Sampling (#4064) · 86c81b4e

Xin Yao authored Jul 29, 2022



* add weighted sampling without replacement (A-Chao)

* improve Algorithm A-Chao with block-wise prefix sum

* correctly fill out_idxs

* implement weighted sampling with replacement

* small fix

* merge host-side code of weighted/uniform sampling

* enable unit tests for cuda weighted sampling

* move thrust/cub wrapper to the cmake file

* update docs accordingly

* fix linting

* fix linting

* fix unit test

* Bump external CUB/Thrust versions

* Fix code style and update description of algorithm design

* [Feature] GPU support weighted graph neighbor sampling
commit by pengqirong(OPPO)

* merge pengqirong's implementation

* revert the change to cub and thrust

* fix linting

* use DeviceSegmentedSort for better performance

* add more comments

* add necessary notes

* add necessary notes

* resolve some comments

* define THRUST_CUB_WRAPPED_NAMESPACE

* fix doc
Co-authored-by: 彭齐荣 <657017034@qq.com>


15 Jul, 2022 1 commit
- decompose (#4259) · 9a7ad16e
  Quan (Andy) Gan authored Jul 15, 2022
  
01 Jul, 2022 2 commits
- [BugFix] check whether etype sorted when sampling (#4198) · dcf16992
  Rhett Ying authored Jul 01, 2022
  
- [Feature] extend sort_csr/csc_by_tag to edge (#4164) · 6a6597a0
  Rhett Ying authored Jul 01, 2022
```
* [Feature] extend sort_csr/csc_by_tag to edge

* fix test failure in tensorflow

* refine sorting by edges

* fix docstring

* remove unnecessary mem
Co-authored-by: Xin Yao <xiny@nvidia.com>
```