Commits · 9a00cf194fcf994b2527cd927d691144f5e9c47b · OpenDAS / dgl

15 Sep, 2022 1 commit

[Feature] Import PyTorch's CUDA stream management (#4503) · 9a00cf19

Xin Yao authored Sep 15, 2022

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

9a00cf19

06 Sep, 2022 1 commit

[Feature] Unify the cuda stream used in core library (#4480) · 1c9d2a03

Chang Liu authored Sep 05, 2022



* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api

Co-authored-by: xiny <xiny@nvidia.com>

1c9d2a03

27 Jun, 2022 1 commit

[Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c

ndickson-nvidia authored Jun 27, 2022

* Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
* Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
* Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas

* Added more specific error messages for unimplemented FP16 specializations of Xgeam, CSRGEMM, and CSRGEAM

* Added missing instantiation of DLDataTypeTraits<__half>::dtype

* Fixed linter error
* Added clearer comment explaining why the cast to long long is necessary

* Worked around a compile error in some particular setup, where __half can't be constructed on the host side

* Fixed linter formatting errors

* Changes to comments as recommended

* Made recommended changes to logging errors in FP16 specializations
* Also changed the existing Xgeam function for unsupported data types from LOG(INFO) to LOG(FATAL)

a5d8460c

04 Nov, 2021 1 commit

[Feature] aten::Relabel_() for the GPU (#3445) · d3ae7544

Xin Yao authored Nov 04, 2021



* relabel gpu

* unittest for Relabel_ on the GPU

* finish Relabel_ for the GPU

* copyright

* re-enable the unittest for edge_subgraph on the GPU

* fix unittest for tensorflow

* use a fixed number of threads

Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

d3ae7544
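
For readers unfamiliar with what `Relabel_` does, the following is a minimal CPU-side sketch in plain Python, not the GPU kernel added in this commit. It assumes the operation rewrites the given ID arrays in place to consecutive integers and returns the mapping from new ID to old ID. The function name `relabel_` and the first-appearance ordering of new IDs are assumptions made for illustration; the actual kernel may assign IDs in a different order.

```python
from typing import Dict, List


def relabel_(arrays: List[List[int]]) -> List[int]:
    """Relabel the IDs in `arrays` in place to 0..k-1.

    Returns a list where entry `new_id` holds the original ID.
    (Hypothetical helper; ordering by first appearance is an assumption.)
    """
    old_to_new: Dict[int, int] = {}
    new_to_old: List[int] = []
    for arr in arrays:
        for i, old_id in enumerate(arr):
            if old_id not in old_to_new:
                # First time this ID is seen: give it the next free label.
                old_to_new[old_id] = len(new_to_old)
                new_to_old.append(old_id)
            arr[i] = old_to_new[old_id]
    return new_to_old


# The mapping is built from the union of all arrays passed in.
src = [10, 42, 10]
dst = [42, 7, 7]
mapping = relabel_([src, dst])
```

After the call, `src` and `dst` hold the compact labels and `mapping[new_id]` recovers the original ID.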

27 Apr, 2021 1 commit

[Feature] Add cuda support for Sparse Matrix multiplication, summation and masking (#2782) · ab2bd1f1

Israt Nisa authored Apr 27, 2021



* init cuda support

* cuSPARSE err

* passed unittest for csr_mm/SpGEMM. int64 not supported

* Debugging cuSPARSE error 3

* csrgeam only supports int32?

* disabling int64 for cuda

* refactor and add CSRMask

* lint

* oops

* remove todo

* rewrite CSRMask with CSRGetData

* lint

* fix test

* address comments

* lint

* fix

* addresses comments and rename BUG_ON

Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-30-71.ec2.internal>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

ab2bd1f1

10 Sep, 2020 1 commit

[hotfix] Skip CUDA kernel launch when number of blocks/threads is zero. (#2144) · 2c04ecb5

Zihao Ye authored Sep 10, 2020

* upd

* upd

* upd

* upd

* lint

* upd

* upd

* fmt

Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

2c04ecb5

30 Jul, 2020 1 commit

[CUDA][Kernel] A bunch of int64 kernels for COO and CSR (#1883) · f4608c22

Minjie Wang authored Jul 30, 2020

* COO sort

* COOToCSR

* CSR2COO

* CSRSort; CSRTranspose

* pass all CSR tests

* lint

* remove int32 conversion

* fix tensorflow nn tests

* turn on CI

* fix

* address comments

f4608c22

28 Jun, 2020 1 commit

[CUDA][Kernel] More CUDA kernels; Standardize the behavior for sorted COO/CSR (#1704) · 870da747

Minjie Wang authored Jun 28, 2020

* add cub; array cumsum

* CSRSliceRows

* fix warning

* operator << for ndarray; CSRSliceRows

* add CSRIsSorted

* add csr_sort

* inplace coosort and outplace csrsort

* WIP: coo is sorted

* mv cuda_utils

* add AllTrue utility

* csr sort

* coo sort

* coo2csr for sorted coo arrays

* CSRToCOO from sorted

* pass tests for the new kernel changes

* cannot use inplace sort

* lint

* try fix msvc error

* Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC

* stash

* revert some hack

* revert some changes

* address comments

* fix

* fix to_block unittest

* add todo note

870da747
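
The "coo2csr for sorted coo arrays" item above relies on a standard property: when a COO matrix is already sorted by row, the column array can be reused unchanged as the CSR `indices`, and only the row pointer array has to be built. The sketch below shows that idea in plain Python. It is an illustration under that assumption, not the CUDA kernel from this commit, and `sorted_coo_to_csr` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def sorted_coo_to_csr(
    num_rows: int, row: Sequence[int], col: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Convert a row-sorted COO matrix to CSR (indptr, indices)."""
    if any(row[i] > row[i + 1] for i in range(len(row) - 1)):
        raise ValueError("row array must be sorted in non-decreasing order")

    # Count the non-zeros of each row, stored one slot to the right.
    indptr = [0] * (num_rows + 1)
    for r in row:
        indptr[r + 1] += 1

    # Prefix sum turns the counts into row start offsets.
    for r in range(num_rows):
        indptr[r + 1] += indptr[r]

    # Rows are already grouped, so columns need no reordering.
    return indptr, list(col)


indptr, indices = sorted_coo_to_csr(4, [0, 0, 2, 3], [1, 3, 0, 2])
```

Here row 1 is empty, so its start and end offsets in `indptr` coincide. An unsorted COO input would need a sort (or a scatter pass) first, which this sketch deliberately rejects.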

19 Jun, 2020 1 commit

[CUDA] Many CUDA operators; Prepare for DGLGraph on CUDA (#1660) · f1b19a6b

Minjie Wang authored Jun 19, 2020

* add cuda utils; change g.to; add g.device

* split array.h into several headers

* cuda index select

* file

* three cuda kernels

* add cuda elementwise arith and several others

* cuda CSRIsNonZero

* fix lint

* lint

* lint

* fix bug in changing ctx to property

* address comments

* remove unused codes

* address comments

f1b19a6b

15 Jun, 2020 1 commit

[Kernel] CUDA CSR2COO COOSort COO2CSR (#1620) · d6d517bb

Minjie Wang authored Jun 15, 2020



* add cuda source

* moving codes from kernel2 branch

* operator overloading

* Better error message for unsupported device

* fix c tests

* coo sort using cusparse

* move test_rpc to distributed

* lint

* address comments and add utests

Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Chao Ma <mctt90@gmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>

d6d517bb