Commits · b1ec112eeb0e0633e57d89a60f6f80322cff0028 · OpenDAS / dgl

16 Feb, 2023 1 commit

[bugfix] Fix assertions in /src/runtime/workspace.h and expand unit tests for sparse optimizer (#5299) · 2bbca12a

nv-dlasalle authored Feb 15, 2023

* Fix assertions for size 0 workspaces

* Expand unit test to cover case of communication

* Fixes

* Format

* Fix c++ formatting

2bbca12a

09 Dec, 2022 1 commit

[Bugfix] Fix empty tensors being treated as pinned (#5005) · aad3bd04

Xin Yao authored Dec 09, 2022

* fix empty tensor is treated as pinned

* avoid calling cudaHostGetDevicePointer on nullptr

* update empty array

* add a comment

aad3bd04

22 Nov, 2022 1 commit

[Feature] (La)yer-Neigh(bor) sampling implementation (#4668) · bf264d00

Muhammed Fatih BALIN authored Nov 21, 2022



* adding LABOR sampling

* add ladies and pladies samplers

* fix compile error after rebase

* add reference for ladies sampler

* Improve ladies implementation.

* weighted labor sampling initial implementation draft
fix indentation and small bug in ladies script

* importance_sampling currently doesn't work with weights

* fix weighted importance sampling

* move labor example into its own folder

* lint fixes

* Improve documentation

* remove examples from the main PR

* fix linting by not using c++17 features

* fix documentation of labor_sampler.py

* update documentation for labor.py

* reformat the labor.py file with black

* fix linting errors

* replace exception use with if

* fix typo in error comment

* fixing win64 build for ci

* fixing weighted implementation, works now.

* fix bug in the weighted case and importance_sampling==0

* address part of the reviews

* remove unused code paths from cuda

* remove unused code path from cpu side

* remove extra features of labor making use of random seed.

* fix exclude_edges bug

* remove pcg and seed logic from cpu implementation, seed logic should still work for cuda.

* minor style change

* refactor CPU implementation, take out the importance_sampling probability computation into a function.

* improve CUDAWorkspaceAllocator

* refactor importance_sampling part out to a function

* minor optimization

* fix linting issue

* Revert "remove pcg and seed logic from cpu implementation, seed logic should still work for cuda."

This reverts commit c250e07ac6d7e13f57e79e8a2c2f098d777378c2.

* Revert "remove extra features of labor making use of random seed."

This reverts commit 7f99034353080308f4783f27d9a08bea343fb796.

* fix the documentation

* disable NIDs

* improve the documentation in the code

* use the stream argument in pcg32 instead of skipping ahead t times, can discard the use of hashmap now since it is faster this way.

* fix linting issue

* address another round of reviews

* further optimize CPU LABOR sampling implementation

* fix linting error

* update the comment

* reformat

* rename and rephrase comment

* fix formatting according to new linting specs

* fix compile error due to renaming, fix linting.

* lint

* rename DGLHeteroGraph to DGLGraph to match master

* replace other occurrences of DGLHeteroGraph with DGLGraph
Co-authored-by: Muhammed Fatih BALIN <m.f.balin@gmail.com>
Co-authored-by: Kaan Sancak <kaansnck@gmail.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

bf264d00

10 Nov, 2022 1 commit

[Bugfix] Fix half-precision SpMM producing incorrect results (#4842) · a8f9d5ef

Xin Yao authored Nov 10, 2022

* update accumulator

* rename half to __half

* add bfloat16

* simplify code

* fix another case

* add unit test

* disable half-precision SpMMCoo

* fix lint

a8f9d5ef

07 Nov, 2022 3 commits

[Misc] clang-format auto fix. (#4831) · 889798fe

Hongzhi (Steve), Chen authored Nov 07, 2022



* [Misc] clang-format auto fix.

* blabla

* nolint

* blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

889798fe

[Misc] clang-format auto fix. (#4824) · 8ac27dad

Hongzhi (Steve), Chen authored Nov 07, 2022



* [Misc] clang-format auto fix.

* blabla

* ablabla

* blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

8ac27dad

[Misc] Replace /*! with /**. (#4823) · bcd37684

Hongzhi (Steve), Chen authored Nov 07, 2022



* replace

* blabla

* balbla

* blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

bcd37684

06 Nov, 2022 2 commits

[Misc] Replace \xxx with @XXX in structured comment. (#4822) · 619d735d

Hongzhi (Steve), Chen authored Nov 07, 2022



* param

* brief

* note

* return

* tparam

* brief2

* file

* return2

* return

* blabla

* all
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

619d735d

[Feature] Add bfloat16 (bf16) support (#4648) · 96297fb8

Xin Yao authored Nov 06, 2022

* add bf16 specializations

* remove SWITCH_BITS

* enable amp for bf16

* remove SWITCH_BITS for cpu kernels

* enable bf16 based on CUDART

* fix compiling for sm<80

* fix cpu build

* enable unit tests

* update doc

* disable test for CUDA < 11.0

* address comments

* address comments

96297fb8

04 Nov, 2022 2 commits

[Misc] clang-format auto fix. (#4811) · 401e1278

Hongzhi (Steve), Chen authored Nov 04, 2022



* [Misc] clang-format auto fix.

* fix

* manual
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

401e1278

[Misc] clang-format auto fix. (#4812) · 33a2d9e1

Hongzhi (Steve), Chen authored Nov 04, 2022



* [Misc] clang-format auto fix.

* manual
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

33a2d9e1

28 Oct, 2022 1 commit

[Sampling] Enable sampling with edge masks on homogeneous graph (#4748) · 72781efb

Quan (Andy) Gan authored Oct 28, 2022

* sample neighbors with masks

* oops

* refactor again

* remove

* remove debug code

* rename macro

* address comments

* address comment

* address comments

* rename a lot of stuff

* oops

72781efb

21 Sep, 2022 1 commit

[Fix] Enable lint check for cuh files and fix compiler warnings (#4585) · 880b3b1f

Xin Yao authored Sep 21, 2022

* disable warning for tensorpipe

* fix warning

* enable lint check for cuh files

* resolve comments

880b3b1f

19 Sep, 2022 1 commit

[Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454) · cded5b80

Xin Yao authored Sep 19, 2022

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments

cded5b80

15 Sep, 2022 1 commit

[Feature] Import PyTorch's CUDA stream management (#4503) · 9a00cf19

Xin Yao authored Sep 15, 2022

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

9a00cf19

06 Sep, 2022 1 commit

[Feature] Unify the cuda stream used in core library (#4480) · 1c9d2a03

Chang Liu authored Sep 05, 2022



* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api
Co-authored-by: xiny <xiny@nvidia.com>

1c9d2a03

05 Sep, 2022 1 commit

[Bug] Enable turning libxsmm on/off at runtime (#4455) · 62af41c2

peizhou001 authored Sep 05, 2022



* enable turning libxsmm on/off at runtime by adding a global config and related API
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

62af41c2

31 Aug, 2022 1 commit

[Feature] Make TensorAdapter Stream Aware (#4472) · 2b766740

Xin Yao authored Aug 31, 2022

* Allocate tensors in DGL's current stream

* make tensoradapter stream-aware

* replace TAempty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

2b766740

23 Aug, 2022 1 commit

fix unpinning when tensoradapter is not available (#4450) · 1947d87d

Xin Yao authored Aug 23, 2022

1947d87d

18 Aug, 2022 1 commit

[Feature] Rework Dataloader cpu affinitization as helper method (#4126) · 47993776

Daniil Sizov authored Aug 18, 2022



* Add helper method for temporary affinitization of compute threads

* Rework DL affinitization as single helper

* Add example usage in benchmarks

* Fix python linter warnings

* Fix affinity helper params

* Use NUMA node 0 cores only by default

* Fix benchmarks

* Fix lint errors
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

47993776

15 Aug, 2022 1 commit

[Bugfix] Fix pinning empty tensors and graphs (#4393) · 3685000a

Xin Yao authored Aug 15, 2022

3685000a

09 Jul, 2022 1 commit

[Bugfix] Add CUDA context availability check before setting curand seed (#4223) · 1feec870

Xin Yao authored Jul 09, 2022

1feec870

07 Jul, 2022 1 commit

[Performance] Redirect `AllocWorkspace` to PyTorch's allocator if available (#4199) · 9ee7ced5

Xin Yao authored Jul 07, 2022

9ee7ced5

29 Jun, 2022 1 commit

[bugfix] Allow communicators of size one when NCCL is missing (#3713) · 1dddaad4

nv-dlasalle authored Jun 28, 2022



* Update nccl communicator for when NCCL is missing

* Use static_cast

* Add doc string

* Fix whitespace

* Restrict unit test to GPU runs
Co-authored-by: Xin Yao <xiny@nvidia.com>

1dddaad4

27 Jun, 2022 2 commits

[Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c

ndickson-nvidia authored Jun 27, 2022

* * Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
* Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
* Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas

* * Added more specific error messages for unimplemented FP16 specializations of Xgeam, CSRGEMM, and CSRGEAM

* * Added missing instantiation of DLDataTypeTraits<__half>::dtype

* * Fixed linter error
* Added clearer comment explaining why the cast to long long is necessary

* * Worked around a compile error in some particular setup, where __half can't be constructed on the host side

* * Fixed linter formatting errors

* * Changes to comments as recommended

* * Made recommended changes to logging errors in FP16 specializations
* Also changed the existing Xgeam function for unsupported data types from LOG(INFO) to LOG(FATAL)

a5d8460c

[BugFix] fix rpc-related build issue on mac OS (#4168) · 10db5d0b

Rhett Ying authored Jun 27, 2022

* [BugFix] fix rpc-related build issue on mac OS

* add warning message

* add warning message

10db5d0b

23 Jun, 2022 1 commit

[Bugfix][Rework] Automatically unpin tensors pinned by DGL (rework #3997) (#4135) · 077e002f

Xin Yao authored Jun 23, 2022



* Explicitly unpin tensoradapter allocated arrays

* Undo unrelated change

* Add unit test

* update unit test

* add pinned_by_dgl flag to NDArray::Container

* use dgl.ndarray for holding the pinning status

* update multi-gpu uva inference

* reinterpret cast NDArray::Container* to DLTensor* in MoveAsDLTensor

* update unpin column and examples

* add unit test for unpin column
Co-authored-by: Dominique LaSalle <dlasalle@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>

077e002f

11 Jun, 2022 1 commit

[Fix] Wrap all CUDA runtime API/CUB calls with macro (#4083) · 60b1c992

Xin Yao authored Jun 11, 2022



* Wrap all CUDA runtime API/CUB calls with macro

* remove the usage of explicit cudaMalloc in favor of AllocWorkspace

* fix typo
Co-authored-by: Israt Nisa <neesha295@gmail.com>

60b1c992

08 Jun, 2022 1 commit

[Dist] enable time out when fetching msg (#4043) · cac3720b

Rhett Ying authored Jun 08, 2022

* [Dist] enable time out when fetching msg

* fix lint error

* minor refinements

* improve minor log

* fix dist test

* fix timeout issue in tensorpipe

cac3720b

06 Jun, 2022 1 commit

wrap all cuda kernel calls with macro (#4066) · 6014623d

Xin Yao authored Jun 06, 2022


Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>

6014623d

28 May, 2022 2 commits

Change warning message for tensoradapter when not found (#4055) · 9922f41f

Quan (Andy) Gan authored May 29, 2022

* change warning message

* Update tensordispatch.cc

9922f41f

Revert "[bugfix] Explicitly unpin tensoradapter allocated arrays (#3997)" (#4061) · 00c09b9f

Quan (Andy) Gan authored May 28, 2022

This reverts commit fdd1fe19.

00c09b9f

16 May, 2022 1 commit

[bugfix] Explicitly unpin tensoradapter allocated arrays (#3997) · fdd1fe19

nv-dlasalle authored May 16, 2022

* Explicitly unpin tensoradapter allocated arrays

* Undo unrelated change

* Add unit test

* update unit test

fdd1fe19

12 May, 2022 1 commit

Fix launch parameters index select kernel in sparse push (#3524) · 4177f729

nv-dlasalle authored May 12, 2022

4177f729

05 Apr, 2022 1 commit

[Examples] Update graphsage multi-gpu example to use multiple GPUs for validation and testing. (#3827) · 27a6eb56

nv-dlasalle authored Apr 05, 2022

* Update graphsage multi-gpu example to use multiple GPUs for validation and testing.

* Remove argmax

* Fix rebase error

* Add more documentation to example and simplify

* Switch to named shared memory

* Add comment about how training is distributed

* Restore iteration count

* fix munmap error reporting for better error messages
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

27a6eb56

21 Feb, 2022 1 commit

[Bugfix] Bug fixes in new dataloader (#3727) · 3f138eba

Quan (Andy) Gan authored Feb 22, 2022



* fixes

* fix

* more fixes

* update

* oops

* lint?

* temporarily revert - will fix in another PR

* more fixes

* skipping mxnet test

* address comments

* fix DDP

* fix edge dataloader exclusion problems

* stupid bug

* fix

* use_uvm option

* fix

* fixes

* fixes

* fixes

* fixes

* add evaluation for cluster gcn and ddp

* stupid bug again

* fixes

* move sanity checks to only support DGLGraphs

* pytorch lightning compatibility fixes

* remove

* poke

* more fixes

* fix

* fix

* disable test

* docstrings

* why is it getting a memory leak?

* fix

* update

* updates and temporarily disable forkingpickler

* update

* fix?

* fix?

* oops

* oops

* fix

* lint

* huh

* uh

* update

* fix

* made it memory efficient

* refine exclude interface

* fix tutorial

* fix tutorial

* fix graph duplication in CPU dataloader workers

* lint

* lint

* Revert "lint"

This reverts commit 805484dd553695111b5fb37f2125214a6b7276e9.

* Revert "lint"

This reverts commit 0bce411b2b415c2ab770343949404498436dc8b2.

* Revert "fix graph duplication in CPU dataloader workers"

This reverts commit 9e3a8cf34c175d3093c773f6bb023b155f2bd27f.
Co-authored-by: xiny <xiny@nvidia.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

3f138eba

18 Feb, 2022 1 commit

DGL Enter (#3690) · 539335ce

Jinjing Zhou authored Feb 18, 2022



* add

* fix

* fix

* fix

* fix

* add

* add

* fix

* fix

* fix

* new loader

* fix

* fix

* fix for 3.6

* fix

* add

* add recipes and also some bug fixes

* fix

* fix

* fix

* fix recipes

* allow AsNodeDataset to work on ogb

* add ut

* many fixes for nodepred-ns pipeline

* recipe for nodepred-ns

* Update enter/README.md
Co-authored-by: Zihao Ye <zihaoye.cs@gmail.com>

* fix layers

* fix

* fix

* fix

* fix

* fix multiple issues

* fix for citation2

* fix comment

* fix

* fix

* clean up

* fix
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Minjie Wang <minjie.wang@nyu.edu>
Co-authored-by: Zihao Ye <zihaoye.cs@gmail.com>

539335ce

09 Feb, 2022 1 commit

[Feature] CUDA UVA sampling for MultiLayerNeighborSampler (#3674) · 738e8318

Xin Yao authored Feb 09, 2022



* implement pin_memory/unpin_memory/is_pinned for dgl.graph

* update python docstring

* update c++ docstring

* add test

* fix the broken UnifiedTensor

* XPU_SWITCH for kDLCPUPinned

* a rough version ready for testing

* eliminate extra context parameter for pin/unpin

* update train_sampling

* fix linting

* fix typo

* multi-gpu uva sampling case

* disable new format materialization for pinned graphs

* update python doc for pin_memory_

* fix unit test

* UVA sampling for link prediction

* dispatch most csr ops

* update graphsage example to combine uva sampling and UnifiedTensor

* update graphsage example to combine uva sampling and UnifiedTensor

* update graphsage example to combine uva sampling and UnifiedTensor

* update doc

* update examples

* change unitgraph and heterograph's PinMemory to in-place

* update examples for multi-gpu uva sampling

* update doc

* fix linting

* fix cpu build

* fix is_pinned for DistGraph

* fix is_pinned for DistGraph

* update graphsage unsupervised example

* update doc for gpu sampling

* update some check for sampling device switching

* fix linting

* adapt for new dataloader

* fix linting

* fix

* fix some name issue

* adjust device check

* add unit test for uva sampling & fix some zero_copy bug

* fix linting

* update num_threads in graphsage examples
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

738e8318

21 Jan, 2022 1 commit

[Feature] Pin dgl.graph to the page-locked memory (#3616) · 40b44a43

Xin Yao authored Jan 21, 2022



* implement pin_memory/unpin_memory/is_pinned for dgl.graph

* update python docstring

* update c++ docstring

* add test

* fix the broken UnifiedTensor

* eliminate extra context parameter for pin/unpin

* fix linting

* fix typo

* disable new format materialization for pinned graphs

* update python doc for pin_memory_

* fix unit test

* update doc

* change unitgraph and heterograph's PinMemory to in-place

* update comments for NDArray's PinMemory_ and PinData

* update doc
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

40b44a43

07 Jan, 2022 1 commit

[Feature] Negative sampling (#3599) · 90f10b31

Quan (Andy) Gan authored Jan 07, 2022

* first commit

* a bunch of fixes

* add unique

* lint

* lint

* lint

* address comments

* Update negative_sampler.py

* fix

* description

* address comments and fix

* fix

* replace unique with replace

* test pylint

* Update negative_sampler.py

90f10b31