1. 21 Jan, 2022 1 commit
    • [Feature] Pin dgl.graph to the page-locked memory (#3616) · 40b44a43
      Xin Yao authored
      
      
      * implement pin_memory/unpin_memory/is_pinned for dgl.graph
      
      * update python docstring
      
      * update c++ docstring
      
      * add test
      
      * fix the broken UnifiedTensor
      
      * eliminate extra context parameter for pin/unpin
      
      * fix linting
      
      * fix typo
      
      * disable new format materialization for pinned graphs
      
      * update python doc for pin_memory_
      
      * fix unit test
      
      * update doc
      
      * change unitgraph and heterograph's PinMemory to in-place
      
      * update comments for NDArray's PinMemory_ and PinData
      
      * update doc
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
      40b44a43
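      Note: a minimal usage sketch of the in-place pinning API the commit above describes (pin_memory_/unpin_memory_/is_pinned on dgl.graph); exact behavior may vary across DGL versions.

          import dgl

          g = dgl.graph(([0, 1, 2], [1, 2, 3]))  # graph resides on CPU
          g.pin_memory_()         # pin the graph structure to page-locked memory, in-place
          assert g.is_pinned()
          # Per the commit, materializing new sparse formats is disabled while pinned,
          # so create any formats you need before pinning.
          g.unpin_memory_()       # unpin, also in-place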
  2. 07 Jan, 2022 1 commit
    • [Feature] Negative sampling (#3599) · 90f10b31
      Quan (Andy) Gan authored
      * first commit
      
      * a bunch of fixes
      
      * add unique
      
      * lint
      
      * lint
      
      * lint
      
      * address comments
      
      * Update negative_sampler.py
      
      * fix
      
      * description
      
      * address comments and fix
      
      * fix
      
      * replace unique with replace
      
      * test pylint
      
      * Update negative_sampler.py
      90f10b31
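      Note: a hedged sketch of the negative-sampler interface that negative_sampler.py implements: a callable mapping (graph, edge IDs) to corrupted (src, dst) pairs. Uniform shown here is one long-standing sampler of this shape; the exact class this PR adds may differ.

          import torch
          import dgl
          from dgl.dataloading import negative_sampler

          g = dgl.rand_graph(100, 500)
          sampler = negative_sampler.Uniform(5)   # 5 negatives per positive edge
          eids = torch.arange(10)
          neg_src, neg_dst = sampler(g, eids)     # tensors of corrupted endpoints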
  3. 04 Jan, 2022 1 commit
  4. 18 Oct, 2021 1 commit
  5. 15 Oct, 2021 1 commit
  6. 14 Oct, 2021 1 commit
    • [Bugfix] three bugs related to using DGL as a subdirectory (third_party) of another project (#3379) · 18863069
      zexi yuan authored
      * [Bugfix] fix a compile error for Debug-BuildType on Windows Platform
      
      When building the "Debug" build type via CMakeLists.txt on Windows, three compile errors (C4716) occur in "dgl\src\runtime\shared_mem.cc":
      
      'dgl::runtime::SharedMemory::CreateNew': must return a value
      'dgl::runtime::SharedMemory::Open': must return a value
      'dgl::runtime::SharedMemory::Exist': must return a value
      
      * [Bugfix] cmake error "cannot find load file" when DGL as a sub_directory on Linux
      
      When using DGL as a subdirectory in a CMake project, "CMAKE_SOURCE_DIR" resolves to the parent project's source directory, which is not the expected directory. It is better to use "CMAKE_CURRENT_SOURCE_DIR" to set "GKLIB_PATH".
      
      * [Bugfix] CMake command error when DGL is a subdirectory
      
      When DGL is a subdirectory of another project, the WORKING_DIRECTORY of the "add_custom_command" at line 255 of "CMakeLists.txt" is incorrect, which causes a CMake "setlocal" error.
      18863069
  7. 29 Sep, 2021 1 commit
  8. 28 Sep, 2021 1 commit
  9. 06 Sep, 2021 1 commit
  10. 01 Sep, 2021 1 commit
  11. 19 Aug, 2021 1 commit
  12. 08 Aug, 2021 1 commit
  13. 16 Jul, 2021 1 commit
  14. 08 Jul, 2021 1 commit
  15. 02 Jul, 2021 1 commit
  16. 27 Jun, 2021 1 commit
    • [Build] Make nccl optional (#3056) · 9664cdff
      Jinjing Zhou authored
      * fix
      
      * remove nvidia-smi
      
      * fix
      
      * fix docs
      
      * fix
      
      * fix
      
      * 1
      
      * fix
      
      * remove
      
      * skip deprecated kernel
      
      * fix
      
      * Revert "skip deprecated kernel"
      
      This reverts commit c5ceb7f60dbbaf065b81cc3680757fd611d90ad3.
      
      * fix
      9664cdff
  17. 23 Jun, 2021 1 commit
  18. 11 Jun, 2021 2 commits
    • Tomasz Patejko
    • [Feature] Allow using NCCL for communication in dgl.NodeEmbedding and dgl.SparseOptimizer (#2824) · 17d604b5
      nv-dlasalle authored
      
      
      * Split from NCCL PR
      
      * Fix type in comment
      
      * Expand documentation for sparse_all_to_all_push
      
      * Restore previous behavior in example
      
      * Re-work optimizer to use NCCL based on gradient location
      
      * Allow for running with embedding on CPU but using NCCL for gradient exchange
      
      * Optimize single partition case
      
      * Fix pylint errors
      
      * Add missing include
      
      * fix gradient indexing
      
      * Fix line continuation
      
      * Migrate 'first_step'
      
      * Skip tests without enough GPUs to run NCCL
      
      * Improve empty tensor handling for pytorch 1.5
      
      * Fix indentation
      
      * Allow multiple NCCL communicators to coexist
      
      * Improve handling of empty message
      
      * Update python/dgl/nn/pytorch/sparse_emb.py
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      
      * Update python/dgl/nn/pytorch/sparse_emb.py
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      
      * Keep empty tensor dimensionless
      
      * th.empty -> th.tensor
      
      * Preserve shape for empty non-zero dimension tensors
      
      * Use shared state, when embedding is shared
      
      * Add support for gathering an embedding
      
      * Fix typo
      
      * Fix more typos
      
      * Fix backend call
      
      * Use NodeDataLoader to take advantage of ddp
      
      * Update training script to share memory
      
      * Only squeeze last dimension
      
      * Better handle empty message
      
      * Keep embedding on the target GPU device if dgl_sparse is false in RGCN example
      
      * Fix typo in comment
      
      * Add asserts
      
      * Improve documentation in example
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      17d604b5
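      Note: a hedged sketch of the flow the commits above enable, one process per GPU inside a torchrun-style launched worker; the sparse optimizer uses NCCL for gradient exchange when the gradients live on the GPU, even if the embedding storage stays on CPU. Class and argument names follow later DGL releases and are assumptions for this exact commit.

          import torch
          import torch.distributed as dist
          from dgl.nn import NodeEmbedding
          from dgl.optim import SparseAdam

          dist.init_process_group('nccl')             # one process per GPU
          emb = NodeEmbedding(1000, 16, 'emb')        # storage may remain on CPU
          opt = SparseAdam([emb], lr=0.01)

          ids = torch.arange(32, device='cuda')
          vec = emb(ids, torch.device('cuda'))        # lookup onto the GPU
          vec.sum().backward()
          opt.step()   # gradients on GPU, so the exchange runs over NCCL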
  19. 20 May, 2021 1 commit
    • [Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings... · ae8dbe6d
      nv-dlasalle authored
      
      [Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825)
      
      * Split NCCL wrapper from sparse optimizer and sparse embedding
      
      * Add more unit tests for single node nccl
      
      * Fix unit test for tf
      
      * Switch to device histogram
      
      * Fix histogram issues
      
      * Finish migration to histogram
      
      * Handle cases with zero send/receive data
      
      * Start on partition object
      
      * Get compiling
      
      * Updates
      
      * Add unit tests
      
      * Switch to partition object
      
      * Fix linting issues
      
      * Rename partition file
      
      * Add python doc
      
      * Fix python assert and finish doxygen comments
      
      * Remove stubs for range based partition to satisfy pylint
      
      * Wrap unit test in GPU only
      
      * Wrap explicit cuda call in ifdef
      
      * Merge with partition.py
      
      * update docstrings
      
      * Cleanup partition_op
      
      * Add Workspace object
      
      * Switch to using workspace object
      
      * Move last remainder based function out of nccl_api
      
      * Add error messages
      
      * Update docs with examples
      
      * Fix linting errors
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      ae8dbe6d
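      Note: the partition object these commits introduce appears in later DGL releases as dgl.partition.NDArrayPartition; taking that as the landing point, a hedged sketch of what it expresses (the name and arguments are assumptions for this exact commit).

          from dgl.partition import NDArrayPartition

          # Map 1000 embedding rows onto 4 ranks by remainder (row_id % 4), so each
          # rank knows which peer owns a row when pushing sparse gradients via NCCL.
          part = NDArrayPartition(array_size=1000, num_parts=4, mode='remainder')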
  20. 27 Apr, 2021 1 commit
  21. 22 Mar, 2021 1 commit
  22. 09 Mar, 2021 1 commit
  23. 08 Feb, 2021 1 commit
    • [Sampling] Implement `dgl.to_block()` for the GPU (#2339) · bc3a532f
      nv-dlasalle authored
      
      
      * Add start of to_block gpu implementation
      
      * Pull in more changes from 0.4.2 cuda_to_block
      
      * Move more code to IdArray
      
      * Refactor DeviceNodeMapMaker
      
      * Updates
      
      * get compiling
      
      * Integrate to_block
      
      * Fix ID allocation
      
      * Minor fixes
      
      * Cleanup cuda calls to use cuda_common
      
      * Reduce kernel calls
      
      * Lint cleanup
      
      * Expand documentation
      
      * Remove unused function
      
      * Rename variables for consistency
      
      * Add doxygen comments
      
      * Fix file extension
      
      * Remove raw asynccopy for deviceapi
      
      * Remove unused function
      
      * Fix block/tile configuration
      
      * Add cuda_device_common.cuh
      
      * Add basic hashtable
      
      * Migrate part of hashtable
      
      * Refactor to use external hashtable
      
      * Make functions members
      
      * Format hash table functions
      
      * Migrate duplicate filling
      
      * Move last function over
      
      * Refactor with cu file
      
      * lint c++ code
      
      * Move context check to C++ code
      
      * Use macro switch
      
      * Add missing files
      
      * Update docstring
      
      * update docs
      
      * Move atomic functions
      
      * Refactor hashtable
      
      * Fix linting
      
      * Expand docs
      
      * Fix mismatched argument names
      
      * Switch doxygen comments from using @param to \param
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
      Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
      bc3a532f
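      Note: a hedged sketch of the sampling step this commit moves to the GPU; with the frontier graph and seed IDs on a CUDA device, dgl.to_block builds the block there instead of on the CPU (assumes a CUDA build of DGL; GPU neighbor sampling itself landed separately, and any frontier already on the GPU works).

          import torch
          import dgl

          g = dgl.rand_graph(1000, 5000).to(torch.device('cuda'))
          seeds = torch.arange(10, device='cuda')
          frontier = dgl.sampling.sample_neighbors(g, seeds, 5)
          block = dgl.to_block(frontier, dst_nodes=seeds)   # runs on the GPU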
  24. 29 Jan, 2021 1 commit
  25. 28 Jan, 2021 1 commit
  26. 27 Jan, 2021 1 commit
    • [Feature] Add support for sparse embedding (#2451) · a7e941c3
      xiang song(charlie.song) authored
      
      
      * Add sparse embedding for dgl and update rgcn example
      
      * upd
      
      * Fix
      
      * Revert "Fix"
      
      This reverts commit 4da87cdfb8b8c3506b7fc7376cd2385ba8045c2a.
      
      * Fix
      
      * upd
      
      * upd
      
      * Fix
      
      * Add unit test and update impl
      
      * fix
      
      * Clean up rgcn example code
      
      * upd
      
      * upd
      
      * update
      
      * Fix
      
      * update score
      
      * sparse for sage
      
      * remove model sparse
      
      * upd
      
      * upd
      
      * remove global norm
      
      * revert delete model_sparse.py
      
      * update according to comments
      
      * Fix doc
      
      * upd
      
      * Fix test
      
      * upd
      
      * lint
      
      * lint
      
      * lint
      
      * upd
      
      * upd
      
      * clean up
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-56-220.ec2.internal>
      a7e941c3
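      Note: a minimal sketch of the sparse-embedding idea this PR introduces: only the rows touched by a minibatch are looked up and updated. Names follow the API as exposed in later releases (dgl.nn.NodeEmbedding, dgl.optim.SparseAdagrad) and are assumptions for this exact commit.

          import torch
          from dgl.nn import NodeEmbedding
          from dgl.optim import SparseAdagrad

          emb = NodeEmbedding(1000, 16, 'node_emb')   # rows updated sparsely
          opt = SparseAdagrad([emb], lr=0.1)

          ids = torch.tensor([0, 5, 7])
          loss = emb(ids).sum()    # look up only the touched rows
          loss.backward()
          opt.step()               # update only those rows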
  27. 25 Jan, 2021 1 commit
    • [Distributed] Heterogeneous graph support (#2457) · 25ac3344
      Da Zheng authored
      
      
      * Distributed heterograph (#3)
      
      * heterogeneous graph partition.
      
      * fix graph partition book for heterograph.
      
      * load heterograph partitions.
      
      * update DistGraphServer to support heterograph.
      
      * make DistGraph runnable for heterograph.
      
      * partition a graph and store parts with homogeneous graph structure.
      
      * update DistGraph server&client to use homogeneous graph.
      
      * shuffle node Ids based on node types.
      
      * load mag in heterograph.
      
      * fix per-node-type mapping.
      
      * balance node types.
      
      * fix for homogeneous graph
      
      * store etype for now.
      
      * fix data name.
      
      * fix a bug in example.
      
      * add profiler in rgcn.
      
      * heterogeneous RGCN.
      
      * map homogeneous node ids to hetero node ids.
      
      * fix graph partition book.
      
      * fix DistGraph.
      
      * shuffle eids.
      
      * verify eids and their mappings when loading a partition.
      
      * Id map from homogeneous Ids to per-type Ids.
      
      * verify partitioned results.
      
      * add test for distributed sampler.
      
      * add mapping from per-type Ids to homogeneous Ids.
      
      * update example.
      
      * fix DistGraph.
      
      * Revert "add profiler in rgcn."
      
      This reverts commit 36daaed8b660933dac8f61a39faec3da2467d676.
      
      * add tests for homogeneous graphs.
      
      * fix a bug.
      
      * fix test.
      
      * fix for one partition.
      
      * fix for standalone training and evaluation.
      
      * small fix.
      
      * fix two bugs.
      
      * initialize projection matrix.
      
      * small fix on RGCN.
      
      * Fix rgcn performance (#17)
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-171.ec2.internal>
      
      * fix lint.
      
      * fix lint.
      
      * fix lint.
      
      * fix lint.
      
      * fix lint.
      
      * fix lint.
      
      * fix.
      
      * fix test.
      
      * fix lint.
      
      * test partitions.
      
      * remove redundant test for partitioning.
      
      * remove commented code.
      
      * fix partition.
      
      * fix tests.
      
      * fix RGCN.
      
      * fix test.
      
      * fix test.
      
      * fix test.
      
      * fix.
      
      * fix a bug.
      
      * update dmlc-core.
      
      * fix.
      
      * fix rgcn.
      
      * update readme.
      
      * add comments.
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-202.us-west-1.compute.internal>
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-132.us-west-1.compute.internal>
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-171.ec2.internal>
      
      * fix.
      
      * fix.
      
      * add div_int.
      
      * fix.
      
      * fix.
      
      * fix lint.
      
      * fix.
      
      * fix.
      
      * fix.
      
      * adjust.
      
      * move code.
      
      * handle heterograph.
      
      * return pytorch tensor in GPB.
      
      * remove some tests in example.
      
      * add to_block for distributed training.
      
      * use distributed to_block.
      
      * remove unnecessary function in DistGraph.
      
      * remove distributed to_block.
      
      * use pytorch tensor.
      
      * fix a bug in ntypes and etypes.
      
      * enable norm.
      
      * make the data loader compatible with the old format.
      
      * fix.
      
      * add comments.
      
      * fix a bug.
      
      * add test for heterograph.
      
      * support partition without reshuffle.
      
      * add test.
      
      * support partition without reshuffle.
      
      * fix.
      
      * add test.
      
      * fix bugs.
      
      * fix lint.
      
      * fix dataset.
      
      * fix for mxnet.
      
      * update docstring.
      
      * rename to floor_div
      
      * avoid exposing NodePartitionPolicy and EdgePartitionPolicy.
      
      * fix docstring.
      
      * fix error.
      
      * fixes.
      
      * fix comments.
      
      * rename.
      
      * rename.
      
      * explain IdMap.
      
      * fix docstring.
      
      * fix docstring.
      
      * update docstring.
      
      * remove the code of returning heterograph.
      
      * remove argument.
      
      * fix example.
      
      * make GraphPartitionBook an abstract class.
      
      * fix.
      
      * fix.
      
      * fix a bug.
      
      * fix a bug in example
      
      * fix a bug
      
      * reverse heterograph sampling.
      
      * temp fix.
      
      * fix lint.
      
      * Revert "temp fix."
      
      This reverts commit c450717b9f578b8c48769c675f2a19d6c1e64381.
      
      * compute norm.
      
      * Revert "reverse heterograph sampling."
      
      This reverts commit bd6deb7f52998de76508f800441ff518e2fadcb9.
      
      * fix.
      
      * move id_map.py
      
      * remove check
      
      * add more comments.
      
      * update docstring.
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-202.us-west-1.compute.internal>
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-132.us-west-1.compute.internal>
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
      Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-171.ec2.internal>
      25ac3344
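      Note: a hedged sketch of the user-facing flow the commits above build up: partition a heterograph offline (per-type ID shuffling, homogeneous storage plus per-type ID maps), then serve it through DistGraph. File and graph names here are illustrative.

          import dgl
          from dgl.distributed import partition_graph

          g = dgl.heterograph({
              ('user', 'follows', 'user'): ([0, 1], [1, 2]),
              ('user', 'rates', 'item'): ([0, 2], [0, 1]),
          })
          # Offline step: store 2 parts as homogeneous structure + per-type ID maps.
          partition_graph(g, graph_name='toy', num_parts=2, out_path='parts')

          # Online step, inside a launched trainer process:
          # dgl.distributed.initialize('ip_config.txt')
          # dist_g = dgl.distributed.DistGraph('toy', part_config='parts/toy.json')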
  28. 14 Jan, 2021 1 commit
  29. 26 Dec, 2020 2 commits
  30. 25 Dec, 2020 1 commit
    • [Performance] Use allocator from PyTorch if possible (#2328) · 9a7235fa
      Quan (Andy) Gan authored
      * first commit
      
      * some thoughts
      
      * move around
      
      * more commit
      
      * more fixes
      
      * now it uses torch allocator
      
      * fix symbol export error
      
      * fix
      
      * fixes
      
      * test fix
      
      * add script
      
      * building separate library per version
      
      * fix for vs2019
      
      * more fixes
      
      * fix on windows build
      
      * update jenkinsfile
      
      * auto copy built dlls for windows
      
      * lint and installation guide update
      
      * fix
      
      * specify conda environment
      
      * set environment for ci
      
      * fix
      
      * fix
      
      * fix
      
      * fix again
      
      * revert
      
      * fix cmake
      
      * fix
      
      * switch to using python interpreter path
      
      * remove scripts
      
      * debug
      
      * oops sorry
      
      * Update index.rst
      
      * Update index.rst
      
      * copies automatically, no need for this
      
      * do not print message if library not found
      
      * tiny fixes
      
      * debug on nightly
      
      * replace add_compile_definitions to make CMake 3.5 happy
      
      * fix linking to wrong lib for multiple pytorch envs
      
      * changed building strategy
      
      * fix nightly
      
      * fix windows
      
      * fix windows again
      
      * setup bugfix
      
      * address comments
      
      * change README
      9a7235fa
  31. 02 Nov, 2020 1 commit
  32. 10 Sep, 2020 2 commits
  33. 30 Aug, 2020 1 commit
  34. 11 Aug, 2020 1 commit
  35. 09 Aug, 2020 1 commit
  36. 28 Jul, 2020 1 commit
  37. 28 Jun, 2020 1 commit
    • [CUDA][Kernel] More CUDA kernels; Standardize the behavior for sorted COO/CSR (#1704) · 870da747
      Minjie Wang authored
      * add cub; array cumsum
      
      * CSRSliceRows
      
      * fix warning
      
      * operator << for ndarray; CSRSliceRows
      
      * add CSRIsSorted
      
      * add csr_sort
      
      * in-place COO sort and out-of-place CSR sort
      
      * WIP: coo is sorted
      
      * mv cuda_utils
      
      * add AllTrue utility
      
      * csr sort
      
      * coo sort
      
      * coo2csr for sorted coo arrays
      
      * CSRToCOO from sorted
      
      * pass tests for the new kernel changes
      
      * cannot use inplace sort
      
      * lint
      
      * try fix msvc error
      
      * Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC
      
      * stash
      
      * revert some hack
      
      * revert some changes
      
      * address comments
      
      * fix
      
      * fix to_block unittest
      
      * add todo note
      870da747
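      Note: the sorted-COO/CSR standardization above is mostly internal, but it surfaces through format conversion; a small sketch (exact sorting behavior is hedged by DGL version).

          import dgl

          g = dgl.rand_graph(100, 400)
          csr_g = g.formats('csr')   # restrict/convert to CSR; the backend may sort
                                     # column indices per row and record that fact
          print(csr_g.formats())     # shows which sparse formats are materialized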