- 05 Dec, 2023 1 commit
Muhammed Fatih BALIN authored
- 04 Nov, 2022 1 commit
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix.
* manual

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
- 19 Sep, 2022 1 commit
Xin Yao authored
* rename `DLContext` to `DGLContext`
* rename `kDLGPU` to `kDLCUDA`
* replace DLTensor with DGLArray
* fix linting
* Unify DGLType and DLDataType to DGLDataType
* Fix FFI
* rename DLDeviceType to DGLDeviceType
* decouple dlpack from the core library
* fix bug
* fix lint
* fix merge
* fix build
* address comments
* rename dl_converter to dlpack_convert
* remove redundant comments
- 20 Aug, 2021 1 commit
nv-dlasalle authored
* Implement range based NDArrayPartition
* Finish implementing range based partition support
* Add unit test
* Fix whitespace
* Add Kernel suffix
* Fix argument passing
* Add doxygen docs and improve variable naming
* Add unit test
* Add function for converting a partition book
* Add example to partition_op docs
* Fix dtype conversion for mxnet and tensorflow
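For context, a minimal sketch of the two partition modes this change covers. It assumes the class is exposed as `dgl.partition.NDArrayPartition` with an `(array_size, num_parts, mode, part_ranges)` signature, so treat the exact names and the `part_ranges` layout as assumptions rather than the commit's own example.

```python
# Sketch only: remainder-based vs. range-based partitioning of an array's rows
# across two workers. Names, signature, and part_ranges layout are assumed.
import torch as th
from dgl.partition import NDArrayPartition

num_rows, num_parts = 10, 2

# 'remainder' mode: row i belongs to partition i % num_parts.
remainder_part = NDArrayPartition(num_rows, num_parts, mode='remainder')

# 'range' mode (added here): ownership follows explicit contiguous ranges.
# Assumed layout: prefix-sum of rows per partition, i.e. rows [0, 6) -> part 0
# and rows [6, 10) -> part 1.
range_part = NDArrayPartition(num_rows, num_parts, mode='range',
                              part_ranges=th.tensor([0, 6, 10]))
```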
- 11 Jun, 2021 1 commit
nv-dlasalle authored
* Split from NCCL PR
* Fix type in comment
* Expand documentation for sparse_all_to_all_push
* Restore previous behavior in example
* Re-work optimizer to use NCCL based on gradient location
* Allow for running with embedding on CPU but using NCCL for gradient exchange
* Optimize single partition case
* Fix pylint errors
* Add missing include
* Fix gradient indexing
* Fix line continuation
* Migrate 'first_step'
* Skip tests without enough GPUs to run NCCL
* Improve empty tensor handling for pytorch 1.5
* Fix indentation
* Allow multiple NCCL communicators to coexist
* Improve handling of empty messages
* Update python/dgl/nn/pytorch/sparse_emb.py (Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>)
* Update python/dgl/nn/pytorch/sparse_emb.py (Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>)
* Keep empty tensors dimensionless
* th.empty -> th.tensor
* Preserve shape for empty non-zero dimension tensors
* Use shared state when embedding is shared
* Add support for gathering an embedding
* Fix typo
* Fix more typos
* Fix backend call
* Use NodeDataLoader to take advantage of DDP
* Update training script to share memory
* Only squeeze last dimension
* Better handle empty messages
* Keep embedding on the target GPU device if dgl_sparse is false in the RGCN example
* Fix typo in comment
* Add asserts
* Improve documentation in example

Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
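For orientation, a rough sketch of the user-facing path this optimizer rework targets: a `dgl.nn.NodeEmbedding` updated by `dgl.optim.SparseAdam` inside each DDP worker, with NCCL used for the sparse gradient exchange when the embedding lives on GPU. The data loading and model are placeholders, not code from this commit.

```python
# Minimal per-rank sketch; assumes it runs inside a torch.distributed (DDP)
# worker set up elsewhere. Node IDs and the "loss" below are placeholders.
import torch as th
import dgl
from dgl.optim import SparseAdam

def initializer(emb):
    th.nn.init.uniform_(emb, -0.05, 0.05)
    return emb

num_nodes, dim = 1_000_000, 128
emb = dgl.nn.NodeEmbedding(num_nodes, dim, name='node_emb', init_func=initializer)
opt = SparseAdam([emb], lr=0.01)

nids = th.arange(1024)                  # node IDs from the current mini-batch
feats = emb(nids, th.device('cuda:0'))  # gather the rows onto this rank's GPU
loss = feats.pow(2).mean()              # stand-in for the real model loss
loss.backward()
opt.step()                              # sparse update; gradients are exchanged
                                        # over NCCL when the embedding is on GPU
```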
- 20 May, 2021 1 commit
nv-dlasalle authored
[Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825)
* Split NCCL wrapper from sparse optimizer and sparse embedding
* Add more unit tests for single node nccl
* Fix unit test for tf
* Switch to device histogram
* Fix histogram issues
* Finish migration to histogram
* Handle cases with zero send/receive data
* Start on partition object
* Get compiling
* Updates
* Add unit tests
* Switch to partition object
* Fix linting issues
* Rename partition file
* Add python doc
* Fix python assert and finish doxygen comments
* Remove stubs for range based partition to satisfy pylint
* Wrap unit test in GPU only
* Wrap explicit cuda call in ifdef
* Merge with partition.py
* Update docstrings
* Cleanup partition_op
* Add Workspace object
* Switch to using workspace object
* Move last remainder based function out of nccl_api
* Add error messages
* Update docs with examples
* Fix linting errors

Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
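For reference, a heavily hedged sketch of how a wrapper like this is typically driven from Python. The module path `dgl.cuda.nccl`, the `UniqueId`/`Communicator` names, and the `sparse_all_to_all_push` signature are assumptions inferred from the commit description and may not match the merged code exactly.

```python
# Single-process sketch only; in real use every rank builds a Communicator over
# the same UniqueId (broadcast e.g. via torch.distributed) and pushes its sparse
# gradient rows to the ranks that own them according to the partition.
import torch as th
from dgl.cuda import nccl                   # assumed module path
from dgl.partition import NDArrayPartition

uid = nccl.UniqueId()                       # rank 0 creates and broadcasts this
comm = nccl.Communicator(1, 0, uid)         # assumed (world_size, rank, unique_id)

part = NDArrayPartition(8, 1, mode='remainder')
idx = th.arange(8, device='cuda')           # row indices with non-zero gradients
grads = th.ones(8, 16, device='cuda')       # the corresponding gradient rows

# Assumed call: returns the indices and gradient rows this rank receives.
recv_idx, recv_grads = comm.sparse_all_to_all_push(idx, grads, part)
```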