- 06 Nov, 2022 1 commit
Hongzhi (Steve), Chen authored
* param * brief * note * return * tparam * brief2 * file * return2 * return * blabla * all Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
- 04 Nov, 2022 1 commit
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix. * fix * manual Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
- 19 Sep, 2022 1 commit
Xin Yao authored
* rename `DLContext` to `DGLContext` * rename `kDLGPU` to `kDLCUDA` * replace DLTensor with DGLArray * fix linting * Unify DGLType and DLDataType to DGLDataType * Fix FFI * rename DLDeviceType to DGLDeviceType * decouple dlpack from the core library * fix bug * fix lint * fix merge * fix build * address comments * rename dl_converter to dlpack_convert * remove redundant comments
- 15 Sep, 2022 1 commit
Xin Yao authored
* add set_stream * add .record_stream for NDArray and HeteroGraph * refactor dgl stream Python APIs * test record_stream * add unit test for record stream * use pytorch's stream * fix lint * fix cpu build * address comments * address comments * add record stream tests for dgl.graph * record frames and update dataloader * add docstring * update frame * add backend check for record_stream * remove CUDAThreadEntry::stream * record stream for newly created formats * fix bug * fix cpp test * fix None c_void_p to c_handle
- 06 Sep, 2022 1 commit
Chang Liu authored
* Use an internal cuda stream for CopyDataFromTo * small fix white space * Fix to compile * Make stream optional in copydata for compile * fix lint issue * Update cub functions to use internal stream * Lint check * Update CopyTo/CopyFrom/CopyFromTo to use internal stream * Address comments * Fix backward CUDA stream * Avoid overloading CopyFromTo() * Minor comment update * Overload copydatafromto in cuda device api Co-authored-by: xiny <xiny@nvidia.com>
- 29 Jun, 2022 1 commit
nv-dlasalle authored
* Update nccl communicator for when NCCL is missing * Use static_cast * Add doc string * Fix whitespace * Restrict unit test to GPU runs Co-authored-by: Xin Yao <xiny@nvidia.com>
- 11 Jun, 2022 1 commit
Xin Yao authored
* Wrap all CUDA runtime API/CUB calls with macro * remove the usage of explicit cudaMalloc in favor of AllocWorkspace * fix typo Co-authored-by: Israt Nisa <neesha295@gmail.com>
- 06 Jun, 2022 1 commit
Xin Yao authored
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com> Co-authored-by: Israt Nisa <neesha295@gmail.com>
- 12 May, 2022 1 commit
nv-dlasalle authored
- 18 Oct, 2021 1 commit
nv-dlasalle authored
- 15 Oct, 2021 1 commit
David Min authored
* Add pytorch-direct version * remove * add documentation for UnifiedTensor * Revert "add documentation for UnifiedTensor" This reverts commit 63ba42644d4aba197c1cb4ea4b85fa1bc43b8849. * add boundary check for UVM IndexSelect * relocate boundary check index kernels to cuda * fix function name * fix indexkernel in nccl api * fix argument ordering * simplify code * Add a comment for the uvm version Co-authored-by: shhssdm <shhssdm@gmail.com> Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com> Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
- 06 Sep, 2021 1 commit
Jinjing Zhou authored
* remove * remove * fix * remove * remove
- 27 Jun, 2021 1 commit
Jinjing Zhou authored
* fix * remove nvidiasmi * fix * fix docs * fix * fix * 1 * fix * remove * skip deprecated kernel * fix * Revert "skip deprecated kernel" This reverts commit c5ceb7f60dbbaf065b81cc3680757fd611d90ad3. * fix
- 23 Jun, 2021 1 commit
nv-dlasalle authored
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
- 11 Jun, 2021 1 commit
nv-dlasalle authored
* Split from NCCL PR * Fix type in comment * Expand documentation for sparse_all_to_all_push * Restore previous behavior in example * Re-work optimizer to use NCCL based on gradient location * Allow for running with embedding on CPU but using NCCL for gradient exchange * Optimize single partition case * Fix pylint errors * Add missing include * fix gradient indexing * Fix line continuation * Migrate 'first_step' * Skip tests without enough GPUs to run NCCL * Improve empty tensor handling for pytorch 1.5 * Fix indentation * Allow multiple NCCL communicator to coexist * Improve handling of empty message * Update python/dgl/nn/pytorch/sparse_emb.py Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com> * Update python/dgl/nn/pytorch/sparse_emb.py Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com> * Keep empty tensor dimensionless * th.empty -> th.tensor * Preserve shape for empty non-zero dimension tensors * Use shared state, when embedding is shared * Add support for gathering an embedding * Fix typo * Fix more typos * Fix backend call * Use NodeDataLoader to take advantage of ddp * Update training script to share memory * Only squeeze last dimension * Better handle empty message * Keep embedding on the target device GPU if dgl_sparse is false in RGCN example * Fix typo in comment * Add asserts * Improve documentation in example Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
- 20 May, 2021 1 commit
nv-dlasalle authored
[Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825) * Split NCCL wrapper from sparse optimizer and sparse embedding * Add more unit tests for single node nccl * Fix unit test for tf * Switch to device histogram * Fix histogram issues * Finish migration to histogram * Handle cases with zero send/receive data * Start on partition object * Get compiling * Updates * Add unit tests * Switch to partition object * Fix linting issues * Rename partition file * Add python doc * Fix python assert and finish doxygen comments * Remove stubs for range based partition to satisfy pylint * Wrap unit test in GPU only * Wrap explicit cuda call in ifdef * Merge with partition.py * update docstrings * Cleanup partition_op * Add Workspace object * Switch to using workspace object * Move last remainder based function out of nccl_api * Add error messages * Update docs with examples * Fix linting errors Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>