- 29 Feb, 2024 1 commit
-
-
Muhammed Fatih BALIN authored
-
- 17 May, 2023 1 commit
-
-
nv-dlasalle authored
[Performance Improvement] Make GPU sampling and to_block use pinned memory to decrease required synchronization (#5685)
-
- 09 Dec, 2022 1 commit
-
-
Xin Yao authored
* fix empty tensor is treated as pinned * avoid calling cudaHostGetDevicePointer on nullptr * update empty array * add a comment
-
- 07 Nov, 2022 1 commit
-
-
Hongzhi (Steve), Chen authored
* replace * blabla * balbla * blabla Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 06 Nov, 2022 1 commit
-
-
Hongzhi (Steve), Chen authored
* param * brief * note * return * tparam * brief2 * file * return2 * return * blabla * all Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 03 Nov, 2022 2 commits
-
-
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix. * manual * manual * manual * manual * todo * fix Co-authored-by:Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Xin Yao authored
* get device pointers * change if condition to IsPinned
-
- 28 Oct, 2022 1 commit
-
-
Quan (Andy) Gan authored
* sample neighbors with masks * oops * refactor again * remove * remove debug code * rename macro * address comments * address comment * address comments * rename a lot of stuff * oops
-
- 19 Sep, 2022 1 commit
-
-
Xin Yao authored
* rename `DLContext` to `DGLContext` * rename `kDLGPU` to `kDLCUDA` * replace DLTensor with DGLArray * fix linting * Unify DGLType and DLDataType to DGLDataType * Fix FFI * rename DLDeviceType to DGLDeviceType * decouple dlpack from the core library * fix bug * fix lint * fix merge * fix build * address comments * rename dl_converter to dlpack_convert * remove redundant comments
-
- 15 Sep, 2022 1 commit
-
-
Xin Yao authored
* add set_stream * add .record_stream for NDArray and HeteroGraph * refactor dgl stream Python APIs * test record_stream * add unit test for record stream * use pytorch's stream * fix lint * fix cpu build * address comments * address comments * add record stream tests for dgl.graph * record frames and update dataloder * add docstring * update frame * add backend check for record_stream * remove CUDAThreadEntry::stream * record stream for newly created formats * fix bug * fix cpp test * fix None c_void_p to c_handle
-
- 06 Sep, 2022 1 commit
-
-
Chang Liu authored
* Use an internal cuda stream for CopyDataFromTo * small fix white space * Fix to compile * Make stream optional in copydata for compile * fix lint issue * Update cub functions to use internal stream * Lint check * Update CopyTo/CopyFrom/CopyFromTo to use internal stream * Address comments * Fix backward CUDA stream * Avoid overloading CopyFromTo() * Minor comment update * Overload copydatafromto in cuda device api Co-authored-by:xiny <xiny@nvidia.com>
-
- 29 Jul, 2022 1 commit
-
-
Xin Yao authored
* add weighted sampling without replacement (A-Chao) * improve Algorithm A-Chao with block-wise prefix sum * correctly fill out_idxs * implement weighted sampling with replacement * small fix * merge host-side code of weighted/uniform sampling * enable unit tests for cuda weighted sampling * move thrust/cub wrapper to the cmake file * update docs accordingly * fix linting * fix linting * fix unit test * Bump external CUB/Thrust versions * Fix code style and update description of algorithm design * [Feature] GPU support weighted graph neighbor sampling commit by pengqirong(OPPO) * merge pengqirong's implementation * revert the change to cub and thrust * fix linting * use DeviceSegmentedSort for better performance * add more comments * add necessary notes * add necessary notes * resolve some comments * define THRUST_CUB_WRAPPED_NAMESPACE * fix doc Co-authored-by:彭齐荣 <657017034@qq.com>
-
- 17 May, 2022 1 commit
-
-
paoxiaode authored
* Change the curand_init parameter * Change the curand_init parameter * commit * commit * change the curandState and launch dim of CSRRowwiseSample kernel * commit * keep _CSRRowWiseSampleReplaceKernel in sync Co-authored-by:nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
-
- 10 Mar, 2022 1 commit
-
-
paoxiaode authored
* Change the curand_init parameter * Change the curand_init parameter * commit * commit Co-authored-by:nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
-
- 09 Feb, 2022 1 commit
-
-
Xin Yao authored
* implement pin_memory/unpin_memory/is_pinned for dgl.graph * update python docstring * update c++ docstring * add test * fix the broken UnifiedTensor * XPU_SWITCH for kDLCPUPinned * a rough version ready for testing * eliminate extra context parameter for pin/unpin * update train_sampling * fix linting * fix typo * multi-gpu uva sampling case * disable new format materialization for pinned graphs * update python doc for pin_memory_ * fix unit test * UVA sampling for link prediction * dispatch most csr ops * update graphsage example to combine uva sampling and UnifiedTensor * update graphsage example to combine uva sampling and UnifiedTensor * update graphsage example to combine uva sampling and UnifiedTensor * update doc * update examples * change unitgraph and heterograph's PinMemory to in-place * update examples for multi-gpu uva sampling * update doc * fix linting * fix cpu build * fix is_pinned for DistGraph * fix is_pinned for DistGraph * update graphsage unsupervised example * update doc for gpu sampling * update some check for sampling device switching * fix linting * adapt for new dataloader * fix linting * fix * fix some name issue * adjust device check * add unit test for uva sampling & fix some zero_copy bug * fix linting * update num_threads in graphsage examples Co-authored-by:
Quan (Andy) Gan <coin2028@hotmail.com> Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 06 Sep, 2021 1 commit
-
-
Jinjing Zhou authored
* remove * remove * fix * remove * remove
-
- 02 Aug, 2021 1 commit
-
-
nv-dlasalle authored
* Split out separate generators for each thread * Amortize cost of curand_init * Improve readability Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com> Co-authored-by:
Quan (Andy) Gan <coin2028@hotmail.com>
-
- 08 Jul, 2021 1 commit
-
-
Jinjing Zhou authored
-
- 15 Apr, 2021 1 commit
-
-
nv-dlasalle authored
* Start on uniform GPU sampling * Save more work * Get cu file compiling * Update sampling * More changes * Get GPU sampling for uniform probabilities solved * Fix batch tensor migration * Fix * update kernels * expand blocking * Undo testing change * Cut down on sampling overhead * Fix replacement * Update unit tests * Add option to gpu sample in graphsage * Copy only csc to gpu * Add ogbn support * Fix linting * Remove nvtx from sample * Improve documentation and error checking * Expand documentation * Update assert checking * delete extra space * Use standard dataloader when dataset is a dictionary * ogb -> ogbn * Fix edge selection determinism * Fix typos * Remove nvtx * Add comment for self.fanout_arrays and assert * Fix linting * Migrate to scalarbatcher * Fix indentation * Fix batcher * Fix indexing * Only use databatcher for GPU * Convert to DGL NDArray to PyTorch Tensor * Add optimization for PyTorch's F.tensor() for list of GPU tensors Co-authored-by:Da Zheng <zhengda1936@gmail.com>
-