- 01 Mar, 2025 2 commits
- 20 Feb, 2025 1 commit
-
-
sangwz authored
-
- 16 Oct, 2024 1 commit
-
-
sangwzh authored
-
- 15 Oct, 2024 1 commit
-
-
sangwzh authored
-
- 25 Sep, 2024 1 commit
-
-
sangwzh authored
-
- 23 Sep, 2024 1 commit
-
-
sangwzh authored
-
- 13 Sep, 2024 1 commit
-
-
sangwzh authored
-
- 20 Apr, 2024 1 commit
-
-
Muhammed Fatih BALIN authored
-
- 19 Apr, 2024 1 commit
-
-
Triston authored
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
-
- 12 Apr, 2024 1 commit
-
-
Muhammed Fatih BALIN authored
-
- 29 Feb, 2024 1 commit
-
-
Muhammed Fatih BALIN authored
-
- 23 Nov, 2023 1 commit
-
-
Muhammed Fatih BALIN authored
-
- 22 Nov, 2023 1 commit
-
-
Muhammed Fatih BALIN authored
-
- 14 Aug, 2023 1 commit
-
-
Xin Yao authored
Signed-off-by: Xin Yao <xiny@nvidia.com>
-
- 10 Aug, 2023 1 commit
-
-
Chang Liu authored
-
- 19 Jul, 2023 1 commit
-
-
Muhammed Fatih BALIN authored
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
-
- 14 Jul, 2023 2 commits
-
-
Muhammed Fatih BALIN authored
-
Muhammed Fatih BALIN authored
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
-
- 13 Jul, 2023 1 commit
-
-
Muhammed Fatih BALIN authored
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
-
- 17 May, 2023 1 commit
-
-
nv-dlasalle authored
[Performance Improvement] Make GPU sampling and to_block use pinned memory to decrease required synchronization (#5685)
-
- 23 Mar, 2023 1 commit
-
-
Xin Yao authored
* update for segmentMM * update for sddmm * fix a bug
-
- 08 Mar, 2023 1 commit
-
-
Rhett Ying authored
-
- 12 Jan, 2023 1 commit
-
-
nv-dlasalle authored
* Add failing unit test * Add fix * Remove extra newline * skip cpu test
Co-authored-by: Xin Yao <yaox12@outlook.com>
-
- 09 Dec, 2022 1 commit
-
-
Xin Yao authored
* fix empty tensor is treated as pinned * avoid calling cudaHostGetDevicePointer on nullptr * update empty array * add a comment
-
- 06 Dec, 2022 1 commit
-
-
Chang Liu authored
* Add support for next cusparse release * Fix lint * Add switch and tune the performance * Fix lint issue * Fine tune the heuristics * Fix lint issue * Address comments * Minor fix * Address comments
-
- 24 Nov, 2022 1 commit
-
-
Xin Yao authored
-
- 22 Nov, 2022 2 commits
-
-
Ping Gong authored
* Leverage hashmap to accelerate CSRSliceMatrix * fix lint check * use `min` in cuda_runtime.ch * fix hash func * add some comments and adjust the <grid,block> of the _SegmentMaskColKernel kernel * set device and stream for thrust::for_each * use thrust::cuda::par_nosync
Co-authored-by: Xin Yao <xiny@nvidia.com>
-
Muhammed Fatih BALIN authored
* adding LABOR sampling * add ladies and pladies samplers * fix compile error after rebase * add reference for ladies sampler * Improve ladies implementation. * weighted labor sampling initial implementation draft fix indentation and small bug in ladies script * importance_sampling currently doesn't work with weights * fix weighted importance sampling * move labor example into its own folder * lint fixes * Improve documentation * remove examples from the main PR * fix linting by not using c++17 features * fix documentation of labor_sampler.py * update documentation for labor.py * reformat the labor.py file with black * fix linting errors * replace exception use with if * fix typo in error comment * fixing win64 build for ci * fixing weighted implementation, works now. * fix bug in the weighted case and importance_sampling==0 * address part of the reviews * remove unused code paths from cuda * remove unused code path from cpu side * remove extra features of labor making use of random seed. * fix exclude_edges bug * remove pcg and seed logic from cpu implementation, seed logic should still work for cuda. * minor style change * refactor CPU implementation, take out the importance_sampling probability computation into a function. * improve CUDAWorkspaceAllocator * refactor importance_sampling part out to a function * minor optimization * fix linting issue * Revert "remove pcg and seed logic from cpu implementation, seed logic should still work for cuda." This reverts commit c250e07ac6d7e13f57e79e8a2c2f098d777378c2. * Revert "remove extra features of labor making use of random seed." This reverts commit 7f99034353080308f4783f27d9a08bea343fb796. * fix the documentation * disable NIDs * improve the documentation in the code * use the stream argument in pcg32 instead of skipping ahead t times, can discard the use of hashmap now since it is faster this way. 
* fix linting issue * address another round of reviews * further optimize CPU LABOR sampling implementation * fix linting error * update the comment * reformat * rename and rephrase comment * fix formatting according to new linting specs * fix compile error due to renaming, fix linting. * lint * rename DGLHeteroGraph to DGLGraph to match master * replace other occurrences of DGLHeteroGraph to DGLGraph
Co-authored-by: Muhammed Fatih BALIN <m.f.balin@gmail.com>
Co-authored-by: Kaan Sancak <kaansnck@gmail.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
-
- 10 Nov, 2022 1 commit
-
-
Xin Yao authored
* update accumulator * rename half to __half * add bfloat16 * simplify code * fix another case * add unit test * disable half-precision SpMMCoo * fix lint
-
- 08 Nov, 2022 1 commit
-
-
Hongzhi (Steve), Chen authored
* [Misc] Change the max line length for cpp to 80 in lint. * blabla * blabla * blabla * ablabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 07 Nov, 2022 4 commits
-
-
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix. * blabla * nolint * blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Hongzhi (Steve), Chen authored
* blabla * more * blabla * blabla * ablabla * blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix. * blabla * ablabla * blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Hongzhi (Steve), Chen authored
* replace * blabla * balbla * blabla
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
- 06 Nov, 2022 2 commits
-
-
Hongzhi (Steve), Chen authored
* param * brief * note * return * tparam * brief2 * file * return2 * return * blabla * all
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Xin Yao authored
* add bf16 specializations * remove SWITCH_BITS * enable amp for bf16 * remove SWITCH_BITS for cpu kernels * enable bf16 based on CUDART * fix compiling for sm<80 * fix cpu build * enable unit tests * update doc * disable test for CUDA < 11.0 * address comments * address comments
-
- 03 Nov, 2022 2 commits
-
-
Hongzhi (Steve), Chen authored
* [Misc] clang-format auto fix. * manual * manual * manual * manual * todo * fix
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
-
Xin Yao authored
* get device pointers * change if condition to IsPinned
-
- 28 Oct, 2022 1 commit
-
-
Quan (Andy) Gan authored
* sample neighbors with masks * oops * refactor again * remove * remove debug code * rename macro * address comments * address comment * address comments * rename a lot of stuff * oops
-