"...python/git@developer.sourcefind.cn:change/sglang.git" did not exist on "f556ac8bd8f6cfad85ce4da6d6b10c775cb43278"
- 16 Dec, 2021 1 commit
-
-
Israt Nisa authored
[Feature] Add CUDA support for `min` and `max` reducer in heterogeneous API for unary message functions (#3566)
* CUDA support max/min reducer on forward pass
* docstring
* concised UpdateGradMinMax_hetero
* reorganized UpdateGradMinMax_hetero
* CUDA kernels for max/min reducer
* variable name
* lint check
* changed CUDA 2D thread mapping to 1D
* removed legacy cusparse for min/max reducer
* git CI issue
* restarting git CI
* adding namespace std
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 15 Dec, 2021 1 commit
-
-
Vasimuddin Md authored
* added distgnn plus libra codebase
* Dist application codes
* added comments in partition code. changed the interface of partitioning call.
* updated readme
* create libra partitioning branch for the PR
* removed distgnn files for first PR
* updated kernel.cc
* added libra_partition.cc and moved libra code from kernel.cc to libra_partition.cc
* fixed lint error; merged libra2dgl.py and main_Libra.py to libra_partition.py; added graphsage/distgnn folder and partition script.
* removed libra2dgl.py
* fixed the lint error and cleaned the code.
* revisions due to PR comments. added distgnn/tools contains partitions routines
* update 2 PR revision I
* fixed errors; also improved the runtime by 10x.
* fixed minor lint error
* fixed some more lints
* PR revision II changed the interface of libra partition function
* rewrite docstring
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 06 Dec, 2021 1 commit
-
-
Quan (Andy) Gan authored
* first commit * second commit * spaghetti unit tests * rewrite test
-
- 03 Dec, 2021 1 commit
-
-
Israt Nisa authored
* min/max support for forward CPU heterograph
* Added etype with each argU values
* scatter_add needs fix
* added scatter_add_hetero. Grads don't match for max reducer
* storing ntype in argX
* fixing scatter_add_hetero
* hetero matches with torch's scatter add
* works copy_e forward+cpu
* added backward for copy_rhs
* Computes gradient for all node types in one kernel
* bug fix
* unittest for max/min on CPU
* renamed scatter_add_hetero to update_grad_minmax_hetero
* lint check and comment out cuda call for max. Code is for CPU only
* lint check
* replace inf with zero
* minor
* lint check
* removed LIBXSMM code from hetero code
* fixing backward operator of UpdateGradMinMaxHetero
* removed backward from update_grad_minmax_hetero
* docstring
* improved docstring and coding style
* Added pass by pointer for output
* typos and pass by references
* Support for copy_rhs
* Added header <string>
* fix bug in copy_u_max
* Added comments and dimension check of all etypes
* skip mxnet check
* pass by pointer output arrays
* updated docstring
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 30 Nov, 2021 1 commit
-
-
ayasar70 authored
* Based on issue #3436. Improving _SegmentCopyKernel's GPU utilization by switching to nonzero based thread assignment
* fixing lint issues
* Update cub for cuda 11.5 compatibility (#3468)
* fixing type mismatch
* tx guaranteed to be smaller than nnz. Hence removing last check
* minor: updating comment
* adding three unit tests for csr slice method to cover some corner cases
* working on repeat
* updating repeat kernel
* removing unnecessary parameter
* cleaning commented line
* cleaning time measures
* cleaning time measurement lines
Co-authored-by: Abdurrahman Yasar <ayasar@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 17 Nov, 2021 1 commit
-
-
Israt Nisa authored
* Added SDDMMCOO_hetero support
* removed redundant CUDA kernels
* added benchmark for regression test
* fix
* fixed bug for single src node type
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 06 Nov, 2021 1 commit
-
-
ayasar70 authored
* Based on issue #3436. Improving _SegmentCopyKernel's GPU utilization by switching to nonzero based thread assignment
* fixing lint issues
* Update cub for cuda 11.5 compatibility (#3468)
* fixing type mismatch
* tx guaranteed to be smaller than nnz. Hence removing last check
* minor: updating comment
* adding three unit tests for csr slice method to cover some corner cases
Co-authored-by: Abdurrahman Yasar <ayasar@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 04 Nov, 2021 1 commit
-
-
Xin Yao authored
* relabel gpu
* unittest for relabel_ on the GPU
* finish Relabel_ for the GPU
* copyright
* re-enable the unittest for edge_subgraph on the GPU
* fix unittest for tensorflow
* use a fixed number of threads
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 03 Nov, 2021 1 commit
-
-
nv-dlasalle authored
-
- 18 Oct, 2021 1 commit
-
-
David Min authored
* parallelize CSRRowSlice()
* use parallel_for for the second loop
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 15 Oct, 2021 1 commit
-
-
David Min authored
* Add pytorch-direct version
* remove
* add documentation for UnifiedTensor
* Revert "add documentation for UnifiedTensor"
  This reverts commit 63ba42644d4aba197c1cb4ea4b85fa1bc43b8849.
* add boundary check for UVM IndexSelect
* relocate boundary check index kernels to cuda
* fix function name
* fix indexkernel in nccl api
* fix argument ordering
* simplify code
* Add a comment for the uvm version
Co-authored-by: shhssdm <shhssdm@gmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
-
- 17 Sep, 2021 1 commit
-
-
Rhett Ying authored
-
- 14 Sep, 2021 1 commit
-
-
Rhett Ying authored
* [Performance] improve coo2csr space complexity when row is not sorted
* [Perf] replace std::vector<> by NDArray
* keep both impl of unsorted coo to csr and choose according to graph density dynamically
* refine criteria to choose btw Unsorted algos
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-27.us-west-2.compute.internal>
-
- 13 Sep, 2021 2 commits
-
-
sanchit-misra authored
* Fixes bug #3312
* Fixing lint errors
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
-
Quan (Andy) Gan authored
-
- 07 Sep, 2021 1 commit
-
-
Israt Nisa authored
* Added binary builtinMsgFunc forward() for heterograph
* Added backward for u_op_v
* Supports all binary builtin forward
* Supports binary message funcs with reduce func sum
* lint check
* removed import torch from unittest
* enabled GPU test
* lint check
* Fixed docstrings
* rename func get_hs_id
* edited comment
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
-
- 06 Sep, 2021 1 commit
-
-
Jinjing Zhou authored
* remove * remove * fix * remove * remove
-
- 02 Sep, 2021 1 commit
-
-
Tomasz Patejko authored
* [CPU, Parallel] Rewriting omp pragmas with parallel_for
* [CPU, Parallel] Decrease number of calls to task function
* [CPU, Parallel] Modify calls to new interface of parallel_for
-
- 01 Sep, 2021 1 commit
-
-
xiang song(charlie.song) authored
[Feature] Add a HINT for the per edge type sampler of heterogeneous DistGraph highlighting that the etypes are sorted already. (#3260)
* pass cpp test
* distgraph use sorted edge flag.
* lint
* trigger
* update test
Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-66.ec2.internal>
-
- 31 Aug, 2021 1 commit
-
-
nv-dlasalle authored
* Optimize sampling
* Stop initialization of array
* Fix includes for linting
* Move comment
* Fix replace
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
-
- 24 Aug, 2021 1 commit
-
-
Quan (Andy) Gan authored
-
- 19 Aug, 2021 1 commit
-
-
nv-dlasalle authored
* Update filter code
* Add unit tests
* Fixes
* Switch to indices
* Rename functions
* Fix linting
* Fix whitespace
* Add doc
* Fix heterograph
* Change workspace allocation
* Fix linting
* Fix docs in filter.py
* Add todo
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 18 Aug, 2021 1 commit
-
-
Quan (Andy) Gan authored
-
- 17 Aug, 2021 1 commit
-
-
David Min authored
* Add pytorch-direct version
* remove
* add documentation for UnifiedTensor
* Revert "add documentation for UnifiedTensor"
  This reverts commit 63ba42644d4aba197c1cb4ea4b85fa1bc43b8849.
* alignment fix for UnifiedTensor access
* fix linting issue
Co-authored-by: shhssdm <shhssdm@gmail.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
-
- 02 Aug, 2021 1 commit
-
-
nv-dlasalle authored
* Split out separate generators for each thread
* Amortize cost of curand_init
* Improve readability
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-
- 28 Jul, 2021 1 commit
-
-
xiang song(charlie.song) authored
* fix.
* fix.
* fix.
* fix.
* Fix test
* Deprecate old DistEmbedding impl, use synchronized embedding impl
* Basic impl of heterogeneous on homogeneous sampling
* make pass
* Pass C++ test
* Add python test code
* lint
* lint
* Add MultiLayerEtypeNeighborSampler
* Add unittest for single machine dataloader
* Add dist dataloader test for edge type sampler
* Fix lint
* fix
* support for per etype sample
* Fix some bug and enable distributed training with per edge sample
* fix
* Now distributed training works
* turn off some mxnet
* turn off mxnet for some dist test
* fix
* upd
* upd according to the comments
* Fix
* Fix test and now distributed works.
* upd
* upd
* Fix
* Fix bug
* remove dead code.
* upd
* Fix
* upd
* Fix
Co-authored-by: Ubuntu <ubuntu@ip-172-31-71-112.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-2-66.ec2.internal>
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
-
- 21 Jul, 2021 1 commit
-
-
Jinjing Zhou authored
* remove redundant fill * trigger ci
-
- 16 Jul, 2021 1 commit
-
-
David Min authored
[Feature][Performance][GPU] Introducing UnifiedTensor for efficient zero-copy host memory access from GPU (#3086)
* Add pytorch-direct version
* Initial commit of unified tensor
* Merge branch 'master' of https://github.com/davidmin7/dgl
* Remove unnecessary things
* Fix error message
* Fix/Add descriptions
* whitespace fix
* add unpin
* disable IndexSelectCPUFromGPU with no CUDA
* add a newline for unified_tensor.py
* Apply changes based on feedback
* add 'os' module
* skip unified tensor unit test for cpu only
* Update tests/pytorch/test_unified_tensor.py
  Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
* reflect feedback
Co-authored-by: shhssdm <shhssdm@gmail.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
-
- 13 Jul, 2021 1 commit
-
-
sanchit-misra authored
* optimizations of spmm for CPU
* Added names of contributors
* Minor code cleanup
* Moved the spmm optimization code to a new header file
* Moved to DGL's logging method
* removed duplicate code between SpMMSumCsr and SpMMCmpCsr
* Changes made to follow Google coding style
* Fixed lint errors in spmm.h
* Fixed some lint errors from spmm_blocking_libxsmm.h
* Fixed lint errors from spmm_blocking_libxsmm.h
* Added comments to SpMMCreateLibxsmmKernel
* to enable building of tests, and other cosmetic changes
* disabling libxsmm on windows
* Put a condition to avoid opt impl for FP64 as libxsmm does not have FP64 support yet
* cosmetic changes and documentation
* cosmetic changes
* to pass lint tests
* replaced multiple allocations for buffers of indices and edges with a single allocation
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
-
- 08 Jul, 2021 2 commits
-
-
Jinjing Zhou authored
-
Quan (Andy) Gan authored
-
- 06 Jul, 2021 1 commit
-
-
Israt Nisa authored
[Feature] Add Heterograph support on Python for builtin unary msg functions (copy_u, copy_e) (#2989)
* heterograph for binary func
* Added SDDMM support
* Added unittest
* added binary test cases
* unary mfuncs works
* Fixed lint err
* lint check and others
* link check
* fixed import *_hetero issue
* lint check
* replace torch with dgl backend
* lint check
* removed torch from test
* skip mxnet unittest
* skip gpu test
* Remove unused/duplicated code
* minor
* changed data structure of ndata and edata
* link check
* reorganized
* minor lint
* minor lint
* raise error for udf func
* lint check
* fix for CUDA 10.1
* add a note for future cross-type max/min reducing
* Add support CUDA < 11
* lint check
* tidied C code
* remove dummy GSDDMM_hetero backward implementation
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
-
- 23 Jun, 2021 1 commit
-
-
Qidong Su authored
* update * update * update * update * lint * lint * update * update * update * update * update * update * update * update * update * update * update * update * update * update * lint * update * clone * update * update * update * update * replace idarray with ndarray * refactor cpp part * refactor python part * debug * refactor interface * test and doc * lint and test * lint * fix * fix * fix * const * doc * fix * fix * fix * fix * fix & doc * fix * fix * update * update * update * merge * doc * doc * lint * fix * more tests * doc * fix * fix * update * update * update * fix * fix Co-authored-by:Minjie Wang <wmjlyjemaine@gmail.com>
-
- 22 Jun, 2021 1 commit
-
-
Israt Nisa authored
* Added heterograph support SpMM, SDDMM
* bug fix cuda stream
* add cudaStrm destroy and fix whitespace
* Added heterograph support SpMM, SDDMM
* bug fix cuda stream
* add cudaStrm destroy and fix whitespace
* changed max stream = 1
* Fixed ctx
* using default stream
* Added heterograph support SpMM, SDDMM
* bug fix cuda stream
* add cudaStrm destroy and fix whitespace
* changed max stream = 1
* Fixed ctx
* using default stream
* fix bug in copy_rhs
* changed by mistake
* minor datatype change
* added datatype check
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
-
- 10 Jun, 2021 1 commit
-
-
Mufei Li authored
* Update * Update * Update * Update * Update * Update * Update * Update * Update * Add files via upload * Add files via upload * Add files via upload * Add files via upload * Update * Update * Add files via upload * Add files via upload * Update * Lint * Add files via upload * Lint * Update * Update * Update * Update * Update * Lint Fix * Lint Co-authored-by:
Ubuntu <ubuntu@ip-172-31-12-161.us-west-2.compute.internal> Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com> Co-authored-by:
xiang song(charlie.song) <classicxsong@gmail.com>
-
- 03 Jun, 2021 1 commit
-
-
Israt Nisa authored
* SpMM for heterograph
* C APIs SDDMM heterograph
* passes initial result
* renamed eid with nid
* aggregation on same ntype for multiple etypes
* fix link check failure
* lint check part 2
* lint check part 3
* Fixed SpMMCmpCsr Min op
* added mem references
* fixed fill(Max/Min), added const
* removed newline
* brought back docstring
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Da Zheng <zhengda1936@gmail.com>
-
- 01 Jun, 2021 1 commit
-
-
Qidong Su authored
* update * update * update * update * lint * lint * update * update * update * update * update * update * update * update * update * update * update * update * update * update * lint * update * clone * update * update * update * update * replace idarray with ndarray * refactor cpp part * refactor python part * debug * refactor interface * test and doc * lint and test * lint * fix * fix * fix * const * doc * fix * fix * fix * fix * fix & doc * fix * fix * fix * fix * fix * fix * update Co-authored-by:Minjie Wang <wmjlyjemaine@gmail.com>
-
- 20 May, 2021 1 commit
-
-
nv-dlasalle authored
[Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825)
* Split NCCL wrapper from sparse optimizer and sparse embedding
* Add more unit tests for single node nccl
* Fix unit test for tf
* Switch to device histogram
* Fix histogram issues
* Finish migration to histogram
* Handle cases with zero send/receive data
* Start on partition object
* Get compiling
* Updates
* Add unit tests
* Switch to partition object
* Fix linting issues
* Rename partition file
* Add python doc
* Fix python assert and finish doxygen comments
* Remove stubs for range based partition to satisfy pylint
* Wrap unit test in GPU only
* Wrap explicit cuda call in ifdef
* Merge with partition.py
* update docstrings
* Cleanup partition_op
* Add Workspace object
* Switch to using workspace object
* Move last remainder based function out of nccl_api
* Add error messages
* Update docs with examples
* Fix linting errors
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
-
- 17 May, 2021 1 commit
-
-
Quan (Andy) Gan authored
* test commit * fixes * oops * add docs * lint * why does it say I have a trailing whitespace * oh ok * fixes * why there's an invalid argument error * address comments * fix * address comments
-
- 28 Apr, 2021 1 commit
-
-
xiang song(charlie.song) authored
Co-authored-by: Ubuntu <ubuntu@ip-172-31-1-191.ec2.internal>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
-