"git@developer.sourcefind.cn:OpenDAS/torch-sparce.git" did not exist on "39d463b858c799feb2d8a051729014bb95f69d8c"
- 06 Jun, 2022 1 commit
-
-
ndickson-nvidia authored
* * Added support for common operations on FP16 (`half` or `__half`) for older GPU architectures * Fixed an issue with previous check for FP16 support * * Removing FP16 type checks, since they should no longer be needed * * Fixed AtomicAdd to be atomic for `float` and `double` for old GPU architectures. Unfortunately, it seems that atomicCAS for unsigned short seems to be unavailable until architecture 70, so half will have to stay non-atomic on old GPUs. * * Fixed non-atomic version of `AtomicAdd<half>` for older GPUs to return old value instead value of new
-
- 26 May, 2022 1 commit
-
-
nv-dlasalle authored
* Enable FP16 for GPU builds in CI * Limit default GPU archs to pascal and above * Disable FP16 dispatching for cuda architectures less than 60 * Fix linting * Fix typos
-
- 18 Feb, 2022 1 commit
-
-
ayasar70 authored
* Based on issue #3436. Improving _SegmentCopyKernel s GPU utilization by switching to nonzero based thread assignment * fixing lint issues * Update cub for cuda 11.5 compatibility (#3468) * fixing type mismatch * tx guaranteed to be smaller than nnz. Hence removing last check * minor: updating comment * adding three unit tests for csr slice method to cover some corner cases * timing repeatkernel * clean * clean * clean * updating _SegmentMaskColKernel * Working on requests: removing sorted array check and adding comments to utility functions * fixing lint issue Co-authored-by:
Abdurrahman Yasar <ayasar@nvidia.com> Co-authored-by:
nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com> Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 07 Jan, 2022 1 commit
-
-
Quan (Andy) Gan authored
* first commit * a bunch of fixes * add unique * lint * lint * lint * address comments * Update negative_sampler.py * fix * description * address comments and fix * fix * replace unique with replace * test pylint * Update negative_sampler.py
-
- 27 Apr, 2021 1 commit
-
-
Israt Nisa authored
* init cuda support * cuSPARSE err * passed unittest for csr_mm/SpGEMM. int64 not supported * Debugging cuSPARSE error 3 * csrgeam only supports int32? * disabling int64 for cuda * refactor and add CSRMask * lint * oops * remove todo * rewrite CSRMask with CSRGetData * lint * fix test * address comments * lint * fix * addresses comments and rename BUG_ON Co-authored-by:
Israt Nisa <nisisrat@amazon.com> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-30-71.ec2.internal> Co-authored-by:
Quan Gan <coin2028@hotmail.com> Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com> Co-authored-by:
Minjie Wang <wmjlyjemaine@gmail.com>
-
- 25 Mar, 2021 1 commit
-
-
Quan (Andy) Gan authored
* disable cpu fp16 * spell mistakes
-
- 28 Jan, 2021 1 commit
-
-
Zihao Ye authored
* add tvm as submodule * compilation is ok but calling fails * can call now * pack multiple modules, change names * upd * upd * upd * fix cmake * upd * upd * upd * upd * fix * relative path * upd * upd * upd * singleton * upd * trigger * fix * upd * count reducible * upd * upd * upd * upd * upd * upd * upd * upd * upd * only keep related files * upd * upd * upd * upd * lint * lint * lint * lint * pylint * upd * upd * compilation * fix * upd * upd * upd * upd * upd * upd * upd doc * refactor * fix * upd number Co-authored-by:
Zhi Lin <linzhilynn@gmail.com> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-42-78.us-east-2.compute.internal> Co-authored-by:
Ubuntu <ubuntu@ip-172-31-21-156.us-east-2.compute.internal> Co-authored-by:
Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 25 Jan, 2021 1 commit
-
-
Zihao Ye authored
* upd * upd * upd * upd * fix * upd * upd Co-authored-by:Jinjing Zhou <VoVAllen@users.noreply.github.com>
-
- 10 Sep, 2020 1 commit
-
-
Zihao Ye authored
* upd * upd * upd * upd * upd * upd * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * fix * upd Co-authored-by:VoVAllen <jz1749@nyu.edu>
-
- 28 Jun, 2020 1 commit
-
-
Minjie Wang authored
* add cub; array cumsum * CSRSliceRows * fix warning * operator << for ndarray; CSRSliceRows * add CSRIsSorted * add csr_sort * inplace coosort and outplace csrsort * WIP: coo is sorted * mv cuda_utils * add AllTrue utility * csr sort * coo sort * coo2csr for sorted coo arrays * CSRToCOO from sorted * pass tests for the new kernel changes * cannot use inplace sort * lint * try fix msvc error * Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC * stash * revert some hack * revert some changes * address comments * fix * fix to_block unittest * add todo note
-
- 22 Jun, 2020 1 commit
-
-
Zihao Ye authored
* udp * simplify * sddmm dot cpu * upd * format * upd * compatible with MJ's PR * lint * upd * upd * upd * python end * upd * upd * lint * lint * upd * upd * upd * upd * upd * lint * fix mxnet * upd * lint * use minjie's ptr * macro * upd * reorg * lint * fix corner cases * upd * enrich cpu docs * upd * upd * lint * lint * pylint * sx review * improve docstring * python doc * upd * restructure * lint * upd test * upd * pylint * fix corner cases and test
-
- 19 Jun, 2020 1 commit
-
-
Minjie Wang authored
* add cuda utils; change g.to; add g.device * split array.h into several headers * cuda index select * file * three cuda kernels * add cuda elementwise arith and several others * cuda CSRIsNonZero * fix lint * lint * lint * fix bug in changing ctx to property * address comments * remove unused codes * address comments
-