"src/diffusers/utils/kernels_utils.py" did not exist on "bcd4d77ba61c9ec295d0649448f108866b766708"
  1. 27 Jun, 2022 1 commit
    • ndickson-nvidia's avatar
      [Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c
      ndickson-nvidia authored
      * * Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
      * Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
      * Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas
      
      * * Added more specific error messages for unimplemented FP16 specializations of Xgeam, CSRGEMM, and CSRGEAM
      
      * * Added missing instantiation of DLDataTypeTraits<__half>::dtype
      
      * * Fixed linter error
      * Added clearer comment explaining why the cast to long long is necessary
      
      * * Worked around a compile error in some particular setup, where __half can't be constructed on the host side
      
      * * Fixed linter formatting errors
      
      * * Changes to comments as recommended
      
      * * Made recommended changes to logging errors in FP16 specializations
      * Also changed the existing Xgeam function for unsupported data types from LOG(INFO) to LOG(FATAL)
      a5d8460c
  2. 04 Nov, 2021 1 commit
  3. 27 Apr, 2021 1 commit
  4. 10 Sep, 2020 1 commit
  5. 30 Jul, 2020 1 commit
  6. 28 Jun, 2020 1 commit
    • Minjie Wang's avatar
      [CUDA][Kernel] More CUDA kernels; Standardize the behavior for sorted COO/CSR (#1704) · 870da747
      Minjie Wang authored
      * add cub; array cumsum
      
      * CSRSliceRows
      
      * fix warning
      
      * operator << for ndarray; CSRSliceRows
      
      * add CSRIsSorted
      
      * add csr_sort
      
      * inplace coosort and outplace csrsort
      
      * WIP: coo is sorted
      
      * mv cuda_utils
      
      * add AllTrue utility
      
      * csr sort
      
      * coo sort
      
      * coo2csr for sorted coo arrays
      
      * CSRToCOO from sorted
      
      * pass tests for the new kernel changes
      
      * cannot use inplace sort
      
      * lint
      
      * try fix msvc error
      
      * Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC
      
      * stash
      
      * revert some hack
      
      * revert some changes
      
      * address comments
      
      * fix
      
      * fix to_block unittest
      
      * add todo note
      870da747
  7. 19 Jun, 2020 1 commit
    • Minjie Wang's avatar
      [CUDA] Many CUDA operators; Prepare for DGLGraph on CUDA (#1660) · f1b19a6b
      Minjie Wang authored
      * add cuda utils; change g.to; add g.device
      
      * split array.h into several headers
      
      * cuda index select
      
      * file
      
      * three cuda kernels
      
      * add cuda elementwise arith and several others
      
      * cuda CSRIsNonZero
      
      * fix lint
      
      * lint
      
      * lint
      
      * fix bug in changing ctx to property
      
      * address comments
      
      * remove unused codes
      
      * address comments
      f1b19a6b
  8. 15 Jun, 2020 1 commit