    [Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c
    ndickson-nvidia authored
    * Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
    * Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
    * Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that cuBLAS doesn't currently provide
    
    * Added more specific error messages for unimplemented FP16 specializations of `Xgeam`, `CSRGEMM`, and `CSRGEAM`
    
    * Added missing instantiation of `DLDataTypeTraits<__half>::dtype`
    
    * Fixed linter error
    * Added clearer comment explaining why the cast to `long long` is necessary
    
    * Worked around a compile error, seen in some setups, where `__half` can't be constructed on the host side
    
    * Fixed linter formatting errors
    
    * Changed comments as recommended
    
    * Made the recommended changes to error logging in FP16 specializations
    * Also changed the existing `Xgeam` function for unsupported data types from `LOG(INFO)` to `LOG(FATAL)`
utils.h 7.92 KB