    [Bug] Added common operations for FP16 on older GPUs (#4079) · ea44da50
    ndickson-nvidia authored
    * * Added support for common operations on FP16 (`half` or `__half`) for older GPU architectures
    * Fixed an issue with previous check for FP16 support
    
    * * Removing FP16 type checks, since they should no longer be needed
    
    * * Fixed AtomicAdd to be atomic for `float` and `double` on old GPU architectures.  Unfortunately, atomicCAS for unsigned short appears to be unavailable before architecture 70, so `half` will have to stay non-atomic on old GPUs.
    
    * * Fixed non-atomic version of `AtomicAdd<half>` for older GPUs to return the old value instead of the new value
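
The two AtomicAdd fixes described above can be illustrated with a minimal sketch. It is not the code from atomic.cuh: the names `casAtomicAddDouble`, `nonAtomicAddHalf`, and `accumulateOnes` are hypothetical, and the sketch assumes the usual compare-and-swap loop for `double` and a plain read-modify-write for `half` that goes through `float`.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical helper (not from atomic.cuh): atomic add for double built on
// the 64-bit atomicCAS, for GPUs that lack a native atomicAdd(double*, double).
__device__ double casAtomicAddDouble(double* address, double value) {
    unsigned long long* raw = reinterpret_cast<unsigned long long*>(address);
    unsigned long long observed = *raw;
    unsigned long long assumed;
    do {
        assumed = observed;
        double sum = __longlong_as_double(assumed) + value;
        // The swap succeeds only if *raw still holds `assumed`; otherwise
        // another thread got in first and the loop retries with its value.
        observed = atomicCAS(raw, assumed, __double_as_longlong(sum));
    } while (assumed != observed);
    // Like atomicAdd, return the value stored before this addition.
    return __longlong_as_double(observed);
}

// Hypothetical helper (not from atomic.cuh): NON-atomic fallback for half.
// Concurrent callers on the same address can lose updates. The arithmetic
// goes through float so it does not depend on native FP16 instructions.
__device__ __half nonAtomicAddHalf(__half* address, __half value) {
    __half old = *address;
    *address = __float2half(__half2float(old) + __half2float(value));
    return old;  // the old value, not the newly written sum
}

// Every thread adds 1.0 to the same location.
__global__ void accumulateOnes(double* total) {
    casAtomicAddDouble(total, 1.0);
}

int main() {
    double* deviceTotal = nullptr;
    double hostTotal = 0.0;
    cudaMalloc(&deviceTotal, sizeof(double));
    cudaMemcpy(deviceTotal, &hostTotal, sizeof(double), cudaMemcpyHostToDevice);
    accumulateOnes<<<4, 256>>>(deviceTotal);  // 4 * 256 = 1024 additions
    cudaMemcpy(&hostTotal, deviceTotal, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(deviceTotal);
    std::printf("total = %f\n", hostTotal);
    return hostTotal == 1024.0 ? 0 : 1;
}
```

The same loop shape would apply to `float` with the 32-bit `atomicCAS` and `__float_as_int` / `__int_as_float`. A 16-bit variant for `half` would need `atomicCAS` on `unsigned short`, which is what the message above reports as unavailable on older architectures.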
atomic.cuh 8.14 KB