1. 21 Sep, 2022 1 commit
  2. 06 Jun, 2022 1 commit
      [Bug] Added common operations for FP16 on older GPUs (#4079) · ea44da50
      ndickson-nvidia authored
      * Added support for common operations on FP16 (`half` or `__half`) for older GPU architectures
      * Fixed an issue with the previous check for FP16 support
      * Removed FP16 type checks, since they should no longer be needed
      * Fixed `AtomicAdd` to be atomic for `float` and `double` on older GPU architectures. Unfortunately, `atomicCAS` for `unsigned short` appears to be unavailable until architecture 70, so `half` has to stay non-atomic on older GPUs.
      * Fixed the non-atomic version of `AtomicAdd<half>` for older GPUs to return the old value instead of the new value
      ea44da50
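The two fallbacks described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `AtomicAddDoubleCAS`, `NonAtomicAddHalf`, `AccumulateOnes`, and `RunAtomicAddDemo` are hypothetical. It assumes the usual compare-and-swap loop for `double` on GPUs that lack a native double-precision `atomicAdd`, and a plain read-modify-write for `half` that returns the previous value.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical helper: atomic add for double built on 64-bit atomicCAS.
// Retries until no other thread changed the value between read and swap.
__device__ double AtomicAddDoubleCAS(double* addr, double val) {
  unsigned long long* addr_as_ull = reinterpret_cast<unsigned long long*>(addr);
  unsigned long long old = *addr_as_ull;
  unsigned long long assumed;
  do {
    assumed = old;
    const double sum = __longlong_as_double(assumed) + val;
    old = atomicCAS(addr_as_ull, assumed, __double_as_longlong(sum));
  } while (assumed != old);
  return __longlong_as_double(old);  // value before this thread's add
}

// Hypothetical helper: NON-atomic add for half, for GPUs where a 16-bit
// atomicCAS is unavailable. Returns the old value, matching the convention
// of atomicAdd. Concurrent writers to the same address can lose updates.
__device__ __half NonAtomicAddHalf(__half* addr, __half val) {
  const __half old = *addr;
  *addr = __float2half(__half2float(old) + __half2float(val));
  return old;
}

// Each of the n threads adds 1.0 to the same accumulator.
__global__ void AccumulateOnes(double* out, int n) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) AtomicAddDoubleCAS(out, 1.0);
}

// Host-side driver: returns the accumulated sum after n atomic adds.
double RunAtomicAddDemo(int n) {
  double* d_out = nullptr;
  double h_out = 0.0;
  cudaMalloc(&d_out, sizeof(double));
  cudaMemcpy(d_out, &h_out, sizeof(double), cudaMemcpyHostToDevice);
  const int threads = 256;
  AccumulateOnes<<<(n + threads - 1) / threads, threads>>>(d_out, n);
  // cudaMemcpy on the default stream waits for the kernel to finish.
  cudaMemcpy(&h_out, d_out, sizeof(double), cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  return h_out;
}
```

Because every thread retries its compare-and-swap until it succeeds, no increment is lost in the `double` case. The `half` fallback offers no such guarantee, which is the limitation the commit message acknowledges for older GPUs.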
  3. 28 Jan, 2021 1 commit