[Feature] Add bfloat16 (bf16) support (#4648)
* add bf16 specializations
* remove SWITCH_BITS
* enable amp for bf16
* remove SWITCH_BITS for cpu kernels
* enable bf16 based on CUDART
* fix compiling for sm<80
* fix cpu build
* enable unit tests
* update doc
* disable test for CUDA < 11.0
* address comments
* address comments
src/array/cuda/bf16.cuh
0 → 100644