    enable ROCm build; add BF16 for ROCm and CUDA (#325) · 9a651d91
    Jeff Daily authored
    * first step, everything compiles
    
    * fix rebuilds; skip cuda version check for rocm
    
    * use macro for __shfl_up_sync and __shfl_down_sync
    
    * add BFloat16 support for ROCm and CUDA
    
    * add USE_ROCM definition to setup.py
    
    * flake8 fixes
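
A minimal sketch of how the "skip cuda version check for rocm" and "add USE_ROCM definition to setup.py" steps could fit together, not the actual contents of setup.py. The helper name `rocm_build_config` and its parameters are hypothetical. The sketch assumes the caller passes in the value of `torch.version.hip`, which is `None` on CUDA builds of PyTorch and a version string on ROCm builds.

```python
# Hypothetical helper for illustration; not taken from the real setup.py.
def rocm_build_config(hip_version, bare_metal_cuda=None, torch_cuda=None):
    """Decide build settings from the detected backend.

    hip_version     -- value of torch.version.hip (None on CUDA builds)
    bare_metal_cuda -- CUDA version reported by the local toolkit (assumed input)
    torch_cuda      -- CUDA version PyTorch was built with (assumed input)
    """
    is_rocm = hip_version is not None

    # On ROCm, expose USE_ROCM to the C++/HIP sources as a preprocessor define.
    # (name, None) is the setuptools convention for a define with no value.
    define_macros = [("USE_ROCM", None)] if is_rocm else []

    # The CUDA toolkit version check only makes sense for CUDA builds,
    # so it is skipped entirely when building for ROCm.
    if not is_rocm and bare_metal_cuda != torch_cuda:
        raise RuntimeError(
            f"CUDA toolkit {bare_metal_cuda} does not match "
            f"the CUDA version PyTorch was built with ({torch_cuda})"
        )

    return is_rocm, define_macros
```

The returned `define_macros` list is in the form accepted by the `define_macros` argument of a setuptools `Extension`, so it could be passed through unchanged when the extension is declared.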
setup.py 4.58 KB