    Add bfp16/int8 support into XDL GEMM operator (#50) · 3737bb03
    zjing14 authored
    
    
    * init StaticBufferV2
    
    * clean
    
    * adopt old output stage for staticBufferV2
    
    * clean
    
    * remove hack
    
    * clean
    
    * clean
    
    * add parameters
    
    * clean code
    
    * move c_buffer alloc into blockwise gemm
    
    * add adaptors for m/n_thread_data_on_grid
    
    * tweak gemm
    
    * adjust blockwise_gemm_xdlops
    
    * tweak
    
    * update conv
    
    * update script
    
    * adding bwd 1x1
    
    * update script
    
    * adding 1x1 bwd
    
    * debugging bwd 1x1 failure
    
    * update script
    
    * update script
    
    * test
    
    * test v100
    
    * add bf16_1k
    
    * clang-format
    
    * clean
    
    * add bfp16 for gfx908
    
    * add verification
    
    * clean up
    
    * clean code
    
    * restore bfl16
    
    * clean
    
    * add bfp16 support into gemm_driver
    
    * apply new generator to other drivers
    
    * add int8 support
    
    * clean
    
    * clean
    
    * clean
    
    * clean
    Co-authored-by: Chao Liu <chao.liu2@amd.com>
    Co-authored-by: Chao Liu <lc.roy86@gmail.com>
    Co-authored-by: root <root@hayabusa6111.amd.com>
gemm_driver_offline.cpp 13.6 KB