    Add BF16 tests for batched_gemm_softmax_gemm_permute (#504) · 4c4c7328
    guangzlu authored
    
    
    * fixed bug in softmax reference & added bf16 examples for batched_gemm_scale_softmax_gemm
    
    * added bf16 tests for batched_gemm_softmax_gemm_permute
    
    * changed format of device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp
    
    * changed format of device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp
    
    * aligned annotations
    
    * modified CMakeLists for examples
    
    * added common example code for the fp16/bf16 versions of batched_gemm_scale_softmax_gemm_xdl
    
    * use macro to control the instances
    
    * added macro control into instances
    
    * clang-format some files
    
    * changed error tolerance for bf16
    
    * changed index for 10_elementwise_normalization
    
    * fixed xdlops code bug in amd_xdlops.hpp
    
    Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
amd_xdlops.hpp 10.5 KB