    Grouped Gemm device with multiD grid (#319) · 7959dad5
    zjing14 authored
    
    
    * replace gridwise_v2r3 with multiD
    
    * adjust parameters
    
    * add instances
    
    * fixed test_grouped_gemm
    
    * fix standalone softmax race condition around blockwise reduction
    
    * fixed ci
    
    * fixed comment: remove redundant workspace
    
    * use instanceFactory
    
    * add test layout
    
    * add empty Ds
    
    * add bias example
    
    * use array
    
    * separate examples

    Co-authored-by: Anthony Chang <ac.chang@outlook.com>
CMakeLists.txt 1.82 KB