    [Navi3x] Add Device Operations (#567) · 0cfda84d
    Haocong WANG authored
    * wmma_op + unit test
    
    * add arch limitation to wmma test
    
    * change arch limitation
    
    * Refactor + add unit tests for all types (int4 fails to compile)
    
    * Add f32_16x16x16_bf16 unit test
    
    * tempsave
    
    * tempsave
    
    * tempsave
    
    * runtime bug, cannot find symbol
    
    * workaround for incorrect HIP warpSize return value
    
    * debugging
    
    * tempsave
    
    * Correctness OK, waiting for optimization
    
    * Tidy up + format
    
    * temp save
    
    * temp save, reproduce the v_bfi_b32 issue
    
    * add inline asm for wmmaop test
    
    * tidy up
    
    * clean some debug purpose code
    
    * discard some codes
    
    * clang format
    
    * clang format
    
    * compiler issue fixed + increase tile size
    
    * navi3x_multipleD+example
    
    * temp save
    
    * workable
    
    * batchedgemm[OK], groupconv[debug]
    
    * groupconv: Sanity check[OK], Performance[Bad]
    
    * navi3x_groupconv_need_optimization
    
    * format
    
    * Add arch limitation to all wmma examples
    
    * fix bug: example30 input conv args
CMakeLists.txt 1.83 KB