    Merge from internal (#1857) · 800cf897
    Illia Silin authored
    * enable batched_gemm_softmax_gemm_perm_wmma for gfx12
    
    * disable instances with blocksize=256 in attention examples
    
    * debugging
    
    * debug
    
    * fixed lds_enabled
    
    * debugging
    
    * Fix and add limit to skiplds feature
    
    * Enable skipLds feature and fix compilation bugs
    
    * add ck_tile definitions for gfx12
    
    * fix clang format and test/wmma_op
    
    * update instances cmake for gfx12
    
    * disable the test_wmma_op on gfx12
    
    * fix the builds for gfx950
    
    * add gfx12 and gfx950 to default target list
    
    * clean-up cmake file
    
    * Initial introduction of OFP8 data types.
    
    * Renamed FP8 and BF8 tests into FP8_FNUZ and BF8_FNUZ.
    
    * Implementation of ConvertFP32Nearest in test_fp8_ocp.
    
    * Remove dependence on possibly undeclared alias.
    
    * Implement FP8OCP test for stochastic rounding mode.
    
    * Implement FP8OCP tests for half_t type conversions.
    
    * enable bf16 atomic add on gfx950
    
    * Implement ConvertFP32Nearest test.
    
    * Implement ConvertFP32Stochastic test.
    ...
Jenkinsfile 54.5 KB