"git@developer.sourcefind.cn:yangql/composable_kernel-1.git" did not exist on "bf975428460a27b46912d1c4293b407febb92de0"
  1. 31 Dec, 2024 1 commit
  2. 25 Dec, 2024 1 commit
  3. 17 Dec, 2024 1 commit
  4. 01 Nov, 2024 1 commit
  5. 17 Oct, 2024 1 commit
  6. 16 Oct, 2024 1 commit
    • [CK_TILE] Improve headdim96 performance for fmha-bwd (#1573) · 14c3cfb1
      Qianfeng authored
      
      
      * Add kQKHeaddimForGemmN and kVHeaddimForGemmN in order to support headdim 96
      
      * Remove the use of MakeKRegBlockDescriptor and MakeVRegBlockDescriptor
      
      * Fix in bwd_pipeline_default_policy
      
      * Remove kQKHeaddim and rename kQKHeaddimForGemmN to kQKHeaddim in the bwd kernel and pipelines
      
      * Replace kVHeaddimForGemmN by kVHeaddim and kDoDvHeaddim
      
      * Update to hd96 tile settings
      
      * Add smoke test scripts for fmha-bwd hd96
      
      * Revert "Add smoke test scripts for fmha-bwd hd96"
      
      This reverts commit 7ca7e1a93dc65eb99ce3ff4e82693589830e42a2.
      
      * Remove hd96 tile settings in fmha_bwd codegen to save compile time
      
      * Fix lost code line in bwd_pipeline_default_policy
      
      * Merge kDoDvHeaddim/kPadHeadDimDoDv into kVHeaddim/kPadHeadDimV and remove TileFmhaBwdTraits
      
      * Rename KRegSliceBlockDescriptor/VRegSliceBlockDescriptor to KRegBlockDescriptor/VRegBlockDescriptor
      
      * tiny adjustments
      
      ---------
      Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
      Co-authored-by: danyao12 <Dan.Yao@amd.com>
  7. 15 Oct, 2024 3 commits
  8. 14 Oct, 2024 3 commits
  9. 12 Oct, 2024 3 commits
  10. 11 Oct, 2024 3 commits
  11. 10 Oct, 2024 4 commits
  12. 09 Oct, 2024 3 commits
  13. 08 Oct, 2024 4 commits
  14. 07 Oct, 2024 4 commits
  15. 04 Oct, 2024 3 commits
  16. 02 Oct, 2024 2 commits
  17. 01 Oct, 2024 2 commits