1. 10 Jan, 2025 1 commit
    • Ck tile/gemm perf measure (#1750) · 73a076ee
      Thomas Ning authored
      
      
      * Finished adding the performance benchmark for ck tile gemm
      
      * Fix the executable rename problem
      
      * fix the executable name error
      
      * delete the unsupported layout combinations
      
      * Update run_full_test.sh
      
      * Update benchmark_mem_pipeline.sh
      
      * Update benchmark_basic.sh
      
      * change the executable of gemm_universal
      
      * change ck_tile_gemm script permissions
      
      * Addressed the comment
      
      * Addressed the comment
      
      * Fixed the comments
      
      * Fixed Comment
      
      * roll back the malfunctioned change
      
      * Fix the Typo
      
      * finalize the tile_gemm_fp16 performance monitoring
      
      * fix the stash names for ck_tile gemm logs
      
      * change the stashing logic
      
      * change stashing syntax
      
      ---------
      Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
      Co-authored-by: illsilin <Illia.Silin@amd.com>
  2. 12 Nov, 2024 1 commit
  3. 08 Oct, 2024 1 commit
    • [CK_TILE] Update example README files & fix script compatibility issue (#1548) · 0c094daa
      Po Yen Chen authored
      * Fix text alignment of ArgParser::print()
      
      * Update example README files
      
      * Clarify make-ck-dev.sh <arch> usage
      
      * Only keep some of the argument from '-?' output
      
      * Undo command line output changes in README
      
      * Only keep existing argument on doc and update description
      
      * Fix text alignment
      
      * Make cmake-ck-*.sh compatible with 'sh' command
  4. 07 Sep, 2024 1 commit
    • Ck tile gemm example (#1488) · caacd388
      Thomas Ning authored
      
      
      * Checkpoint: Finished with the tile example & kernel verification, working on the different matrix layout
      
      * Finished the Matrix Layout feature set up. Note: Need to modify the inner block to solve the shuffle problem in the future.
      
      * Fix: Clang Format, API fixed from fmha
      
      * fix with better naming convention
      
      * revert back the pipeline code of fmha
      
      * Fixed: Addressed the comments and merge the GEMM shape of GEMM Operator and FMHA Operator to one.
      
      * clang format with the reference_gemm file
      
      * convert the clang format with the remod.py
      
      * Changed the format and variable name of the kernel gemm_shape and partitioner
      
      ---------
      Co-authored-by: thomasning <thomasning@banff-cyxtera-s70-4.ctr.dcgpu>