1. 02 Sep, 2022 1 commit
  2. 14 Jul, 2022 1 commit
  3. 10 Jul, 2022 1 commit
  4. 04 Jul, 2022 1 commit
  5. 18 Jun, 2022 3 commits
  6. 17 Jun, 2022 3 commits
  7. 16 Jun, 2022 3 commits
  8. 14 Jun, 2022 2 commits
  9. 13 Jun, 2022 3 commits
  10. 12 Jun, 2022 2 commits
  11. 11 Jun, 2022 1 commit
  12. 10 Jun, 2022 2 commits
  13. 09 Jun, 2022 2 commits
  14. 08 Jun, 2022 4 commits
  15. 02 Jun, 2022 2 commits
  16. 01 Jun, 2022 1 commit
  17. 31 May, 2022 3 commits
    • Pass gemm_descs for grouped gemm via __constant__ buff (#232) · b6eaf3eb
      zjing14 authored
      * moved gemm_descs_args into const buff
      
      * use CK_CONSTANT_ADDRESS_SPACE instead of global constant
      
      * clean
      
      * moved hipMemAlloc outside of deviceOp
      
      * add SetWorkSpacePointer
      
      * fix ignore
    • Multi-kernel CGEMM (#230) · 7b1e2c37
      myamlak authored
      * Reference CGEMM + test stub
      
      * Format.
      
      * Incomplete simple implementation
      
      * Library instances
      
      * Sketch of tests
      
      * Test fixes.
      
      * Example added
      
      * Cosmetics
      
      * Add elementwise operation kernel and example
      
      * Add comment
      
      * Add template argument of dim. Prepare to support multiple dimensions
      
      * Rename example
      
      * Support 1 dimension
      
      * Add static assert
      
      * Add comment
      
      * Second auxiliary buffer added
      
      * Extract pad
      
      * Remove redundant argument
      
      * Support any dimension for elementwise operation
      
      * Remove line
      
      * Let it be the multiple number of CU
      
      * Move thread per block to the parameter of constructor
      
      * Consuming binary ops to do A+B / A-B
      
      * Fix + cosmetics + bf16 test commented out temporarily
      
      * Format
      
      * Enabling bf16 test
      
      * Revert "Enabling bf16 test"
      
      This reverts commit f497e2ba.
      
      * Fix + test reenabled
      
      * fix build
      
      * Revert "fix build"
      
      This reverts commit d7310238.
      
      * post PR #235 merge fix
      
      * amend
      
      * Single workspace for cgemm + helper
      
      * Perf calc fix
      
      * Review remarks: static_cast
      
      * Review remarks: binary ops templated
      
      * Cleaning
      
      * Removal of instances and their tests
      
      * Review remarks from aosew addressed
      
      * Review remark: unnecessary attribute
      
      * Post-merge fixes
      
      * Restrict 4gemm to PassThrough + bug fix
      
      * Review remarks
      
      * update licence
      
      * change cgemm example to fp16
      Co-authored-by: rocking <chunylai@amd.com>
      Co-authored-by: Chao Liu <chao.liu2@amd.com>
      Co-authored-by: Anthony Chang <ac.chang@outlook.com>
    • Minor fix for recent PR (#260) · 85fc91c3
      Chao Liu authored
      * fix example
      
      * update IsSupportedArgument
      
      * fix
      
      * disable fp64 conv example as test
  18. 30 May, 2022 3 commits
    • gemm + layernorm (#261) · d32a67a9
      rocking5566 authored
      * Implement reduction mean and reduction square mean
      
      * Refine file name
      
      * Add reduce mean and square mean
      
      * Fix parameter name
      
      * Add normalize device op (invoker::run() not implemented)
      
      * Remove epsilon
      
      * Refine deviceop
      
      * Add 5ary elementwise for normalization
      
      * Add layernorm example
      
      * layerNorm verification
      
      * Fix compiler error due to merge from develop
      
      * Fix typo
      
      * Fix compile error
      
      * Refine naming
      
      * [What] Support non pointer for invoker and argument
      [Why] Sync coding style with gemm
      
      * Refine folder name
      
      * Refine class name
      
      * Evaluate perf of the kernel
      
      * Fix compile error
      
      * [What] Refine perf evaluation in example of gemm + reduction
      [Why] Evaluation of gemm + reduction may cause verification to fail, because evaluation does not initialize global memory
      
      * clang-format
    • fix bug after merge develop · b571256f
      ltqin authored
    • change file name · 7d85d04a
      ltqin authored
  19. 29 May, 2022 2 commits