1. 11 Sep, 2023 1 commit
  2. 08 Sep, 2023 1 commit
  3. 07 Sep, 2023 2 commits
  4. 06 Sep, 2023 1 commit
  5. 01 Sep, 2023 1 commit
  6. 31 Aug, 2023 7 commits
    • Merge branch 'develop' into lwpck-756 · 1ddc3ec7
      Rostyslav Geyyer authored
    • e9703d5b
      Rostyslav Geyyer authored
    • Remove is_native method · 4ace6c86
      Rostyslav Geyyer authored
    • 53dba87a
      Rostyslav Geyyer authored
    • Grouped Gemm with Fixed K and N with SplitK (#818) · f5ec04f0
      zjing14 authored
      
      
      * move all arguments into device
      
      * add b2c_tile_map
      
      * add examples
      
      * add SetDeviceKernelArgs
      
      * dedicated fixed_nk solution
      
      * init client api
      
      * add grouped_gemm_bias example
      
      * add an instance
      
      * add instances
      
      * formatting
      
      * fixed cmake
      
      * Update EnableCompilerWarnings.cmake
      
      * Update cmake-ck-dev.sh
      
      * clean; fixed comments
      
      * fixed comment
      
      * add instances for fp32 output
      
      * add instances for fp32 output
      
      * add fp32 out client example
      
      * fixed CI
      
      * init commit for kbatch
      
      * add splitk gridwise
      
      * format
      
      * fixed
      
      * clean deviceop
      
      * clean code
      
      * finish splitk
      
      * fixed instances
      
      * change m_loops to tile_loops
      
      * add setkbatch
      
      * clean code
      
      * add splitK+bias
      
      * add instances
      
      * opt mk_nk instances
      
      * clean examples
      
      * fixed CI
      
      * remove zero
      
      * finished non-zero
      
      * clean
      
      * clean code
      
      * optimized global_barrier
      
      * fixed ci
      
      * fixed CI
      
      * removed AddBias
      
      * format
      
      * fixed CI
      
      * fixed CI
      
      * move 20_grouped_gemm to 21_grouped_gemm
      
      ---------
      Co-authored-by: Jing Zhang <jizha@amd.com>
    • MaxPool & AvgPool bwd instances, test, ckProfiler, client example (#861) · 866377de
      rocking authored
      * Add maxpool instances
      
      * Rename index pool to max pool.
      
      * Add maxpool bwd bf16 instances
      
      * Add avg pool bwd instances
      
      * Rename avgpool and maxpool to avg_pool3d and max_pool
      
      * Add bf16 pool fwd instances
      
      * Add max pool bwd to ckProfiler
      
      * Add avg pool3d bwd to ckProfiler
      
      * Add avg pool bwd test
      
      * Fix bug of reference pool fwd (dilation)
      
      * Fix bug of max pool bwd (dilation and initZero)
      
      * Support bf16 compute data type
      
      * Force compute type to be f32, because atomicAdd only supports f32
      
      * Add max pool bwd test
      
      * Rename folder
      
      * Rename pool
      
      * Add max pool bwd client example
      
      * Add avg pool bwd client example
      
      * Add missing workspace
      
      * clang format
      
      * Rename macro
      
      * remove useless header
      
      * remove useless layout
    • fix gemm_streamk example on mi300 (#875) · bf1912ed
      Illia Silin authored
  7. 30 Aug, 2023 6 commits
  8. 29 Aug, 2023 6 commits
  9. 28 Aug, 2023 1 commit
  10. 23 Aug, 2023 4 commits
  11. 22 Aug, 2023 3 commits
  12. 18 Aug, 2023 1 commit
  13. 17 Aug, 2023 1 commit
  14. 14 Aug, 2023 2 commits
    • d4c84256
      Bartlomiej Wroblewski authored
    • Refactor pool fwd (#815) · f60f0a5e
      rocking authored
      * Do not hardcode stride
      
      * devicePool2DFwd inherits devicePool3DFwd
      
      * Move instance declaration out of common
      
      * Add dilation
      
      * use the pool3d rank, because pool2d inherits pool3d
      
      * calculate Do Ho Wo for the dilation
      
      * Fix header name
      
      * Modify ckProfiler
      
      * Remove pool2d instance
      
      * Remove pool2d in profiler
      
      * Remove pool2d and add dilation
      
      * In the client example, this commit revises the following:
      1. Add dilation.
      2. Use pool3d to implement pool2d
      
      * Refine naming and IsSupportedArgument()
      
      * Add dilation to maxpool bwd example
      
      * clang format
      
      * 1. Remove useless header
      2. Fix copyright
      3. Refine naming
      
      * Add layout parameter to pool fwd
      
      * clang format
      
      * Fix merge error
      
      * Fix compile error
      
      * Remove layout parameter in derived class
      
      * Refine changelog
      
      * Fix compile error
      
      * Fix compiler error
      
      * Add layout to external api and profiler
  15. 11 Aug, 2023 2 commits
  16. 10 Aug, 2023 1 commit