1. 31 Aug, 2023 3 commits
    • Grouped Gemm with Fixed K and N with SplitK (#818) · f5ec04f0
      zjing14 authored
      
      
      * move all arguments into device
      
      * add b2c_tile_map
      
      * add examples
      
      * add SetDeviceKernelArgs
      
      * dedicated fixed_nk solution
      
      * init client api
      
      * add grouped_gemm_bias example
      
      * add an instance
      
      * add instances
      
      * formatting
      
      * fixed cmake
      
      * Update EnableCompilerWarnings.cmake
      
      * Update cmake-ck-dev.sh
      
      * clean; fixed comments
      
      * fixed comment
      
      * add instances for fp32 output
      
      * add instances for fp32 output
      
      * add fp32 out client example
      
      * fixed CI
      
      * init commit for kbatch
      
      * add splitk gridwise
      
      * format
      
      * fixed
      
      * clean deviceop
      
      * clean code
      
      * finish splitk
      
      * fixed instances
      
      * change m_loops to tile_loops
      
      * add setkbatch
      
      * clean code
      
      * add splitK+bias
      
      * add instances
      
      * opt mk_nk instances
      
      * clean examples
      
      * fixed CI
      
      * remove zero
      
      * finished non-zero
      
      * clean
      
      * clean code
      
      * optimized global_barrier
      
      * fixed ci
      
      * fixed CI
      
      * removed AddBias
      
      * format
      
      * fixed CI
      
      * fixed CI
      
      * move 20_grouped_gemm to 21_grouped_gemm
      
      ---------
      Co-authored-by: Jing Zhang <jizha@amd.com>
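The split-K approach named in the commit above can be illustrated with a minimal sketch. It is not the Composable Kernel implementation: it is plain Python, and the names `matmul`, `splitk_matmul`, and `grouped_splitk_gemm` are hypothetical. It assumes all groups share the same K and N while M varies per group, and it shows that summing partial products over `kbatch` chunks of K reproduces the full GEMM.

```python
def matmul(a, b):
    """Plain reference GEMM: a is M x K, b is K x N (lists of lists)."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def splitk_matmul(a, b, kbatch):
    """Split K into `kbatch` chunks and accumulate the partial products."""
    m, k, n = len(a), len(b), len(b[0])
    chunk = (k + kbatch - 1) // kbatch  # ceil-divide K across the batches
    c = [[0] * n for _ in range(m)]
    for kb in range(kbatch):
        lo, hi = kb * chunk, min((kb + 1) * chunk, k)
        for i in range(m):
            for j in range(n):
                # On a GPU each chunk would run in its own workgroup and
                # accumulate into C; here the chunks simply run in sequence.
                c[i][j] += sum(a[i][p] * b[p][j] for p in range(lo, hi))
    return c


def grouped_splitk_gemm(a_groups, b_groups, kbatch):
    """Grouped GEMM: one independent GEMM per group (assumed: only M varies)."""
    return [splitk_matmul(a, b, kbatch) for a, b in zip(a_groups, b_groups)]
```

Because every chunk adds into the same output, a real kernel needs either atomic accumulation or a separate reduction step; which one the commit uses is not stated in the log.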
    • MaxPool & AvgPool bwd instances, test, ckProfiler, client example (#861) · 866377de
      rocking authored
      * Add maxpool instances
      
      * Rename index pool to max pool.
      
      * Add maxpool bwd bf16 instances
      
      * Add avg pool bwd instances
      
      * Rename avgpool and maxpool to avg_pool3d and max_pool
      
      * Add bf16 pool fwd instances
      
      * Add max pool bwd to ckProfiler
      
      * Add avg pool3d bwd to ckProfiler
      
      * Add avg pool bwd test
      
      * Fix bug of reference pool fwd (dilation)
      
      * Fix bug of max pool bwd (dilation and initZero)
      
      * Support bf16 compute data type
      
      * Force compute type to be f32, because atomicAdd only supports f32
      
      * Add max pool bwd test
      
      * Rename folder
      
      * Rename pool
      
      * Add max pool bwd client example
      
      * Add avg pool bwd client example
      
      * Add missing workspace
      
      * clang format
      
      * Rename macro
      
      * remove useless header
      
      * remove useless layout
    • fix gemm_streamk example on mi300 (#875) · bf1912ed
      Illia Silin authored
  2. 30 Aug, 2023 2 commits
  3. 29 Aug, 2023 1 commit
  4. 28 Aug, 2023 1 commit
  5. 23 Aug, 2023 1 commit
    • [HotFix] add config and version files to pass on build info (#856) · c8a8385f
      Jun Liu authored
      * experiment with config file
      
      * experiment with version.h config
      
      * add more info to version.h
      
      * minor updates
      
      * minor updates
      
      * fix case where DTYPE is not used
      
      * large amount of files but minor changes
      
      * remove white space
      
      * minor changes to add more MACROs
      
      * fix cmakedefine01
      
      * fix issue with CK internal conflict
      
      * fix define and define value
      
      * fix clang-format
      
      * fix formatting issue
      
      * experiment with cmake
      
      * clang format v12 to be consistent with miopen
      
      * avoid clang-format for config file
  6. 22 Aug, 2023 3 commits
  7. 18 Aug, 2023 1 commit
  8. 17 Aug, 2023 1 commit
  9. 16 Aug, 2023 1 commit
  10. 14 Aug, 2023 3 commits
    • Bartlomiej Wroblewski's avatar
      d4c84256
    • fixed CI · 54df59bf
      Jing Zhang authored
    • Refactor pool fwd (#815) · f60f0a5e
      rocking authored
      * Do not hardcode stride
      
      * devicePool2DFwd inherits from devicePool3DFwd
      
      * Move instance declaration out of common
      
      * Add dilation
      
      * use the pool3d rank, because pool2d inherits from pool3d
      
      * calculate Do Ho Wo for the dilation
      
      * Fix header name
      
      * Modify ckProfiler
      
      * Remove pool2d instance
      
      * Remove pool2d in profiler
      
      * Remove pool2d and add dilation
      
      * In the client example, this commit revises the following:
      1. Add dilation.
      2. Use pool3d to implement pool2d
      
      * Refine naming and IsSupportedArgument()
      
      * Add dilation to maxpool bwd example
      
      * clang format
      
      * 1. Remove useless header
      2. Fix copyright
      3. Refine naming
      
      * Add layout parameter to pool fwd
      
      * clang format
      
      * Fix merge error
      
      * Fix compile error
      
      * Remove layout parameter in derived class
      
      * Refine changelog
      
      * Fix compile error
      
      * Fix compiler error
      
      * Add layout to external api and profiler
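The "calculate Do Ho Wo for the dilation" step above refers to the output depth, height, and width of the pooling result. A minimal sketch of the usual output-size formula follows, assuming floor division and explicit left/right padding; the function names are hypothetical and this is not copied from the library.

```python
def pool_output_length(in_len, window, stride, dilation, pad_left, pad_right):
    """Output length along one spatial axis of a dilated pooling window."""
    # A dilated window of `window` taps spans this many input positions.
    effective_window = dilation * (window - 1) + 1
    return (in_len + pad_left + pad_right - effective_window) // stride + 1


def pool3d_output_shape(in_dhw, window, stride, dilation, pad_left, pad_right):
    """Apply the 1-D formula to depth, height, and width to get (Do, Ho, Wo)."""
    return tuple(
        pool_output_length(i, w, s, d, pl, pr)
        for i, w, s, d, pl, pr in zip(in_dhw, window, stride, dilation,
                                      pad_left, pad_right)
    )
```

With dilation 1 the effective window equals the window size, so the formula reduces to the familiar non-dilated case.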
  11. 11 Aug, 2023 1 commit
  12. 10 Aug, 2023 1 commit
    • Average pool backward deviceOP and example (#797) · 578142db
      rocking authored
      * Add avgpool bwd reference code
      
      * Refine naming
      
      * Fix invalid in_element op in ref_conv
      
      * Add example (only reference now)
      
      * Add the full example of avgpool bwd
      
      * Fix copyright
      
      * Imitate MakeDescriptor from transform_conv_bwd_data_to_gemm_v1.hpp
      
      * rename channel to c from k
      
      * Arrange the code
      
      * Imitate the argument from conv bwd
      
      * Implement invoker
      
      * Fix order of parameter in example
      
      * Refactor reference code for different dimension
      
      * Support different stride
      
      * Check if argument is valid
      
      * Fix kernel parameter for NDHWC, fastest dimension C is not reduced
      
      * Add more data type in example
      
      * Fix bug in example
      
      * calculate Do Ho Wo according to the dilation
      
      * Remove useless header
      
      * Add comment in reference code
      
      * Add layout parameter
      
      * Remove layout in derived class
      
      * Refine reference comment
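What an average pool backward reference computes can be sketched in one dimension. This is an illustration only, not the reference code added by the commit: `avg_pool1d_bwd` is a hypothetical name, padding is omitted, and dividing by the full window size is an assumption about the averaging convention.

```python
def avg_pool1d_bwd(dout, in_len, window, stride, dilation=1):
    """Scatter each output gradient evenly over the inputs in its window."""
    din = [0.0] * in_len
    for o, grad in enumerate(dout):
        for w in range(window):
            i = o * stride + w * dilation  # input index touched by this tap
            if i < in_len:
                # Overlapping windows add into the same input element,
                # which is why a GPU version needs accumulation.
                din[i] += grad / window
    return din
```

For example, with `window=2` and `stride=1` the middle input element receives a contribution from both neighboring outputs.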
  13. 09 Aug, 2023 2 commits
  14. 07 Aug, 2023 1 commit
  15. 05 Aug, 2023 1 commit
  16. 03 Aug, 2023 3 commits
  17. 02 Aug, 2023 1 commit
  18. 30 Jul, 2023 1 commit
  19. 27 Jul, 2023 7 commits
  20. 26 Jul, 2023 5 commits
    • initial stream-k implementation with example (#699) · e7dca79d
      carlushuang authored
      
      
      * initial stream-k implementation with example
      
      * fix unexpected change in err
      
      * improve performance a little by reorganizing the pipeline.
      
      * improve perf a little by swizzling the block idx
      
      * add profiler
      
      * update example
      
      * fix spelling
      
      * shrink karg for streamk
      
      * support dynamic buffer using memory coherence glc_slc bit from template
      
      * control memory coherence while constructing the dynamic buffer
      
      * update reduction for streamk (not ready yet)
      
      * Add template parameter to make_dynamic_buffer to support amd_buffer coherence setting
      
      * fix build issue
      
      * fix several bugs
      
      * now result is correct, everything works (but has scratch)
      
      * remove scratch by manually reset coordinate
      
      * update device code
      
      * fix a bug in final reduce
      
      * fix something in example
      
      * update async memset
      
      * fix enum as camel case
      
      * modify coherence enum name
      
      * clean code and use atomic streamk by default
      
      * remove unused var
      
      * throw an exception if a pointer is empty
      
      * fix format
      
      * fix CI warning
      
      * fix type in init
      
      * modify CI error
      
      * filter out on gfx10+
      
      * restore changed example code
      
      ---------
      Co-authored-by: Qianfeng Zhang <Qianfeng.Zhang@amd.com>
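The general stream-k idea behind the commit above is to divide the total number of K-loop iterations evenly across workgroups instead of assigning whole output tiles. The sketch below shows only that partitioning step, under the assumption of equal iterations per tile; `streamk_partition` and its tuple layout are hypothetical and do not reflect the kernel's actual data structures.

```python
def streamk_partition(num_tiles, iters_per_tile, num_workgroups):
    """Give each workgroup a contiguous range of the global iteration space.

    Returns, per workgroup, a list of (tile, k_iter_begin, k_iter_end)
    segments, where the iteration bounds are relative to the tile.
    """
    total = num_tiles * iters_per_tile
    per_wg = (total + num_workgroups - 1) // num_workgroups  # ceil division
    work = []
    for wg in range(num_workgroups):
        start, end = wg * per_wg, min((wg + 1) * per_wg, total)
        segments = []
        it = start
        while it < end:
            tile = it // iters_per_tile
            # Stop at the tile boundary or at the end of this workgroup's range.
            seg_end = min((tile + 1) * iters_per_tile, end)
            segments.append((tile, it - tile * iters_per_tile,
                             seg_end - tile * iters_per_tile))
            it = seg_end
        work.append(segments)
    return work
```

A tile whose iterations are split between workgroups holds only partial sums in each, so the partial results must be combined; the log mentions both a reduction path and an atomic variant that is used by default.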
    • clean deviceop · 91075f0f
      Jing Zhang authored
    • Disable XDL kernels on unsupported HW; add ck::is_xdl_supported (#768) · ac6d68b3
      Bartłomiej Kocot authored
      
      
      * Disable XDL kernels on unsupported HW; Add ck::is_xdl_supported function (#765)
      
      * Do not throw an error when GEMM problem is not supported.
      
      ---------
      Co-authored-by: Bartlomiej Wroblewski <bwroblewski10@gmail.com>
      Co-authored-by: Adam Osewski <aosewski@amd.com>
      Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
    • fixed · c0264b8f
      Jing Zhang authored
    • format · 510d6464
      Jing Zhang authored