1. 03 Oct, 2023 2 commits
  2. 20 Sep, 2023 2 commits
  3. 31 Aug, 2023 1 commit
    • zjing14's avatar
      Grouped Gemm with Fixed K and N with SplitK (#818) · f5ec04f0
      zjing14 authored
      
      
      * move all arguments into device
      
      * add b2c_tile_map
      
      * add examples
      
      * add SetDeviceKernelArgs
      
      * dedicated fixed_nk solution
      
      * init client api
      
      * add grouped_gemm_bias example
      
      * add a instance
      
      * add instances
      
      * formatting
      
      * fixed cmake
      
      * Update EnableCompilerWarnings.cmake
      
      * Update cmake-ck-dev.sh
      
      * clean; fixed comments
      
      * fixed comment
      
      * add instances for fp32 output
      
      * add instances for fp32 output
      
      * add fp32 out client example
      
      * fixed CI
      
      * init commit for kbatch
      
      * add splitk gridwise
      
      * format
      
      * fixed
      
      * clean deviceop
      
      * clean code
      
      * finish splitk
      
      * fixed instances
      
      * change m_loops to tile_loops
      
      * add setkbatch
      
      * clean code
      
      * add splitK+bias
      
      * add instances
      
      * opt mk_nk instances
      
      * clean examples
      
      * fixed CI
      
      * remove zero
      
      * finished non-zero
      
      * clean
      
      * clean code
      
      * optimized global_barrier
      
      * fixed ci
      
      * fixed CI
      
      * removed AddBias
      
      * format
      
      * fixed CI
      
      * fixed CI
      
      * move 20_grouped_gemm to 21_grouped_gemm
      
      ---------
      Co-authored-by: default avatarJing Zhang <jizha@amd.com>
      f5ec04f0
  4. 16 Jun, 2023 1 commit
  5. 15 Jun, 2023 1 commit
    • Illia Silin's avatar
      Enable gfx941 and gfx942 architectures. (#752) · 027e46ee
      Illia Silin authored
      * enable gfx941/942 targets
      
      * fix clang format
      
      * fix the cmake logic for multiple targets
      
      * fix cmake syntax for looping over targets
      
      * add gfx941/942 support for gemm_xdl instances
      027e46ee
  6. 28 Apr, 2023 1 commit
  7. 29 Mar, 2023 1 commit
  8. 15 Mar, 2023 1 commit
  9. 24 Feb, 2023 1 commit
  10. 15 Feb, 2023 1 commit
  11. 02 Nov, 2022 1 commit
    • rocking5566's avatar
      Conv perlayer int8 quantization (#471) · 226bc02b
      rocking5566 authored
      * Add conv2d requant example
      
      * Fix bash error
      
      * Rename example
      
      * 1. Rename gemm quantization
      2. shares the requantization lambda function with conv
      
      * Refine declare type
      
      * Add conv bias relu quantization exmaple
      
      * clang format
      
      * Fix compile error due to merge develop
      
      * Fix CI error
      
      * Extract quantization post operation into another file
      
      * Support quantization for non piecewise linear function
      
      * Add instance for conv quantization
      
      * Add convolution quantization factory
      
      * Add convolution quantization client example
      
      * Add more instances with different template parameters
      
      * clang format
      
      * Sync the naming with the develop
      226bc02b
  12. 03 Oct, 2022 1 commit
    • Chao Liu's avatar
      update document: Readme, contributors, citation, (#463) · 473ba5bc
      Chao Liu authored
      * update cmake script
      
      * update readme
      
      * Update README.md
      
      * add citation
      
      * add images
      
      * Update README.md
      
      * update
      
      * Update README.md
      
      * Update CONTRIBUTORS.md
      
      * Update README.md
      
      * Update CITATION.cff
      
      * Update README.md
      
      * Update CITATION.cff
      473ba5bc