1. 01 Oct, 2023 1 commit
  2. 30 Sep, 2023 1 commit
  3. 28 Sep, 2023 4 commits
  4. 27 Sep, 2023 5 commits
  5. 26 Sep, 2023 4 commits
  6. 23 Sep, 2023 1 commit
  7. 22 Sep, 2023 2 commits
  8. 21 Sep, 2023 4 commits
    • add examples of multiA and broadcast · f60ad8b9
      Jing Zhang authored
    • Refactoring cmake files to build data types separately. (#932) · bba085d2
      Illia Silin authored
      * refactor cmake files for the tests
      
      * refactor cmake files for examples
      
      * fix cmake for gemm example
      
      * fix the cmake file for all examples
      
      * add splitting by data types in gemm_splitk instance header
      
      * rename test to reflect only dl instances are used
      
      * clean up CI workspace, update cmake for instances
      
      * change the jenkinsfile syntax
      
      * build all instances except DL on gfx11
      
      * move workspace cleanup after stages
      
      * clean up workspace after every stage
      
      * isolate data types in grouped_conv_fwd header
      
      * isolate dl instances for grouped_conv2d_fwd
      
      * fix syntax
      
      * fix cmake and batchnorm instances
      
      * fix typo
      
      * fix reduction instances
      
      * fix grouped_conv headers
      
      * fix syntax
      
      * replace parsing logic for instances, replace bfp16 with bf16
      
      * fix the client examples build
      
      * clean up DTYPES from instances cmake files
      
      * update the parsing logic in cmake files
      
      * make an exception for reduction kernels
      
      * update few remaining cmake files to handle DTYPES
      
      * fix syntax
      
      * fix cmake conflicts
      
      * replace f8 with fp8 test name
      
      * resolve conflicts for dpp instances
    • add examples · 0512580c
      Jing Zhang authored
    • init commit for contraction_multi_ABD · a8780c32
      Jing Zhang authored
  9. 20 Sep, 2023 2 commits
  10. 19 Sep, 2023 3 commits
  11. 18 Sep, 2023 3 commits
  12. 17 Sep, 2023 3 commits
  13. 16 Sep, 2023 2 commits
  14. 15 Sep, 2023 5 commits
    • Stylistic improvements for grouped convolution code · bc2d0583
      Bartlomiej Kocot authored
      Remove unnecessary ignoring
      
      Update test/grouped_convnd_bwd_weight/test_grouped_convnd_bwd_weight.cpp
    • allow packed elementwise_op · b7bc3c2b
      Jing Zhang authored
    • add multiABD example · d61d9edf
      Jing Zhang authored
    • Merge branch 'develop' into ab_blk_copy_multi_d · 5b48bf69
      zjing14 authored
    • Add fp16/fp8 support into Grouped gemm FixedNK (#874) · f9d0eddb
      zjing14 authored
      
      
      * move all arguments into device
      
      * add b2c_tile_map
      
      * add examples
      
      * add SetDeviceKernelArgs
      
      * dedicated fixed_nk solution
      
      * init client api
      
      * add grouped_gemm_bias example
      
      * add an instance
      
      * add instances
      
      * formatting
      
      * fixed cmake
      
      * Update EnableCompilerWarnings.cmake
      
      * Update cmake-ck-dev.sh
      
      * clean; fixed comments
      
      * fixed comment
      
      * add instances for fp32 output
      
      * add instances for fp32 output
      
      * add fp32 out client example
      
      * fixed CI
      
      * init commit for kbatch
      
      * add splitk gridwise
      
      * format
      
      * fixed
      
      * clean deviceop
      
      * clean code
      
      * finish splitk
      
      * fixed instances
      
      * change m_loops to tile_loops
      
      * add setkbatch
      
      * clean code
      
      * add splitK+bias
      
      * add instances
      
      * opt mk_nk instances
      
      * clean examples
      
      * fixed CI
      
      * remove zero
      
      * finished non-zero
      
      * clean
      
      * clean code
      
      * optimized global_barrier
      
      * fixed ci
      
      * fixed CI
      
      * instance and client
      
      * removed AddBias
      
      * format
      
      * fixed CI
      
      * fixed CI
      
      * move 20_grouped_gemm to 21_grouped_gemm
      
      * clean
      
      * formatting
      
      * clean
      
      * clean
      
      * fixed computeType
      
      ---------
      Co-authored-by: Jing Zhang <jizha@amd.com>