1. 12 Oct, 2021 2 commits
  2. 10 Oct, 2021 2 commits
  3. 08 Oct, 2021 4 commits
  4. 07 Oct, 2021 1 commit
  5. 06 Oct, 2021 1 commit
    • Tweak GEMM kernel (#38) · b3e8d57d
      Chao Liu authored
      * add parameters
      
      * tweak gemm
      
      * tweak
      
      * update conv
      
      * update script
      
      * adding bwd 1x1
      
      * update script
      
      * adding 1x1 bwd
      
      * debugging bwd 1x1 failure
      
      * update script
      
      * update script
      
      * test
      
      * test v100
      
      * clean up
  6. 04 Oct, 2021 2 commits
  7. 02 Oct, 2021 4 commits
  8. 01 Oct, 2021 1 commit
  9. 30 Sep, 2021 1 commit
  10. 29 Sep, 2021 1 commit
  11. 15 Sep, 2021 4 commits
  12. 14 Sep, 2021 2 commits
  13. 13 Sep, 2021 3 commits
  14. 12 Sep, 2021 2 commits
  15. 11 Sep, 2021 3 commits
  16. 10 Sep, 2021 1 commit
  17. 09 Sep, 2021 2 commits
  18. 08 Sep, 2021 1 commit
  19. 05 Sep, 2021 1 commit
    • GEMM driver and kernel (#29) · 19613902
      Chao Liu authored
      * add gemm driver
      
      * tweak
      
      * add gemm kernel: mk_kn_mn and km_kn_mn
      
      * tweak
      
      * add GEMM km_nk_mn
      
      * fix comment
  20. 31 Aug, 2021 1 commit
    • Backward weight v4r4r2 with xdlops (#18) · 627d8ef3
      ltqin authored
      * start
      
      * modify transformation
      
      * modify device convolution
      
      * modify host
      
      * added host conv bwd and wrw
      
      * remove bwd, separate wrw
      
      * clean
      
      * hacall k to zero
      
      * out log
      
      * fixed
      
      * fixed
      
      * change to (out in wei)
      
      * input hack
      
      * hack to out
      
      * format
      
      * fix by comments
      
      * change wei hacks (wei transform has not been merged)
      
      * fix program once issue
      
      * fix review comment
      
      * fix vector load issue
      
      * tweak
      Co-authored-by: ltqin <letaoqin@amd.com>
      Co-authored-by: Jing Zhang <jizhan@amd.com>
      Co-authored-by: Chao Liu <chao.liu2@amd.com>
  21. 23 Aug, 2021 1 commit
    • Xdlops refactor fix (#22) · 9d3f634a
      zjing14 authored
      * added constexpr ahead of adaptor; clean unused driver; rename M/NPerWave to M/NPerXDL
      
      * fixed bwd
      
      * fixed comment