  1. 13 Oct, 2021 1 commit
  2. 12 Oct, 2021 2 commits
  3. 11 Oct, 2021 1 commit
  4. 10 Oct, 2021 2 commits
  5. 08 Oct, 2021 4 commits
  6. 07 Oct, 2021 2 commits
  7. 06 Oct, 2021 3 commits
    • [MIOpen Downstream] Fix Reduction Kernel (#34) · b2dc55f8
      Qianfeng authored
      
      
      * Tiny fix in using data type template parameters in blockwise and direct_threadwise kernel
      
      * Fix with regard to implementing GetZeroVal() in both kernel and host
      
      * Avoid converting to compType from dstDataType before writing the output value
      
      * Add half_t support to NumericLimits and make constexpr GetZeroVal() of binary operator
      
      * Add CONSTANT decorator for descriptor read buffer
      
      * Use get_thread_local_1d_id() for thread local Id
      
      * Rename GetZeroVal() to GetReductionZeroVal() in the kernels
      
      * Remove constexpr from initialized zeroVal and tiny fix in reduction_operator.hpp
      
      * Occasional tiny simplification and update in the kernel files
      
      * Update to re-order tensor dimensions on the host, split second_call kernel wrapper files and simplify reduce_all kernel wrappers
      
      * Update to remove OpenCL tidy checking failures
      
      * Update for better readability
      
      * Remove unused code and unneeded template parameters in the kernel wrappers
      Co-authored-by: Chao Liu <chao.liu2@amd.com>
    • Tweak GEMM kernel (#38) · b3e8d57d
      Chao Liu authored
      * add parameters
      
      * tweak gemm
      
      * tweak
      
      * update conv
      
      * update script
      
      * adding bwd 1x1
      
      * update script
      
      * adding 1x1 bwd
      
      * debugging bwd 1x1 failure
      
      * update script
      
      * update script
      
      * test
      
      * test v100
      
      * clean up
    • Add VectorType support into StaticBuffer (#27) · 846f462b
      zjing14 authored
      
      
      * init StaticBufferV2
      
      * clean
      
      * adopt old output stage for staticBufferV2
      
      * clean
      
      * remove hack
      
      * clean
      
      * clean
      
      * clean code
      
      * move c_buffer alloc into blockwise gemm
      
      * add adaptors for m/n_thread_data_on_grid
      
      * adjust blockwise_gemm_xdlops
      
      * reorder ops in GEMM hot loop
      Co-authored-by: Chao Liu <chao.liu2@amd.com>
  8. 04 Oct, 2021 2 commits
  9. 02 Oct, 2021 4 commits
  10. 01 Oct, 2021 1 commit
  11. 30 Sep, 2021 1 commit
  12. 29 Sep, 2021 1 commit
  13. 21 Sep, 2021 3 commits
  14. 17 Sep, 2021 1 commit
  15. 15 Sep, 2021 4 commits
  16. 14 Sep, 2021 2 commits
  17. 13 Sep, 2021 3 commits
  18. 12 Sep, 2021 2 commits
  19. 11 Sep, 2021 1 commit