- 01 Nov, 2021 1 commit
-
-
Jing Zhang authored
-
- 29 Oct, 2021 6 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 28 Oct, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 27 Oct, 2021 3 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 22 Oct, 2021 1 commit
-
-
Jing Zhang authored
-
- 15 Oct, 2021 3 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 14 Oct, 2021 5 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 13 Oct, 2021 1 commit
-
-
Jing Zhang authored
-
- 12 Oct, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 11 Oct, 2021 1 commit
-
-
Jing Zhang authored
-
- 10 Oct, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 08 Oct, 2021 4 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 07 Oct, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 06 Oct, 2021 3 commits
-
-
Qianfeng authored
* Tiny fix to the use of data type template parameters in the blockwise and direct_threadwise kernels
* Fix with regard to implementing GetZeroVal() in both kernel and host
* Avoid converting to compType from dstDataType before writing the output value
* Add half_t support to NumericLimits and make GetZeroVal() of binary operator constexpr
* Add CONSTANT decorator for descriptor read buffer
* Use get_thread_local_1d_id() for thread local Id
* Rename GetZeroVal() to GetReductionZeroVal() in the kernels
* Remove constexpr from initialized zeroVal and tiny fix in reduction_operator.hpp
* Occasional tiny simplification and update in the kernel files
* Update to re-order tensor dimensions on the host, split second_call kernel wrapper files and simplify reduce_all kernel wrappers
* Update to remove OpenCL tidy checking failures
* Update for better readability
* Remove unused code and unneeded template parameters in the kernel wrappers
Co-authored-by: Chao Liu <chao.liu2@amd.com>
-
Chao Liu authored
* add parameters
* tweak gemm
* tweak
* update conv
* update script
* adding bwd 1x1
* update script
* adding 1x1 bwd
* debugging bwd 1x1 failure
* update script
* update script
* test
* test v100
* clean up
-
zjing14 authored
* init StaticBufferV2
* clean
* adopt old output stage for staticBufferV2
* clean
* remove hack
* clean
* clean
* clean code
* move c_buffer alloc into blockwise gemm
* add adaptors for m/n_thread_data_on_grid
* adjust blockwise_gemm_xdlops
* reorder ops in GEMM hot loop
Co-authored-by: Chao Liu <chao.liu2@amd.com>
-
- 04 Oct, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 02 Oct, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-