- 12 Jun, 2022 1 commit
-
-
wangshaojie6 authored
-
- 11 Jun, 2022 4 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 10 Jun, 2022 7 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 09 Jun, 2022 4 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 08 Jun, 2022 4 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 01 Jun, 2022 2 commits
- 31 May, 2022 2 commits
- 30 May, 2022 4 commits
-
-
rocking5566 authored
* Implement reduction mean and reduction square mean * Refine file name * Add reduce mean and square mean * Fix parameter name * Add normalize device op (invoker::run() not implemented) * Remove epsilon * Refine deviceop * Add 5ary elementwise for normalization * Add layernorm example * layerNorm verification * Fix compiler error due to merge from develop * Fix typo * Fix compile error * Refine naming * [What] Support non-pointer for invoker and argument [Why] Sync coding style with gemm * Refine folder name * Refine class name * Evaluate perf of the kernel * Fix compile error * [What] Refine perf evaluation in example of gemm + reduction [Why] Evaluation of gemm + reduction may cause verification to fail, because evaluation does not initialize global memory * clang-format
-
ltqin authored
-
ltqin authored
-
ltqin authored
-
- 29 May, 2022 2 commits
- 28 May, 2022 4 commits
- 27 May, 2022 2 commits
-
-
Chao Liu authored
* debugging conv * fix oversight where ctile map is constructed before initializing c desc * example program should return error code * clean up * changed Block2CTileMap in conv2d and convnd * clean up * clean up * cleanup Co-authored-by: Anthony Chang <ac.chang@outlook.com>
-
ltqin authored
-
- 26 May, 2022 3 commits
-
-
ltqin authored
* add intrin_mfma_f64_16x16x4f64 * add example * gemm reference add double data type * change init data * fix M N PerXdlops * fix ifdef * add comparison config * add conv fwd example * format log out * change rc matrix register layout * reorganize example * reorganize example 2 * format, because merge develop * fix call impl adding acc data type * lost ; * add compiler warning * change example tuning parameters * add test for fp64 * add instance * add test/gemm/gemm_fp64.cpp * fix get name issue * remove some tuning parameter * fix conflict * format * use integer value for GEMM test * add acc data type * remove typeid because fp16 * fix streamconfig etc bug from merging develop * format * remove test_gemm_xdl_fp64 * add AccDataType * AccDataType problem Co-authored-by: qinletao <letaoqin@amd.com> Co-authored-by: Chao Liu <chao.liu2@amd.com>
-
Qianfeng authored
* Add example for computing LayerNorm mean and meansquare * Refactor the pool2d_fwd example and add example for float type testing * Revert "Add example for computing LayerNorm mean and meansquare" This reverts commit df52e6f9d897b00c981baa48f291450bcd60925d. * Tiny fix in pool2d_fwd_common.hpp
-
ltqin authored
-
- 25 May, 2022 1 commit
-
-
rocking5566 authored
* Support different length of ScalarPerVector * Add example of broadcast on fastest axis * Typo * Refine fastest example * Add dimension check * Modify fastest broadcast example to 3d * Require users to give scalarPerVector explicitly * 1. Add CScalarPerVector 2. Broadcast on fastest axis is not the only case that needs scalarPerVector set to 1 * Rename var * Move IsScalarPerVectorValid() inside IsSupportedArgument() * Separate GridDesc_M0 into A, B and C * rename var * Rename var of length Co-authored-by: rocking <chunylai@amd.com>
-