- 18 Oct, 2022 1 commit
-
-
wangshaojie6 authored
-
- 17 Oct, 2022 1 commit
-
-
wangshaojie6 authored
-
- 02 Sep, 2022 1 commit
-
-
wangshaojie6 authored
-
- 14 Jul, 2022 1 commit
-
-
wangshaojie6 authored
-
- 10 Jul, 2022 1 commit
-
-
Wenkai authored
-
- 04 Jul, 2022 1 commit
-
-
Wenkai authored
-
- 18 Jun, 2022 3 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 17 Jun, 2022 3 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 16 Jun, 2022 3 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 14 Jun, 2022 2 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 13 Jun, 2022 3 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 12 Jun, 2022 2 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 11 Jun, 2022 1 commit
-
-
wangshaojie6 authored
-
- 10 Jun, 2022 2 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 09 Jun, 2022 2 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 08 Jun, 2022 4 commits
-
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
wangshaojie6 authored
-
- 02 Jun, 2022 2 commits
-
-
Shaojie WANG authored
-
Qianfeng authored
* Use the unified naming for math functions on host and HIP kernel
* Corresponding change/simplification in reduction host/profiler/examples due to unified math functions renaming
* Renaming GetReductionZeroVal() to GetIdentityValue()
* Tiny renaming in profile_reduce_impl.hpp
* More renaming in profile_reduce_impl.hpp
* Replace zeroVal by identityVal
* Remove ck_ prefix in the naming of ck::math provided functions
-
- 01 Jun, 2022 1 commit
-
-
ltqin authored
-
- 31 May, 2022 3 commits
-
-
zjing14 authored
* moved gemm_descs_args into const buff
* use CK_CONSTANT_ADDRESS_SPACE instead of global constant
* clean
* moved hipMemAlloc outside of deviceOp
* add SetWorkSpacePointer
* fix ignore
-
myamlak authored
* Reference CGEMM + test stub
* Format.
* Incomplete simple implementation
* Library instances
* Sketch of tests
* Test fixes.
* Example added
* Cosmetics
* Add elementwise operation kernel and example
* Add comment
* Add template argument of dim. Prepare to support multiple dimension
* Rename example
* Support 1 dimension
* Add static assert
* Add comment
* Second auxiliary buffer added
* Extract pad
* Remove redundant argument
* Support any dimension for elementwise operation
* Remove line
* Let it be the multiple number of CU
* Move thread per block to the parameter of constructor
* Consuming binary ops to do A+B / A-B
* Fix + cosmetics + bf16 test commented out temporarily
* Format
* Enabling bf16 test
* Revert "Enabling bf16 test" This reverts commit f497e2ba.
* Fix + test reenabled
* fix build
* Revert "fix build" This reverts commit d7310238.
* post PR #235 merge fix
* amend
* Single workspace for cgemm + helper
* Perf calc fix
* Review remarks: static_cast
* Review remarks: binary ops templated
* Cleaning
* Removal of instances and their tests
* Review remarks from aosew addressed
* Review remark: unnecessary attribute
* Post-merge fixes
* Restrict 4gemm to PassThrough + bug fix
* Review remarks
* update licence
* change cgemm example to fp16

Co-authored-by: rocking <chunylai@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: Anthony Chang <ac.chang@outlook.com>
-
Chao Liu authored
* fix example
* update IsSupportedArgument
* fix
* disable fp64 conv example as test
-
- 30 May, 2022 3 commits
-
-
rocking5566 authored
* Implement reduction mean and reduction square mean
* Refine file name
* Add reduce mean and square mean
* Fix parameter name
* Add normalize device op (invoker::run() not implemented)
* Remove epsilon
* Refine deviceop
* Add 5ary elementwise for normalization
* Add layernorm example
* layerNorm verification
* Fix compiler error due to merge from develop
* Fix typo
* Fix compile error
* Refine naming
* [What] Support non pointer for invoker and argument [Why] Sync coding style with gemm
* Refine folder name
* Refine class name
* Evaluate perf of the kernel
* Fix compile error
* [What] Refine perf evaluation in example of gemm + reduction [Why] Evaluation of gemm + reduction may cause verification to fail, because evaluation does not initialize global memory
* clang-format
-
ltqin authored
-
ltqin authored
-