- 10 Aug, 2021 2 commits
- 09 Aug, 2021 8 commits
- 08 Aug, 2021 1 commit
-
-
Chao Liu authored
-
- 07 Aug, 2021 2 commits
- 06 Aug, 2021 7 commits
- 30 Jul, 2021 3 commits
- 28 Jul, 2021 3 commits
- 27 Jul, 2021 3 commits
-
-
Chao Liu authored
fix building issue
-
Chao Liu authored
-
Chao Liu authored
* update online kernel wrapper bundle all descriptors in a tuple
* change __CONSTANT__ to CONSTANT
* rename
* adding tuning
* added IsValidCompileParameter
* reorganize
* adding tunable for fp16 and int8
* fix kernel compile warning and bug fixes
* suppress warning about casting CONSTANT (address space 4) pointer
* fix building issue
-
- 18 Jul, 2021 1 commit
-
-
Chao Liu authored
* change olc cmake
* adding online compile to fwd-v4r5r2
* update scripts
* rename fwd-v4r5r2 to fwd-v6r1
* clean up
-
- 17 Jul, 2021 2 commits
-
-
zjing14 authored
* init for v4r4 xdlops olc
* refactor wrap
* init impl of v4r4 nchw xdlops olc
* tuning
* test perf
* fixed v4r4 nhwc
* tuned v4r4 nhwc
* use gridwise_gemm_xdlops_v2r3
* swap a/b
* add pointer support into offline v2r3
* debugging v4r4r4 transform for olc
* change timer of olc
* refactor v4r4 xdlops nchw olc
* remove transform fun in v4r4 xdlops nhwc olc

Co-authored-by: Chao Liu <chao.liu2@amd.com>
-
Chao Liu authored
* change init method
-
- 16 Jul, 2021 1 commit
-
-
Chao Liu authored
-
- 09 Jul, 2021 1 commit
-
-
Chao Liu authored
-
- 08 Jul, 2021 4 commits
- 05 Jul, 2021 1 commit
-
-
Chao Liu authored
* add threadwise copy that copies a tensor in one copy, added kpack to DL GEMM
* add kpack into fwd v4r5 nchw fp32
-
- 01 Jul, 2021 1 commit
-
-
Chao Liu authored
-