- 27 Aug, 2021 1 commit
Chao Liu authored
* use cast_pointer_to_generic_address_space() in v6r1 kernel wrapper; DynamicBuffer and buffer_load take a customized invalid-element value; add buffer_load/store for fp64
* use remove_cvref_t
- 25 Aug, 2021 1 commit
zjing14 authored
* add f32/i32 atomicAdd support to DynamicBuffer, and enable it in v1r3
* fixed
* fixed
* update comment
Co-authored-by: Chao Liu <chao.liu2@amd.com>
- 16 Aug, 2021 1 commit
Chao Liu authored
- 13 Aug, 2021 1 commit
Chao Liu authored
- 10 Aug, 2021 2 commits
- 09 Aug, 2021 1 commit
Chao Liu authored
- 27 Jul, 2021 1 commit
Chao Liu authored
* update online kernel wrapper, bundle all descriptors in a tuple
* change __CONSTANT__ to CONSTANT
* rename
* adding tuning
* added IsValidCompileParameter
* reorganize
* adding tunable for fp16 and int8
* fix kernel compile warning and bug fixes
* suppress warning about casting CONSTANT (address space 4) pointer
* fix building issue
- 05 Jul, 2021 1 commit
Chao Liu authored
* add threadwise copy that copies a tensor in one copy; add kpack to DL GEMM
* add kpack to fwd v4r5 nchw fp32
- 12 May, 2021 1 commit
Chao Liu authored
* Use DynamicBuffer to hold raw pointer (to global and LDS memory)
* add workaround for compiler issue (inefficient ISA) of ds_write for int8x4, int8x8, int8x16