- 28 Nov, 2024 3 commits
- 27 Nov, 2024 3 commits
- 26 Nov, 2024 3 commits
- 25 Nov, 2024 1 commit
-
-
letaoqin authored
-
- 22 Nov, 2024 4 commits
- 21 Nov, 2024 1 commit
-
-
“letaoqin” authored
-
- 20 Nov, 2024 1 commit
-
-
“letaoqin” authored
-
- 19 Nov, 2024 1 commit
-
-
“letaoqin” authored
-
- 16 Nov, 2024 1 commit
-
-
“letaoqin” authored
-
- 15 Nov, 2024 1 commit
-
-
letaoqin authored
-
- 14 Nov, 2024 5 commits
- 13 Nov, 2024 3 commits
-
-
carlushuang authored
-
carlushuang authored
-
carlushuang authored
-
- 12 Nov, 2024 3 commits
-
-
Illia Silin authored
-
carlushuang authored
-
Thomas Ning authored
* Finished the feature
* Modified the test file
* Test case update
* Address comment
* Addressed the review comment
* Fixed the CI error
-
- 11 Nov, 2024 6 commits
-
-
Illia Silin authored
-
valarLip authored
* [CK_TILE] add more strides for layernorm to support non-contiguous Tensor
* align CK coding style
* extend strides to layernorm example
* clang-format...
-
carlushuang authored
-
carlushuang authored
-
carlushuang authored
-
Po Yen Chen authored
-
- 09 Nov, 2024 2 commits
-
-
dummycoderfe authored
* add moe_sorting & check ok
* fix comments & typo
* Run remod.py under include/ck_tile & example/ck_tile directories
* format codes
* fix output ci check bug
* fix moe sorting readme and error commit file
* use magic div to accelerate compute
* add a loop unroll for moe lds ops
* add extblocksnel to set zeros for moebufs
* [Ck_tile] moe set zero run ok, add size check and fix ref check
* [Ck_tile] fix moe_sorting fuse set_zero remod
* [Ck_tile] change name style, fix zero buffer size err, change folder
* [Ck_tile] moe_sorting: fix name style
* [Ck_tile] moe_sorting, remove useless params in traits
* [Ck_tile] change outputtile cnt * unit_size; change output buf alloc
---------
Co-authored-by: dummycoderfe <noplydummmycoder@163.com>
Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>
Co-authored-by: carlushuang <carlus.huang@amd.com>
-
Po Yen Chen authored
-
- 08 Nov, 2024 2 commits
-
-
Bartłomiej Kocot authored
* Add generic instances for two stage conv bwd wei
* Update layout prefix
-
dummycoderfe authored
* optimize small N case using vec io and using rcp div
* [Ck_tile] layernorm, add param to control fastdiv; change generate codes and test pass
* [Ck_tile] fix blockSize compute in Generic2dBlockShape
* [Ck_tile] fix kfastfdiv template style
* [Ck_tile] layernorm, fix style in review
---------
Co-authored-by: dummycoderfe <noplydummmycoder@163.com>
-