- 04 Nov, 2024 1 commit
-
-
mtgu0705 authored
Add int4+scale based on Jing Zhang's pk_i4. Compiles and passes functional checks.
-
- 27 Oct, 2024 1 commit
-
-
Jing Zhang authored
-
- 24 Oct, 2024 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 23 Oct, 2024 7 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 22 Oct, 2024 5 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 21 Oct, 2024 3 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 20 Oct, 2024 3 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 18 Oct, 2024 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 16 Oct, 2024 1 commit
-
-
Jing Zhang authored
-
- 15 Oct, 2024 3 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 14 Oct, 2024 1 commit
-
-
Jing Zhang authored
-
- 13 Oct, 2024 1 commit
-
-
Jing Zhang authored
-
- 11 Oct, 2024 1 commit
-
-
Jing Zhang authored
-
- 09 Oct, 2024 2 commits
-
-
Illia Silin authored
-
Christopher Millette authored
-
- 08 Oct, 2024 3 commits
-
-
Rostyslav Geyyer authored
* Add a gpu gemm reference kernel
* Switch to gpu reference in gemm examples
* Remove redundant arguments
* Update all related examples
* Update more examples
* Try less threads per block
* Try even less threads per block
* Add support for all matrix layouts
* Increase block size
* Clean up
* Remove hardcoded strides
* Clean up
* Try a column-major case
* Revert back to row-major
* Run both CPU and GPU verification
---------
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-
Po Yen Chen authored
* Fix text alignment of ArgParser::print()
* Update example README files
* Clarify make-ck-dev.sh <arch> usage
* Only keep some of the argument from '-?' output
* Undo command line output changes in README
* Only keep existing argument on doc and update description
* Fix text alignment
* Make cmake-ck-*.sh compatible with 'sh' command
-
Qianfeng authored
* Simplify the codes in splitkv_combine pipeline
* Always set kPadSeqLenK=true for fmha splitkv kernels
* Change in Oacc Alignment and TileDistribution to be more adaptable to tile sizes
---------
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-
- 07 Oct, 2024 4 commits
-
-
Illia Silin authored
* add a CK_USE_CODEGEN build argument to enable codegen
* fix cmake codegen logic
-
Illia Silin authored
* update build logic with GPU_ARCHS
* fix the GPU_ARCHS build for codegen
* unset GPU_TARGETS when GPU_ARCHS are set
-
Bartłomiej Kocot authored
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-
rocking authored
* Fix compile error
* Add one pass pipeline
* Extract creating tile_window to operator()
* clang format
* reduce duplicated code
* do not hardcode
* Support padding in layernorm
---------
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-