- 21 Nov, 2024 2 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 20 Nov, 2024 2 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
Removed multiple negations in fail/pass logic to propagate `true` as the success indicator.
-
- 19 Nov, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 14 Nov, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 07 Nov, 2024 1 commit
-
-
Illia Silin authored
-
- 04 Nov, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 31 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 30 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 29 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 26 Oct, 2024 1 commit
-
-
valarLip authored
* add int8 gemm multiply multiply a8w8
* uncomment
* clang-format-12
* Add example_gemm_multiply_multiply_xdl_int8
* Remove shell scripts
* update preprocess number for mi308; bring back printout in ckprofiler
* format
---------
Co-authored-by: chenjun <junchen2@amd.com>
Co-authored-by: Haocong WANG <haocwang@amd.com>
Co-authored-by: carlushuang <carlus.huang@amd.com>
-
- 25 Oct, 2024 1 commit
-
-
aledudek authored
* Calculate generic relative threshold pool3dfwd
* Calculate absolute error threshold pool3d fwd
* Generic threshold calculation take max input for relative error pool3dfwd
* Remove max possible value for error calculation at runtime
* Remove debug print in pool3dfwd
* Pool3d fwd adjusted types in generic threshold calculation
* Generic threshold calculation take into account number of accumulations and accdatatype
* Generic threshold fix final error formula
* Generic threshold calculation - num of accs fix
* Generic threshold calculation - adjust absolute error
* Generic threshold calculation - OutDataType in absolute error
-
- 23 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 22 Oct, 2024 1 commit
-
-
Jatin Chaudhary authored
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
-
- 21 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 18 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 16 Oct, 2024 2 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 15 Oct, 2024 3 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 14 Oct, 2024 3 commits
-
-
illsilin authored
-
Rostyslav Geyyer authored
* Add non_native_vector_type
* Add a test
* Add non-native vector type
* Fix CTOR
* Fix non-native vector type of 1
* Fix CTORs
* Use vector_type to cover non-native implementation as well
* Update the test
* Format
* Format
* Fix copyright years
* Remove BoolVecT so far
* Add AsType test cases
* Update assert error message
* Remove redundant type
* Update naming
* Add complex half type with tests
* Add tests for vector reshaping
* Add missing alignas
* Update test/data_type/test_custom_type.cpp
  Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
* Compare custom types to built-in types
* Add default constructor test
* Add an alignment test
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
-
Bartłomiej Kocot authored
* Add transpose scale amax example
* fixes
* Tune reduce instance
-
- 11 Oct, 2024 3 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 10 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 03 Oct, 2024 1 commit
-
-
Andriy Roshchenko authored
-
- 20 Sep, 2024 1 commit
-
-
Adam Osewski authored
The dynamic buffer does not support fp8 in its `Update` operation, so fp8 does not support `InMemoryDataOperation::Add`.
-
- 12 Sep, 2024 1 commit
-
-
Mateusz Ozga authored
* Add pool2d instance BWD AVG
* Add pool2d instance BWD MAX
* Fix: avg review
* Fix review: part2
* Fix - enable test when type is compiled
* Fix review part3
-
- 11 Sep, 2024 1 commit
-
-
jakpiase authored
* Implemented smfmac xdlops
* Added smfmac blockwise xdlops
* fixes
* add reviewers suggestions
---------
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
-
- 14 Aug, 2024 1 commit
-
-
Haocong WANG authored
* replace buffer_atomic with global_atomic
* fixed global_atomic_add
* added bf16 atomic_add
* format
* clang-format-12
* clean
* clean
* add guards
* Update gtest.cmake
* enabled splitk_gemm_multi_d
* format
* add ckProfiler
* format
* fixed naming
* format
* clean
* clean
* add guards
* fix clang format
* format
* add kbatch printout
* clean
* Add rocm6.2 related gemm optimization
* Limit bf16 atomic usage
* remove redundant RCR gemm_universal instance
* Add RRR fp8 gemm universal instance
* Bug fix
* Add GPU_TARGET guard to FP8/BF8 target
* bug fix
* update cmake
* remove all fp8/bf8 example if arch not support
* Enable fp8 RRR support in ckProfiler
* limit greedy-reverse flag to gemm_universal in ckProfiler
---------
Co-authored-by: Jing Zhang <jizhan@fb.com>
Co-authored-by: Jing Zhang <jizhan@meta.com>
Co-authored-by: zjing14 <zhangjing14@gmail.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>
-
- 07 Aug, 2024 1 commit
-
-
Juan Manuel Martinez Caamaño authored
* Remove reinterpret_cast uses that result in undefined behaviour. Use a bitcast instead. See https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_accessibility Closes #1439
* fix clang format
---------
Co-authored-by: illsilin <Illia.Silin@amd.com>
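For readers unfamiliar with the issue, a minimal sketch of the general technique follows. It is not the code from this commit: `bit_cast_sketch` and `float_bits` are hypothetical names, and the example assumes a 32-bit IEEE-754 `float`. Reading an object through a `reinterpret_cast` pointer of an unrelated type violates the type-accessibility rules, whereas copying the bytes with `std::memcpy` is well defined for trivially copyable types of equal size.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper, not the library's own API: reinterpret the bytes of
// `src` as a value of type To by copying its object representation.
template <typename To, typename From>
To bit_cast_sketch(const From& src)
{
    static_assert(sizeof(To) == sizeof(From), "source and destination sizes must match");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");

    To dst;
    std::memcpy(&dst, &src, sizeof(To)); // well-defined byte copy, no aliasing violation
    return dst;
}

// Undefined behaviour (shown only as a comment for contrast):
//   std::uint32_t bits = *reinterpret_cast<const std::uint32_t*>(&x);

// Returns the raw bit pattern of a float (assumes a 32-bit float).
inline std::uint32_t float_bits(float x)
{
    return bit_cast_sketch<std::uint32_t>(x);
}
```

Compilers typically lower the `memcpy` to a single register move, so this form usually costs nothing at runtime. In C++20, `std::bit_cast` from `<bit>` provides the same operation.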
-
- 06 Aug, 2024 2 commits
-
-
Juan Manuel Martinez Caamaño authored
-
Bartłomiej Kocot authored
* Support 64 bit indexing
* Add new grouped conv fwd kernel for large tensors
* Add instances large tensor
* Fixes for transform conv to gemm
* Fixes
* fixes
* Remove not needed instances
* examples fixes
* Remove not need ds arrays
* Fix tests
* Add 2GB check in gridwise dl
* Fixes
-
- 24 Jul, 2024 1 commit
-
-
Bartłomiej Kocot authored
* Add support for half_t and bfloat to reduction operations
* Fix bhalf convert
* Next fix bf16
-
- 17 Jul, 2024 1 commit
-
-
Qianfeng authored
-
- 04 Jul, 2024 1 commit
-
-
Jun Liu authored
-