- 28 Jul, 2023 1 commit
-
-
aska-0096 authored
-
- 25 Jul, 2023 1 commit
-
-
aska-0096 authored
-
- 20 Jul, 2023 1 commit
-
-
aska-0096 authored
-
- 31 May, 2023 1 commit
-
-
Illia Silin authored
-
- 10 May, 2023 1 commit
-
-
aska-0096 authored
2. Enable FP16 accumulator blockwise_gemm
-
- 06 Mar, 2023 1 commit
-
-
aska-0096 authored
-
- 24 Feb, 2023 1 commit
-
-
aska-0096 authored
-
- 15 Dec, 2022 2 commits
- 09 Dec, 2022 1 commit
-
-
aska-0096 authored
-
- 02 Dec, 2022 1 commit
-
-
aska-0096 authored
-
- 11 Nov, 2022 1 commit
-
-
Po Yen Chen authored
* Rangify STL algorithms This commit adapts rangified std::copy(), std::fill() & std::transform() * Rangify check_err() By rangifying check_err(), we can not only compare values between std::vector<>s, but also compare any ranges which have same value type. * Allow constructing Tensor<> like a HostTensorDescriptor * Simplify Tensor<> object construction logics * Remove more unnecessary 'HostTensorDescriptor' objects * Re-format example code * Re-write more HostTensorDescriptor ctor call
-
- 10 Nov, 2022 1 commit
-
-
Po Yen Chen authored
Allow passing forward range to its call operator
-
- 01 Sep, 2022 1 commit
-
-
Chao Liu authored
* refactor * refactor * adding int4/int8/fp16/bf16 for conv+conv and gemm+gemm * adding int4/int8/fp16/bf16 for conv+conv and gemm+gemm * clean
-
- 23 Aug, 2022 1 commit
-
-
Po Yen Chen authored
* Add GEMM examples for int4 Currently the source files are just copied from int8 examples * Re-use pre-defined alias in int4 exmples * Distinguish user-side type from kernel-side type * Add int4_t support for check_err() * Allow conversion between Tensor<> specializations * Re-format source files * Use different type for host tensors * Re-use CopyAsType<>() to implement copy ctor * Re-use element-wise operation type alias * Fix typo in alias names * Complete the int4 examples * Add constraint to Tensor<> templated methods * Add type traits 'is_signed_integral<>' * Add type constraints for integer version check_err<>() * Allow comparing different-sized integral types in check_err() * Check converted Tensor<int4_t> with golden Tensor<int8_t> * Remove constraint of Tensor<>::CopyAsType() * Avoid compilation error while disabling ck::int4_t support * Remove debug messages * Add #error directive to prevent compile sources with wrong setting * Simplify tensor usages in examples * Add constraint to check_err() input reference type * Align design with other PR * Use ""_uz to simplify example code * Avoid too much generalizing check_err() * Re-format GEMM instance template arguments * Extract int4 example common codes * Sort include directives * Move #include directives into new header * Move common codes together * Re-format template argument in example code * Reuse same implementation code for most of GEMM examples * Re-format common.hpp * Unify structured comment in examples * Use reinterpret_cast<>() for cross-type pointer conversion * Revert "Add type traits 'is_signed_integral<>'" This reverts commit f2c148efaedf42c8ee66032dac6d13a1003b0f3a. * Allow unsigned integer arguments for check_err() * Fix compilation error in check_err() * Remove unnecessary copy ctor for Tensor<> * Mark Tensor<> special member functions as 'default' * Use more strict condition to add code in examples * Fix wrong program return value of GEMM examples * Handle the case while user specify all the strides * Fix never-ran examples * Exit successfully if GEMM instance does not support given problem * Add missing 'else' keyword * Re-format CMakeLists.txt * Add wrapper function to hide value conversion while copying memory * Add new DeviceMem API to copy memory * Use new DeviceMem API to implement examples * Revert "Add new DeviceMem API to copy memory" This reverts commit 3f190b0779ceedf7aaf0b380712fda0518de72c1. * Add conversion ctor for Tensor<> * Write Tensor<> conversion logics explicitly in example code * Convert Tensor<> values after transfer data to host
-