1. 24 Jan, 2024 1 commit
    • Fixing most of the cppcheck errors. (#1142) · 180e5720
      Illia Silin authored
      * fix cppcheck errors, first pass
      
      * fix format
      
      * fix returned value in examples
      
      * add macro definitions for cppcheck
      
      * fix the profile_gemm logic
      
      * update the gemm profiler logic
      
      * add more definitions to cppcheck, fix a couple more errors
      
      * replace runtime error with message in device function
      
      * fix a couple of int4 issues
      
      * no return for fill function
      
      * fix errors in data_types.hpp
      
      * fix format
      
      * fix few remaining errors
      
      * fix errors in data_types.hpp
      
      * fix last couple of errors in data_types.hpp
  2. 31 May, 2023 1 commit
  3. 13 Oct, 2022 1 commit
    • Refactor device op implementations into `impl` subdirectory. (#420) · 30480288
      Adam Osewski authored
      
      
      * Move kernel implementation files under impl directory.
      
      * Update examples paths.
      
      * Update device kernel impl include paths.
      
      * Update tensor operation instances include paths.
      
      * Update profiler and tests include paths.
      
      * Clang-format
      
      * Update include paths for batched gemm reduce
      
      * Refactor UnitTest ConvNDBwdWeight.
      
      * Refactor fwd and bwd data convND UT.
      
      * Fix used test macro.
      
      * Fix include path.
      
      * Fix include paths.
      
      * Fix include paths in profiler and tests.
      
      * Fix include paths.
      Co-authored-by: Adam Osewski <aosewski@amd.com>
  4. 23 Aug, 2022 1 commit
    • Add examples of Gemm (data type: int4) (#367) · fa2d894b
      Po Yen Chen authored
      * Add GEMM examples for int4
      
      Currently the source files are just copied from int8 examples
      
      * Re-use pre-defined alias in int4 examples
      
      * Distinguish user-side type from kernel-side type
      
      * Add int4_t support for check_err()
      
      * Allow conversion between Tensor<> specializations
      
      * Re-format source files
      
      * Use different type for host tensors
      
      * Re-use CopyAsType<>() to implement copy ctor
      
      * Re-use element-wise operation type alias
      
      * Fix typo in alias names
      
      * Complete the int4 examples
      
      * Add constraint to Tensor<> templated methods
      
      * Add type traits 'is_signed_integral<>'
      
      * Add type constraints for integer version check_err<>()
      
      * Allow comparing different-sized integral types in check_err()
      
      * Check converted Tensor<int4_t> with golden Tensor<int8_t>
      
      * Remove constraint of Tensor<>::CopyAsType()
      
      * Avoid compilation error when ck::int4_t support is disabled
      
      * Remove debug messages
      
      * Add #error directive to prevent compiling sources with wrong settings
      
      * Simplify tensor usages in examples
      
      * Add constraint to check_err() input reference type
      
      * Align design with other PR
      
      * Use ""_uz to simplify example code
      
      * Avoid over-generalizing check_err()
      
      * Re-format GEMM instance template arguments
      
      * Extract common code from int4 examples
      
      * Sort include directives
      
      * Move #include directives into new header
      
      * Move common code together
      
      * Re-format template argument in example code
      
      * Reuse same implementation code for most of GEMM examples
      
      * Re-format common.hpp
      
      * Unify structured comment in examples
      
      * Use reinterpret_cast<>() for cross-type pointer conversion
      
      * Revert "Add type traits 'is_signed_integral<>'"
      
      This reverts commit f2c148efaedf42c8ee66032dac6d13a1003b0f3a.
      
      * Allow unsigned integer arguments for check_err()
      
      * Fix compilation error in check_err()
      
      * Remove unnecessary copy ctor for Tensor<>
      
      * Mark Tensor<> special member functions as 'default'
      
      * Use more strict condition to add code in examples
      
      * Fix wrong program return value of GEMM examples
      
      * Handle the case where the user specifies all the strides
      
      * Fix never-run examples
      
      * Exit successfully if GEMM instance does not support given problem
      
      * Add missing 'else' keyword
      
      * Re-format CMakeLists.txt
      
      * Add wrapper function to hide value conversion while copying memory
      
      * Add new DeviceMem API to copy memory
      
      * Use new DeviceMem API to implement examples
      
      * Revert "Add new DeviceMem API to copy memory"
      
      This reverts commit 3f190b0779ceedf7aaf0b380712fda0518de72c1.
      
      * Add conversion ctor for Tensor<>
      
      * Write Tensor<> conversion logic explicitly in example code
      
      * Convert Tensor<> values after transferring data to host