  1. 13 Aug, 2024 2 commits
      Fix compilation errors with libc++ (#1461) · 50c42348
      AngryLoki authored
      
      
      This fixes two issues that occur when compiling with libc++.
      
      The first issue is an attempt to call std::numeric_limits<ranges::range_value_t<_Float16>>::min().
      _Float16 support is an extension of libstdc++; the type does not exist in the C++ standard[2].
      Luckily, there is a NumericLimits class in composable_kernel that does everything needed.
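      
      A minimal sketch of the idea behind this first fix: a project-owned limits trait that supplies values for a type the standard library may not cover. The names `half_bits` and `NumericLimitsSketch` are hypothetical and are not composable_kernel's actual API; `half_bits` only stores raw IEEE 754 binary16 bits as a stand-in for `_Float16`.
      
      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <limits>
      
      // Hypothetical stand-in for a half-precision type that the standard
      // library may not know about; it stores raw IEEE 754 binary16 bits.
      struct half_bits
      {
          std::uint16_t bits;
      };
      
      // Hypothetical project-owned limits trait (not composable_kernel's class).
      // The primary template forwards to the standard library for ordinary types.
      template <typename T>
      struct NumericLimitsSketch
      {
          static constexpr T Min() { return std::numeric_limits<T>::min(); }
          static constexpr T Max() { return std::numeric_limits<T>::max(); }
          static constexpr T Lowest() { return std::numeric_limits<T>::lowest(); }
      };
      
      // Explicit specialization: the project supplies the values itself, so the
      // code no longer depends on std::numeric_limits covering this type.
      template <>
      struct NumericLimitsSketch<half_bits>
      {
          // Smallest positive normal value, 2^-14.
          static constexpr half_bits Min() { return half_bits{0x0400}; }
          // Largest finite value, 65504.
          static constexpr half_bits Max() { return half_bits{0x7BFF}; }
          // Most negative finite value, -65504.
          static constexpr half_bits Lowest() { return half_bits{0xFBFF}; }
      };
      
      // Small usage example; call it from main().
      inline void demo_limits()
      {
          assert(NumericLimitsSketch<half_bits>::Min().bits == 0x0400);
          assert(NumericLimitsSketch<int>::Max() == std::numeric_limits<int>::max());
      }
      ```
      
      Callers that go through such a trait behave the same under libstdc++ and libc++, because the answer for the non-standard type never comes from the standard library.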
      
      The second issue is that the call to 'check_err' is ambiguous: there are two candidates.
      This happens because composable_kernel relies on the idea that f8_t (defined as _BitInt(8)) does not satisfy the is_integral trait.
      However, libc++ treats _BitInt(N) as integral (per the standard, "any implementation-defined extended integer types" can be integral).
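      
      The ambiguity can be illustrated with a small sketch. It assumes two overloads selected by type traits, and it uses hypothetical names throughout (`fake_f8`, `treated_as_integral`, `classify`) rather than the real `check_err` overloads or `_BitInt(8)`. The specialization of `treated_as_integral` models a standard library that reports the 8-bit type as integral. Excluding the type explicitly from the integer overload is one way to keep the two candidates mutually exclusive; it is not necessarily the change made in this commit.
      
      ```cpp
      #include <cassert>
      #include <type_traits>
      
      // Hypothetical stand-in for an 8-bit float storage type such as f8_t.
      struct fake_f8
      {
          signed char bits;
      };
      
      // Hypothetical trait modelling "does this standard library call T integral?".
      template <typename T>
      struct treated_as_integral : std::is_integral<T>
      {
      };
      
      // Model the libc++ behaviour described above: the f8 type counts as integral.
      template <>
      struct treated_as_integral<fake_f8> : std::true_type
      {
      };
      
      enum class Kind
      {
          Integral,
          F8
      };
      
      // Integer overload. Without the is_same exclusion, both overloads would be
      // viable for fake_f8 and the call would be ambiguous.
      template <typename T,
                typename std::enable_if<treated_as_integral<T>::value &&
                                            !std::is_same<T, fake_f8>::value,
                                        int>::type = 0>
      constexpr Kind classify(T)
      {
          return Kind::Integral;
      }
      
      // Dedicated overload for the 8-bit float type.
      template <typename T,
                typename std::enable_if<std::is_same<T, fake_f8>::value, int>::type = 0>
      constexpr Kind classify(T)
      {
          return Kind::F8;
      }
      
      // Compile-time check that each call now has exactly one candidate.
      static_assert(classify(fake_f8{0x38}) == Kind::F8, "f8 overload expected");
      static_assert(classify(42) == Kind::Integral, "integer overload expected");
      
      // Small usage example; call it from main().
      inline void demo_overloads()
      {
          assert(classify(fake_f8{0x38}) == Kind::F8);
          assert(classify(42) == Kind::Integral);
      }
      ```
      
      Because the constraint names the type directly, the result no longer depends on how a particular standard library answers is_integral for extended integer types.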
      
      Closes: #1460
      Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
  2. 12 Aug, 2024 2 commits
  3. 10 Aug, 2024 1 commit
  4. 09 Aug, 2024 2 commits
  5. 08 Aug, 2024 3 commits
  6. 07 Aug, 2024 4 commits
  7. 06 Aug, 2024 7 commits
  8. 05 Aug, 2024 2 commits
  9. 01 Aug, 2024 1 commit
  10. 31 Jul, 2024 4 commits
  11. 30 Jul, 2024 2 commits
  12. 26 Jul, 2024 2 commits
  13. 25 Jul, 2024 2 commits
  14. 24 Jul, 2024 3 commits
  15. 23 Jul, 2024 1 commit
  16. 22 Jul, 2024 1 commit
  17. 19 Jul, 2024 1 commit
      [GEMM] F8 GEMM, performance optimized. (#1384) · 8c90f25b
      Haocong WANG authored
      
      
      * add ab_scale init support
      
      * enabled interwave
      
      * add scale type; update isSupport
      
      * adjust example
      
      * clean
      
      * enable f8 pure gemm rcr ckprofiler
      
      * Add gemm_multiply_multiply instances
      
      * clang format
      
      * Optimize for ScaleBlockMNK=128
      
      * enable abscale f8 gemm ck profiler
      
      * Add pure f8 gemm test suite
      
      * Reverting to the state of project at f60fd77
      
      * update copyright
      
      * clang format
      
      * update copyright
      
      ---------
      Co-authored-by: root <jizhan@amd.com>