  1. 15 Jan, 2025 1 commit
    • MX FP GEMM - Example Template (#277) · 07307ea1
      Andriy Roshchenko authored
      Temporarily uses the `DeviceGemmMultiD_ABScale_Xdl_CShuffle_V3` kernel and 128x128 scaling matrices.
      Must be modified to use an MX-native GEMM kernel with one scale per 16- or 32-component vector.
      
      Verified on the emulator.
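To illustrate what "one scale per 16- or 32-component vector" means for a GEMM inner product, here is a minimal Python sketch of a block-scaled dot product along the K dimension. It is not the CK kernel or its API: the function name `mx_dot`, its parameters, and the use of plain Python floats for elements and scales are all assumptions made for illustration.

```python
def mx_dot(a_elems, a_scales, b_elems, b_scales, block=32):
    """Block-scaled dot product (illustrative sketch, not CK code).

    Every `block` consecutive elements of a vector share one scale,
    so the true value of element i is scale[i // block] * elem[i].
    """
    n = len(a_elems)
    if n != len(b_elems) or n % block != 0:
        raise ValueError("vectors must have equal length divisible by block")
    n_blocks = n // block
    if len(a_scales) != n_blocks or len(b_scales) != n_blocks:
        raise ValueError("expected exactly one scale per block")

    total = 0.0
    for blk in range(n_blocks):
        lo, hi = blk * block, (blk + 1) * block
        # Accumulate the unscaled products of one block ...
        partial = sum(a * b for a, b in zip(a_elems[lo:hi], b_elems[lo:hi]))
        # ... then apply both block scales once per block.
        total += a_scales[blk] * b_scales[blk] * partial
    return total
```

Because the scales factor out of each block, they are applied once per block rather than once per element. With `block=32`, a K dimension of 64 needs two scales per row or column; with `block=16` it needs four.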
  2. 07 Jan, 2025 2 commits
    • [MX FP8] Add Scaled Type Convert Functions for OCP FP8/BF8 data types (#271) · c4a05057
      Andriy Roshchenko authored
      * Move scaled_type_convert functions to a separate header
      
      * Introduce MX data tests
      
      * Build MX tests only on relevant architectures
      
      * Refactor E8M0 scale implementation
      
      * Fix `config.h` typo
      
      * Cleanup deprecated symbols
      
      * Refactor `amd_ck_fp8.hpp`
      
      * `scaled_type_convert` for `f8_ocp_t`
      
      * Implement test for MX FP8 scaled type convert
      
      * Implement test for MX BF8 scaled type convert
      
      * Scaled type convert for vectors of 2 FP8 elements
      
      * Scaled type convert for vectors of 16 FP8 elements
      
      * Implementation of scaled conversion from F32 to F8
      
      * Add tests for scaled conversions from FP32 to FP8
      
      * Add documentation to the test functions
      
      * Implementation of scaled conversion from F32x2 to F8x2
      
      * Implementation of scaled conversion from F32x16 to F8x16
      
      * Implementation of scaled conversion from F32x32 to F8x32
      
      * Implementation of scaled conversion from F8x32 to F32x32
      
      * Verified on the emulator
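The arithmetic behind a scaled conversion from FP8 to F32 can be sketched in a few lines of Python. The bit layouts follow the OCP specifications (E8M0 scale with bias 127 and `0xFF` as NaN; FP8 E4M3 with bias 7, no infinities, and `S.1111.111` as NaN). The function names are hypothetical and are not the `scaled_type_convert` API from this commit. The sketch also assumes the convention that an element's value is its decoded FP8 value multiplied by the shared scale.

```python
import math

E8M0_BIAS = 127  # exponent bias of the E8M0 scale format


def e8m0_to_float(bits: int) -> float:
    """Decode an 8-bit E8M0 scale: a pure power of two; 0xFF encodes NaN."""
    if not 0 <= bits <= 0xFF:
        raise ValueError("E8M0 scale must fit in 8 bits")
    if bits == 0xFF:
        return math.nan
    return math.ldexp(1.0, bits - E8M0_BIAS)


def fp8_e4m3_to_float(bits: int) -> float:
    """Decode one OCP FP8 E4M3 byte (1 sign, 4 exponent, 3 mantissa bits)."""
    if not 0 <= bits <= 0xFF:
        raise ValueError("FP8 element must fit in 8 bits")
    sign = -1.0 if bits & 0x80 else 1.0
    exp = (bits >> 3) & 0xF
    man = bits & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan  # the only NaN pattern; E4M3 has no infinities
    if exp == 0:
        return sign * math.ldexp(man / 8.0, -6)  # zero and subnormals
    return sign * math.ldexp(1.0 + man / 8.0, exp - 7)


def scaled_fp8_to_float(scale_bits: int, elem_bits: int) -> float:
    """Hypothetical scalar scaled conversion: scale * element."""
    return e8m0_to_float(scale_bits) * fp8_e4m3_to_float(elem_bits)


def scaled_fp8_block_to_float(scale_bits: int, block: list) -> list:
    """Hypothetical vector form: all elements of a block share one scale."""
    scale = e8m0_to_float(scale_bits)
    return [scale * fp8_e4m3_to_float(b) for b in block]
```

For example, `scaled_fp8_to_float(129, 0x38)` decodes the scale as 2^(129-127) = 4 and the element `0x38` as 1.0, giving 4.0. The vector form mirrors the x2, x16, and x32 variants listed above only in spirit: it decodes the shared scale once and applies it to every element.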
    • enable smfmac test · 23e2309d
      illsilin authored
  3. 06 Jan, 2025 3 commits
  4. 19 Dec, 2024 1 commit
  5. 18 Dec, 2024 6 commits
  6. 17 Dec, 2024 6 commits
  7. 16 Dec, 2024 5 commits
  8. 15 Dec, 2024 1 commit
  9. 14 Dec, 2024 2 commits
  10. 13 Dec, 2024 5 commits
  11. 12 Dec, 2024 7 commits
  12. 11 Dec, 2024 1 commit