    [Enhancement] Support register input for gemm when trans_a or trans_b is true (#490) · d4f096ef
    Lei Wang authored
    * [Refactor] Enhance makeGemmFragmentB to support transposition
    
    * Added a `transposed` parameter to the `makeGemmFragmentB` function so that the generated layout depends on whether the matrix is transposed.
    * Adjusted layout calculations for both transposed and non-transposed cases to ensure correct fragment generation.
    * Modified the function signature in `layout.h` and updated all relevant calls in `gemm.cc` to accommodate the new parameter.
    * Added a new `matmul_sr` function in the test suite to validate the behavior of the updated fragment generation with transposition support.
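To illustrate the idea behind a `transposed` flag on a fragment layout, here is a minimal sketch. It is a toy model only: `toy_fragment_b`, its parameters, and the round-robin thread assignment are hypothetical and do not reproduce the layout computed by `makeGemmFragmentB`. It shows one property such a flag would plausibly preserve: the same logical element of B lands in the same thread and register slot whether B is stored as K x N or as N x K.

```python
def toy_fragment_b(k, n, num_threads, transposed=False):
    """Map each stored element of B to (thread, local_index).

    Hypothetical toy layout, not the real makeGemmFragmentB.
    Non-transposed storage is K x N; transposed storage is N x K.
    """
    rows, cols = (n, k) if transposed else (k, n)
    layout = {}
    for r in range(rows):
        for c in range(cols):
            # Recover the logical (k, n) coordinate from the stored one.
            kk, nn = (c, r) if transposed else (r, c)
            # Assign threads by logical position, so ownership does not
            # depend on how the matrix happens to be stored.
            linear = kk * n + nn
            layout[(r, c)] = (linear % num_threads, linear // num_threads)
    return layout


plain = toy_fragment_b(2, 3, num_threads=4)
trans = toy_fragment_b(2, 3, num_threads=4, transposed=True)
```

In this sketch, `trans[(nn, kk)]` equals `plain[(kk, nn)]` for every element, which is the invariant the swap of coordinates is meant to provide.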
    
    * [Refactor] Enhance makeGemmFragmentA and makeGemmFragmentB for transposition support
    
    * Added a `transposed` parameter to the `makeGemmFragmentA` and `makeGemmFragmentB` functions so that the generated layouts depend on whether each matrix is transposed.
    * Adjusted layout calculations for both transposed and non-transposed cases to ensure correct fragment generation.
    * Modified function signatures in `layout.h` and updated all relevant calls in `gemm.cc` to accommodate the new parameter.
    * Added a new `matmul_rs` function in the test suite to validate the behavior of the updated fragment generation with transposition support.
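A test for transposition support needs a reference result to compare against. The following is a minimal sketch of such a reference in plain Python. `ref_gemm` and its flag names are assumptions made for illustration; this is not the code of the `matmul_rs` or `matmul_sr` tests, which the commit message does not show.

```python
def ref_gemm(a, b, trans_a=False, trans_b=False):
    """Reference C = op(A) @ op(B) on nested lists.

    Hypothetical helper: op(X) is X transposed when the matching
    flag is set, otherwise X unchanged.
    """
    def transpose(m):
        return [list(row) for row in zip(*m)]

    if trans_a:
        a = transpose(a)
    if trans_b:
        b = transpose(b)
    k = len(b)
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of op(A) and op(B) differ")
    n = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
expected = ref_gemm(a, b)
# Passing pre-transposed inputs with the flag set should give the same result.
same_b = ref_gemm(a, [[5, 7], [6, 8]], trans_b=True)
same_a = ref_gemm([[1, 3], [2, 4]], b, trans_a=True)
```

A kernel test would then run the kernel for each combination of flags and compare its output with this reference within a floating-point tolerance.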
    
    * Improve error messaging in layout equality checks
    
    * Added line breaks to the error output of layout equality checks so that the debug output is easier to read.
    * When layouts are structurally unequal, the current and previous layouts are now printed on separate lines, which makes the mismatch easier to spot while debugging.
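The effect of the added line breaks can be sketched as follows. The function name, the message wording, and the example layout strings are all hypothetical; the actual check lives in the C++ sources and its exact text is not shown in this commit message.

```python
def layout_mismatch_message(buffer_name, current, previous):
    """Build a multi-line error message for two unequal layouts.

    Hypothetical wording; only the one-layout-per-line structure
    reflects the change described above.
    """
    return (
        f"Layouts for buffer '{buffer_name}' are structurally unequal.\n"
        f"  current layout:  {current}\n"
        f"  previous layout: {previous}"
    )


msg = layout_mismatch_message("B_local", "Fragment([64, 32])", "Fragment([32, 64])")
print(msg)
```

Putting each layout on its own line, with aligned labels, lets the two be compared visually instead of being read out of one long line.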
gemm_layouts.cc 20.8 KB