    [Enhancement] Add zero initialization option to GEMM operations (#246) · 701e9234
    Yu Cheng authored
    * [Enhancement] Add zero initialization option to GEMM operations
    
    - Introduced a new `zero_init` parameter to the GEMM function, allowing for optional zero initialization of the accumulator.
    - Updated the GEMM implementation across various CUDA architectures to support the new parameter.
    - Modified the Python interface for GEMM to include the `zero_init` argument, enhancing flexibility in kernel execution.
    - Preserved compatibility with existing functionality while giving callers explicit control over accumulator initialization for performance tuning.
    
    * rename zero_init to clear_accum
    
    * lint
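To illustrate what an optional accumulator-clearing flag means for a GEMM, here is a minimal CPU reference sketch. It is not the code from `gemm.h`: the function name `gemm_ref`, the row-major layout, and the choice to make `clear_accum` a compile-time template parameter are all assumptions made for illustration. Only the name `clear_accum` and its purpose (optionally zeroing the accumulator before the multiply-accumulate) come from the commit message.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, NOT the actual gemm.h implementation.
// Computes C = A * B when clear_accum is true, and C += A * B otherwise.
// A is M x K, B is K x N, C is M x N, all stored row-major (an assumption).
template <bool clear_accum>
void gemm_ref(const std::vector<float>& A, const std::vector<float>& B,
              std::vector<float>& C, std::size_t M, std::size_t N,
              std::size_t K) {
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      // With clear_accum the accumulator starts at zero; without it,
      // the existing contents of C are kept and added to.
      float acc = clear_accum ? 0.0f : C[i * N + j];
      for (std::size_t k = 0; k < K; ++k) {
        acc += A[i * K + k] * B[k * N + j];
      }
      C[i * N + j] = acc;
    }
  }
}
```

In this sketch, calling `gemm_ref<true>(A, B, C, M, N, K)` overwrites whatever was in `C`, while `gemm_ref<false>(...)` accumulates on top of it. The second form is useful when a kernel issues several partial GEMMs into the same accumulator and only the first one should start from zero.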
gemm.h 1.12 KB