-
Yu Cheng authored
* [Enhancement] Add zero initialization option to GEMM operations - Introduced a new `zero_init` parameter to the GEMM function, allowing for optional zero initialization of the accumulator. - Updated the GEMM implementation across various CUDA architectures to support the new parameter. - Modified the Python interface for GEMM to include the `zero_init` argument, enhancing flexibility in kernel execution. - Ensured compatibility with existing functionality while improving initialization control for performance optimization. * rename zero_init to clear_accum * lint
701e9234