    [Enhancement] Add support for CUDA architecture 8.9 in GEMM template (#304) · edbb9b6d
    Lei Wang authored
    * [Enhancement] Add support for CUDA architecture 8.9 in GEMM template
    
    - Conditionally include "gemm_sm89.h" when compiling for CUDA architecture 8.9 or above, so the GEMM template also builds for newer hardware.
    - This lets the GEMM template use the code path written for the 8.9 architecture on compatible GPUs.
    
    * lintfix
    
    * [Refactor] Clean up includes in gemm_sm89.h
    
    - Removed the duplicate inclusion of "common.h" and added "cuda_fp8.h".
    - This keeps each header included only once and in a logical order, which makes the file easier to maintain.
gemm.h 385 Bytes