    [Dev] Support FP8 Codegen for cuda backend (#64) · 61de5288
    Lei Wang authored
    * [Enhancement] Add VectorizeLoop function and update imports for compatibility
    
    * [CI][Test] Improve test cases for vectorization and fix typos in parser comments
    
    * lint fix
    
    * Fix incorrect module reference for VectorizeLoop transformation
    
    * Refactor vectorize_loop transformation by removing unused extent mutation logic
    
    * [Enhancement] Add support for FP8 data types and global barriers in CUDA codegen
    
    * Fix formatting in CUDA FP8 header file for consistency
    
    * Refactor CI workflow to use 'tilelang_ci' virtual environment and update CUDA type printing for better clarity
    
    * Update submodule 'tvm' to latest commit for improved functionality
    
    * Refactor execution backend references from 'dl_pack' to 'dlpack' for consistency and clarity; add apply_simplify function to simplify PrimFunc or IRModule.
    
    * Refactor CUDA code for improved readability; clean up formatting and remove unnecessary whitespace in multiple files.
    
    * Refactor import statement in test_tilelang_kernel_dequantize_gemm.py to use 'tilelang.language' for consistency
    
    * Add CUDA requirements to FP8 test cases and update references for clarity
    
    * Add a blank line for improved readability in test_tilelang_kernel_fp8_gemm_mma.py
    
    * Fix data type in reference result calculation for consistency in test_tilelang_kernel_gemm_mma_intrinsic.py
    
    * Add CUDA requirements and FP8 test cases for matmul and gemv simulations
    
    * Remove debug print statements and use tilelang's testing assertion for result validation in test_tilelang_kernel_gemm_mma_intrinsic.py
    
    * Remove outdated comment regarding FP8 tests in test_tilelang_kernel_gemv_simt.py
ci.yml 1.66 KB