    [AMD] Support float8 matrix core (#537) · 5872e647
    Lei Wang authored
    
    
    * [Enhancement] Add support for FP8 types in CUDA and HIP code generation
    
    * Updated `GetFP8Type` function in `codegen_cuda.cc` and `codegen_hip.cc` to handle new FP8 types, including `kFloat8_e4m3fnuz`.
    * Introduced a new header file `hip_fp8.h` for FP8 type definitions in HIP.
    * Modified type mappings in `dlpack.py` and `mfma_macro_generator.py` to accommodate new FP8 types.
    * Enhanced type handling in `TLHIPSourceWrapper` and `tensor.py` for better integration with FP8 types.
    * Added necessary includes and logic to support FP8 in the code generation process, improving performance and compatibility with FP8 data types.
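For readers unfamiliar with the `fnuz` suffix in `kFloat8_e4m3fnuz`, the bit layout can be sketched as below. This is an illustrative decoder only, not code from this commit: the function name `decode_e4m3fnuz` is hypothetical, and the layout assumed is the usual one for this format (1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 8, no infinities, no negative zero, and a single NaN encoded as `0x80`).

```python
import math


def decode_e4m3fnuz(byte: int) -> float:
    """Decode one float8_e4m3fnuz byte into a Python float.

    Hypothetical helper for illustration; assumes bias 8 and NaN == 0x80.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte in the range 0..255")
    if byte == 0x80:
        # The bit pattern that would be negative zero is the only NaN.
        return math.nan
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF  # 4 exponent bits
    mantissa = byte & 0x7         # 3 mantissa bits
    if exponent == 0:
        # Subnormal range: no implicit leading 1, fixed exponent 1 - bias.
        return sign * (mantissa / 8.0) * 2.0 ** (1 - 8)
    # Normal range: implicit leading 1; exponent 15 is an ordinary value.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 8)
```

Under these assumptions the largest finite value is 240 (`0x7F`), compared with 448 for the `e4m3fn` variant, which uses bias 7 and reserves one top-exponent pattern for NaN.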
    
    * lint fix
    
    * Update src/target/codegen_hip.cc
    Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
    
    * Update tilelang/intrinsics/mfma_macro_generator.py
    Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
    
    * workaround
    
    * fix
    
    * Update submodule TVM to latest commit 587028ffebfff0ded520f8f90d62f0f6b165906c
    
    * bug fix
    
    * Refactor tilelang matrix multiplication to support transposition and packing options. Adjusted shared memory shapes and loading logic for A and B matrices. Updated test cases to validate new functionality.
    
    * Refactor assertion function for tilelang matrix multiplication to improve readability by formatting parameters and aligning code. Cleaned up whitespace in intrinsic layout functions for consistency.
    
    * Update bfloat16 type definitions in common.h and gemm.h for consistency. Changed __hip_bfloat16 to hip_bfloat16 and updated MfmaTraits specialization accordingly.
    
    * lint fix
    
    ---------
    Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
codegen_cuda.cc 62.6 KB