"examples/vscode:/vscode.git/clone" did not exist on "77103d71ca0ee1233de082987e21f83f7f4c3a07"
    [Enhancement] Allow mma fallback when wgmma is not supported (#206) · 45559a1f
    Lei Wang authored
    * Enhance error message for constant size stack allocation in CUDA codegen. Include the actual constant size and buffer variable name in the error output for better debugging.
    
    * Refactor GEMM and bulk copy operations to improve layout handling and support the Hopper architecture
    
    - Update `ComputeWarpPartition` to include a new parameter for Hopper WGMMA support.
    - Modify layout checks in `LowerBulkCopy` to accommodate new GEMM layout types.
    - Enhance layout inference logic in `InferLayout` for better compatibility with Hopper architecture.
    - Include necessary header files for built-in operations and layout inference improvements.
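
To illustrate the fallback named in the commit title, here is a minimal sketch of how an instruction choice between `wgmma` and `mma` might be made. All names (`GemmInst`, `CanUseWgmma`, `SelectGemmInst`) and the exact conditions are assumptions for illustration, not the code in this commit. The sketch relies only on two general facts: `wgmma` is a Hopper instruction, and it is issued by a warpgroup of 4 warps on tiles with 64 rows in M.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; this is not the project's actual API.
enum class GemmInst { kMMA, kWGMMA };

// Assumed eligibility rule: WGMMA is considered only on Hopper (compute
// capability 9.0, encoded here as 90), needs at least one 64-row tile in M,
// and needs the warps to form whole warpgroups of 4 warps (128 threads).
inline bool CanUseWgmma(int sm_arch, int M, int num_warps) {
  assert(num_warps > 0 && "a thread block needs at least one warp");
  return sm_arch == 90 && M >= 64 && num_warps % 4 == 0;
}

// Fall back to the warp-level MMA path whenever WGMMA is not eligible.
inline GemmInst SelectGemmInst(int sm_arch, int M, int num_warps) {
  return CanUseWgmma(sm_arch, M, num_warps) ? GemmInst::kWGMMA
                                            : GemmInst::kMMA;
}

// Human-readable name, e.g. for logging which path was chosen.
inline std::string ToString(GemmInst inst) {
  return inst == GemmInst::kWGMMA ? "wgmma" : "mma";
}
```

Under these assumptions, `SelectGemmInst(90, 128, 4)` picks the `wgmma` path, while a block with 3 warps, or any non-Hopper target, falls back to `mma`. A warp partitioning routine such as `ComputeWarpPartition` could then take the result as its extra parameter.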
    
    * lint fix
    
    * Remove unused builtin.h include directive
    
    * Update include path for builtin.h
codegen_cuda.cc 60.3 KB