    [Bugfix] Fix the test data distribution of cumsum (#432) · 3d206235
    Lei Wang authored
    * [Refactor] Adjust layout inference calculations in Gemm and ParallelOp
    
    * Updated block size calculation in Gemm to account for the range of thread bounds, improving accuracy in layout inference.
    * Simplified layout conflict error messages in ParallelOp for better clarity, enhancing debugging experience.
    * Removed redundant buffer checks in ParallelOp layout inference logic, streamlining the code.
    
    * [Refactor] Clean up layout inference logic in Gemm and ParallelOp
    
    * Removed unnecessary warning log in Gemm related to WGMMA conditions, streamlining the layout inference process.
    * Commented out redundant checks in ParallelOp's layout inference, improving code clarity while maintaining functionality.
    * Enhanced error messages in ParallelOp to provide clearer context for layout conflicts, aiding in debugging efforts.
    
    * lint fix
    
    * [Enhancement] Improve cumulative sum functionality and annotations handling
    
    * Updated the `cumsum` function to include detailed documentation and error handling for dimension bounds.
    * Modified the `run_cumsum` test to use a random tensor supply type when profiling, so the test exercises randomized input data.
    * Added annotations to the fused loop in `loop_fusion_utils.h`, ensuring proper metadata is preserved during loop fusion.
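
As a rough illustration of the dimension-bounds handling mentioned for `cumsum` above, here is a minimal pure-Python sketch. It is not the TileLang implementation: the helper names `cumsum_2d_reference` and `_running_sum`, the 2-D restriction, and the choice of `ValueError` are all assumptions made for this example.

```python
def _running_sum(values):
    """Return the inclusive running sum of a 1-D sequence (hypothetical helper)."""
    out, total = [], 0
    for v in values:
        total += v
        out.append(total)
    return out


def cumsum_2d_reference(rows, dim=0):
    """Reference cumulative sum over a 2-D list of numbers (hypothetical helper).

    Assumption: `dim` follows the common convention of accepting values
    in the range [-2, 2) for a 2-D input, with negative values counting
    from the last axis.
    """
    ndim = 2
    # Reject out-of-range dims up front instead of failing later.
    if dim < -ndim or dim >= ndim:
        raise ValueError(f"dim {dim} is out of bounds for a {ndim}-D input")
    if dim < 0:
        dim += ndim
    if dim == 1:
        # Scan along each row.
        return [_running_sum(row) for row in rows]
    # dim == 0: scan along each column, then transpose back to row-major.
    cols = [_running_sum(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

A reference like this can be compared against a kernel's output on randomized inputs; the explicit bounds check makes an invalid `dim` fail early with a readable message.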
    
    * lint fix
test_tilelang_language_cumsum.py 3.37 KB