    [CI] Update CI configuration to run pytest with automatic parallelization (#393) · 6d3d4743
    Lei Wang authored
    * Update CI configuration to run pytest with automatic parallelization using the '-n auto' option.
    
    * Enhance Cython JIT Adapter Compilation Logic
    
    - Improved the locking mechanism during the compilation of the Cython JIT adapter to prevent race conditions.
    - Added checks to determine if another process has already compiled the library, reducing unnecessary recompilation.
    - Cleaned up the code by removing redundant imports and ensuring proper handling of temporary files during compilation failures.
    - Updated vectorization logic in loop_vectorize.cc to allow optional simplification of vectorized expressions.
    
    This update enhances performance and reliability in the JIT compilation process.
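The locking and "already compiled" check described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `compile_once`, the `.lock` suffix, and the `build` callback are hypothetical names, and `fcntl.flock` is assumed to be available (POSIX only).

```python
import fcntl
import os
import tempfile
from pathlib import Path
from typing import Callable


def compile_once(lib_path: Path, build: Callable[[Path], None]) -> bool:
    """Build lib_path unless it already exists. Returns True if this call built it."""
    if lib_path.exists():  # fast path: nothing to do, no lock needed
        return False
    # Hypothetical lock file next to the library.
    lock_path = lib_path.with_suffix(lib_path.suffix + ".lock")
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks while another process compiles
        try:
            if lib_path.exists():  # another process finished while we waited
                return False
            fd, tmp_name = tempfile.mkstemp(dir=lib_path.parent, suffix=".tmp")
            os.close(fd)
            tmp = Path(tmp_name)
            try:
                build(tmp)                 # write the library to a temporary file
                os.replace(tmp, lib_path)  # atomic rename on the same filesystem
            except BaseException:
                tmp.unlink(missing_ok=True)  # do not leave partial output behind
                raise
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Re-checking for the library after acquiring the lock is what lets a waiting process skip recompilation, and building into a temporary file means a failed compile never leaves a half-written library at the final path.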
    
    * lint fix
    
    * Update CI configuration to run pytest with 4 parallel jobs instead of auto-detection
    
    * Add pytest markers for serial execution in MHA tests
    
    - Added @pytest.mark.serial to multiple MHA test functions to ensure they run sequentially.
    - This change improves test reliability by preventing potential race conditions during execution.
    
    * Update TVM submodule and enhance vectorization logic in loop_vectorize.cc
    
    - Updated the TVM submodule to the latest commit.
    - Modified the vectorization logic to include optional simplification of vectorized expressions.
    - Added checks to ensure the usage of vectorized variables, improving performance and reliability in expression handling.
    
    * Remove @pytest.mark.serial from multiple MHA test functions to allow parallel execution. This change enhances test performance by enabling concurrent test runs while maintaining reliability.
    
    * Remove tvm_simplify_test.py file, eliminating the test for expression simplification in TVM. This cleanup helps streamline the codebase by removing unused test cases.
    
    * Remove unused pytest import from test_tilelang_kernel_mha.py to streamline the test file.
    
    * lint fix
    
    * Update TVM submodule and refine vectorization logic in loop_vectorize.cc
    
    - Updated the TVM submodule to the latest commit.
    - Adjusted the return statements in loop_vectorize.cc to improve expression handling and ensure consistency in the visitor pattern.
    
    * Refactor vectorization logic in loop_vectorize.cc
    
    - Removed the check for the usage of the vectorized variable in the vectorization logic, simplifying the expression handling.
    - This change enhances the clarity and efficiency of the vectorization process.
    
    * Enhance vectorization checks in loop_vectorize.cc
    
    - Added a check to ensure the vectorized expression uses the vectorized variable, improving the robustness of the vectorization logic.
    - This change refines the expression handling and ensures that only valid vectorized expressions are processed.
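As an illustration of what "uses the vectorized variable" means, here is a minimal sketch on a toy expression tree. The `Var`, `Const`, `BinOp`, and `uses_var` names are hypothetical and stand in for the TVM expression types that loop_vectorize.cc works with. They are not TVM's API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Var, Const, BinOp]


def uses_var(expr: Expr, var: Var) -> bool:
    """Return True if `var` appears anywhere inside `expr`."""
    if isinstance(expr, Var):
        return expr == var
    if isinstance(expr, BinOp):
        return uses_var(expr.lhs, var) or uses_var(expr.rhs, var)
    return False  # constants never reference a variable


i, j = Var("i"), Var("j")
index = BinOp("+", BinOp("*", i, Const(4)), j)  # i * 4 + j
print(uses_var(index, i))                   # the index depends on i
print(uses_var(BinOp("+", j, Const(1)), i))  # j + 1 does not
```

An expression that never mentions the vectorized loop variable is the same for every lane, so a check of this kind is one plausible way to decide whether it needs the vectorized code path at all.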
    
    * Implement non-local buffer checks for loop vectorization in layout_inference.cc
    
    - Added logic to check for non-local buffer loads and stores before applying vectorization to loops. This enhancement ensures that vectorization is only applied when appropriate, improving the correctness of the loop transformations.
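The gate described above might look roughly like the sketch below. It rests on two assumptions not stated in the commit message: that "local" means the storage scopes listed in `LOCAL_SCOPES`, and that a loop is vectorized only when at least one of its accesses touches a non-local buffer. `Access`, `has_non_local_access`, and `maybe_vectorize` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Assumption: these scopes count as "local" for the purpose of the check.
LOCAL_SCOPES = {"local", "local.fragment"}


@dataclass(frozen=True)
class Access:
    buffer: str
    scope: str  # storage scope of the buffer, e.g. "global", "shared", "local"
    kind: str   # "load" or "store"


def has_non_local_access(accesses: Iterable[Access]) -> bool:
    """True if any load or store in the loop body touches a non-local buffer."""
    return any(a.scope not in LOCAL_SCOPES for a in accesses)


def maybe_vectorize(loop: str, accesses: List[Access],
                    vectorize: Callable[[str], str]) -> str:
    """Apply `vectorize` only when the loop touches a non-local buffer (assumed policy)."""
    return vectorize(loop) if has_non_local_access(accesses) else loop


copy_loop = [Access("A", "global", "load"), Access("A_local", "local", "store")]
register_loop = [Access("acc", "local.fragment", "load"),
                 Access("acc", "local.fragment", "store")]
print(maybe_vectorize("copy", copy_loop, lambda l: l + "_vec"))
print(maybe_vectorize("accumulate", register_loop, lambda l: l + "_vec"))
```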
    
    * Refactor buffer handling in pipeline planning and layout inference
    
    - Renamed GlobalCopyPatternDetector to BufferRegionCollector for clarity and updated its logic to collect buffer read/write regions.
    - Enhanced the handling of conditional expressions in pipeline planning, allowing for better management of stages related to conditional statements.
    - Improved the processing of buffer regions during read/write operations, ensuring accurate tracking of buffer usage across different stages.
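To make the idea of collecting buffer read/write regions concrete, here is a small sketch over a toy statement encoding. It does not reproduce BufferRegionCollector itself: the tuple-based statement forms and the `RegionCollector` class are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# A region is (buffer name, (start, extent)).
Region = Tuple[str, Tuple[int, int]]


@dataclass
class RegionCollector:
    """Walks toy statements and records which buffer regions are read and written."""
    reads: List[Region] = field(default_factory=list)
    writes: List[Region] = field(default_factory=list)

    def visit(self, stmt: tuple) -> None:
        kind = stmt[0]
        if kind == "load":       # ("load", buffer, start, extent)
            self.reads.append((stmt[1], (stmt[2], stmt[3])))
        elif kind == "store":    # ("store", buffer, start, extent, value)
            self.visit(stmt[4])  # the stored value may itself read buffers
            self.writes.append((stmt[1], (stmt[2], stmt[3])))
        elif kind == "if":       # ("if", condition, body)
            self.visit(stmt[1])  # reads in the condition count as well
            self.visit(stmt[2])
        elif kind == "seq":      # ("seq", [stmt, ...])
            for child in stmt[1]:
                self.visit(child)
        else:
            raise ValueError(f"unknown statement kind: {kind!r}")


copy = ("store", "A_shared", 0, 128, ("load", "A", 0, 128))
guarded = ("if", ("load", "flag", 0, 1), ("seq", [copy]))
collector = RegionCollector()
collector.visit(guarded)
print(collector.reads)
print(collector.writes)
```

Recording the reads made by a condition alongside those in the guarded body is one way a pipeline planner could assign a conditional statement to a stage based on the buffers it depends on.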
    
    * Refactor vectorization checks in loop_vectorize.cc
    
    - Removed the check for the usage of the vectorized variable in the vectorization logic, simplifying the expression handling.
    - This change enhances the clarity and efficiency of the vectorization process, ensuring that valid vectorized expressions are processed without unnecessary checks.
layout_inference.cc 21.6 KB