    [Bugfix] Support `T.Parallel` with local register assignment (#395) · 8c5b1341
    Lei Wang authored
    * make it python 3.8- happy
    
    * [Enhancement] Improve loop partitioning and vectorization logic in layout inference and loop vectorization
    
    - Enhanced the `VisitStmt_` method to handle local buffers in parallel loops, so registers can be used without explicit thread binding.
    - Updated the loop vectorization logic to simplify expressions and compute vector sizes accurately.
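
To illustrate what a vector size calculation of this kind can look like, here is a minimal sketch. It is not taken from `loop_vectorize.cc`: the helper name `PickVectorSize`, the 128-bit maximum access width, and the assumption of a constant loop extent are all hypothetical choices made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: one vectorized access moves at most 128 bits.
constexpr int kMaxVectorBits = 128;

// Hypothetical helper (not from loop_vectorize.cc): pick the widest vector
// size, in elements, that fits the assumed access width and divides the
// constant loop extent evenly.
int PickVectorSize(std::int64_t extent, int dtype_bits) {
  assert(extent > 0);
  assert(dtype_bits > 0 && dtype_bits <= kMaxVectorBits);

  // Start from the widest candidate allowed by the element width.
  int vector_size = kMaxVectorBits / dtype_bits;

  // Halve the candidate until it divides the extent; 1 always divides.
  while (vector_size > 1 && extent % vector_size != 0) {
    vector_size /= 2;
  }
  return vector_size;
}
```

Under these assumptions, a loop of extent 1024 over 16-bit elements gets a vector size of 8, while an extent of 6 falls back to 2 because neither 8 nor 4 divides it.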
    
    * lint fix
loop_vectorize.cc 9.14 KB