    [Bugfix] Fix `T.copy` for scalar datatypes (#190) · 454248c7
    Lei Wang authored
    * Optimize CMake build process with dynamic job count calculation
    
    - Modify build_csrc function to use 90% of available CPU cores
    - Ensure at least one job is used during compilation
    - Improve build performance by dynamically adjusting parallel job count
    
    * Optimize build_csrc function with multiprocessing module
    
    - Replace os.cpu_count() with multiprocessing.cpu_count()
    - Maintain existing 90% CPU utilization logic
    - Improve CPU core count calculation for build process
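A minimal sketch of the job-count rule described in the two build items above (90% of the cores, never fewer than one job). The helper names `jobs_for` and `default_build_jobs` are hypothetical and are not taken from `build_csrc`; only `multiprocessing.cpu_count()` is named in the commit message.

```python
import multiprocessing


def jobs_for(total_cores: int, fraction: float = 0.9) -> int:
    """Return the number of parallel build jobs for a given core count.

    Hypothetical helper: truncates fraction * total_cores toward zero and
    clamps the result so that at least one job is always used.
    """
    return max(1, int(total_cores * fraction))


def default_build_jobs() -> int:
    """Apply the rule to the machine running the build."""
    return jobs_for(multiprocessing.cpu_count())


if __name__ == "__main__":
    # The value could be passed to a parallel build, for example as a -j argument.
    print(f"building with {default_build_jobs()} job(s)")
```

The `max(1, ...)` clamp matters on single-core machines, where 90% of one core would otherwise truncate to zero jobs.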
    
    * Add dynamic shape support with out_idx in Cython JIT kernel compilation
    
    - Implement `run_cython_dynamic_shape_with_out_idx` function in test_tilelang_jit_gemm_cython.py
    - Update Cython wrapper to handle dynamic symbolic shapes during tensor allocation
    - Add support for resolving dynamic shape dimensions using input tensor references
    - Enhance flexibility of JIT kernel compilation with symbolic shape handling
    
    * Enhance error reporting for dynamic symbolic shape resolution in Cython JIT kernel
    
    - Add detailed error message when a dynamic symbolic dimension is not found in dynamic_symbolic_map
    - Improve debugging by providing context about missing symbolic dimensions
    - Maintain existing dynamic shape resolution logic
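The dynamic shape resolution and its error reporting can be illustrated with a simplified pure-Python model. This is a sketch under stated assumptions, not the Cython wrapper itself: the function name `resolve_output_shape` is hypothetical, symbolic dimensions are modeled as strings, and `dynamic_symbolic_map` is assumed to map each symbol to an `(input_index, dim_index)` pair.

```python
def resolve_output_shape(out_shape, dynamic_symbolic_map, input_shapes):
    """Replace symbolic dimensions in out_shape with concrete sizes.

    Assumed layout (illustration only):
      out_shape            -- tuple of ints and symbol names (str)
      dynamic_symbolic_map -- {symbol: (input_index, dim_index)}
      input_shapes         -- list of concrete shapes of the input tensors
    """
    resolved = []
    for dim in out_shape:
        if isinstance(dim, int):
            # Static dimension: use it unchanged.
            resolved.append(dim)
            continue
        if dim not in dynamic_symbolic_map:
            # Report which symbol is missing and which symbols are known.
            raise ValueError(
                f"Dynamic symbolic dimension {dim!r} not found in "
                f"dynamic_symbolic_map; known symbols: "
                f"{sorted(dynamic_symbolic_map)}"
            )
        input_index, dim_index = dynamic_symbolic_map[dim]
        # Read the concrete size from the referenced input tensor.
        resolved.append(input_shapes[input_index][dim_index])
    return tuple(resolved)


if __name__ == "__main__":
    # "m" is taken from dimension 0 of the first input.
    shape = resolve_output_shape(
        ("m", 1024), {"m": (0, 0)}, [(256, 512), (512, 1024)]
    )
    print(shape)
```

Including the list of known symbols in the error message gives the kind of debugging context the commit describes: the caller can see at once whether the symbol was misspelled or never registered.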
    
    * Fix Copy operation handling for scalar and multi-dimensional tensors
    
    - Add special handling for scalar tensor copy operations
    - Enhance error reporting in MakeIndices method with more detailed diagnostic information
    - Improve SIMT loop generation to support zero-dimensional tensors
    - Add explicit check and handling for scalar tensor scenarios
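To show why zero-dimensional tensors need their own path, here is a small Python model of an element-wise copy loop. It is only an analogy for the behavior described above and does not reproduce the C++ code in `elem.cc`; the function name `elementwise_copy` and the dict-based buffers keyed by index tuples are assumptions made for the illustration.

```python
from itertools import product


def elementwise_copy(src: dict, shape: tuple) -> dict:
    """Copy every element of src into a new buffer.

    Buffers are modeled as dicts keyed by index tuples. A scalar
    (zero-dimensional) tensor has shape () and a single element at index ().
    """
    if len(shape) == 0:
        # Explicit scalar path: there are no loop variables to generate,
        # so copy the single element directly.
        return {(): src[()]}
    dst = {}
    # General path: one loop variable per dimension.
    for index in product(*(range(extent) for extent in shape)):
        dst[index] = src[index]
    return dst


if __name__ == "__main__":
    print(elementwise_copy({(): 3.5}, ()))
    print(elementwise_copy({(0,): 1, (1,): 2}, (2,)))
```

A loop nest built from one variable per dimension has nothing to iterate over when there are no dimensions, so the scalar case is handled before any loop is constructed.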
    
    * Refactor Copy operation code formatting and improve readability
    
    - Improve code formatting in MakeIndices and MakeSIMTLoop methods
    - Add line breaks to enhance readability of complex ICHECK statements
    - Simplify code structure in scalar tensor handling
    - Remove unnecessary whitespace and improve code alignment
elem.cc 14.1 KB