    [Dev] Update benchmark and decoding scripts to refine condition checks and... · e937faa6
    Yu Cheng authored
    [Dev] Update benchmark and decoding scripts to refine condition checks and optimize tensor operations (#637)
    
    - Enhanced the condition in `compare_ab` to ensure baseline checks align with target exclusions.
    - Removed unnecessary tensor allocation in `mla_decode_tilelang`, optimizing memory usage and improving performance by directly using shared tensors in GEMM operations.
benchmark_mla.py 20.8 KB