    [Enhancement] Support tf32 gemm_rs (#607) · 0ff81755
    Lei Wang authored
    - Added a line break in `quickstart.py` for better readability.
    - Simplified the JIT kernel compilation in `quickstart.py` by removing the unused execution backend option.
    - Modified `example_elementwise_add.py` to disable the `tilelang` cache, and optimized the element-wise addition kernel to stage input tensors in shared memory, improving performance.
    - Updated default values for matrix dimensions and block sizes in the argument parser to enhance usability.
gemm_layouts.cc 25.3 KB