- 15 Aug, 2025 1 commit
Gabriel Wu authored
* chore: fix typos
* chore: fix ruff
* chore: fix clang-format

- 08 Jul, 2025 1 commit
Lei Wang authored
* [Refactor] Update tilelang kernel functions and remove unused imports
  - Refactored the `flashattn_fwd`, `flashattn_bwd_preprocess`, and `flashattn_bwd_postprocess` functions to utilize direct kernel calls instead of cached versions, improving clarity and performance.
  - Added `@tilelang.jit` decorators with specified output indices to enhance kernel compilation.
  - Removed unused import of `cached` from `tilelang`, streamlining the code.
  - Commented out the main testing function call in `test_tilelang_kernel_mha_bwd.py` for potential future use.
* [Refactor] Simplify configuration generation in benchmark and example scripts
  - Refactored the `get_configs` functions in multiple benchmark and example scripts to utilize a dictionary-based approach for parameter configuration, improving readability and maintainability.
  - Updated the `flashattn` and `chunk_scan_fwd` functions to directly accept configuration parameters, enhancing flexibility in kernel tuning.
  - Removed redundant code and streamlined the configuration generation process across various files, ensuring consistency in how configurations are defined and utilized.
* [Refactor] Update configuration handling in benchmark scripts
  - Refactored the `get_configs` functions in benchmark scripts to accept a variable argument list, improving flexibility in configuration management.
  - Enhanced the `matmul` and `flashattn` functions to utilize the updated configuration approach, streamlining parameter handling for kernel tuning.
  - Added `@autotune` decorators to relevant functions, ensuring consistent autotuning behavior across benchmarks.
  - Cleaned up redundant code and improved overall readability in the affected files.
* [Refactor] Clean up formatting and update subproject commit
  - Updated the subproject commit reference in the TVM directory to indicate a dirty state.
  - Removed unnecessary blank lines and improved formatting in the `benchmark_matmul` and `benchmark_matmul_fp8` scripts for better readability.
  - Streamlined the function definitions in the `flashattn` example script to enhance clarity and maintainability.
* [Refactor] Update AutoTuner configuration handling
  - Modified the AutoTuner class to check if kernel parameters are set before processing tunable arguments, improving robustness in configuration handling.
  - Enhanced the logic for skipping compilation when tunable parameters are already provided, ensuring efficient use of resources.
  - Updated comments for clarity and maintainability.
* lint fix
* Update TVM subproject commit to indicate dirty state and modify MHA backward test cases
  - Updated the subproject commit reference in the TVM directory to reflect a dirty state.
  - Adjusted the `test_mha_bwd` function to use a new configuration for the MHA backward tests, changing the context size from 128 to 256.
  - Uncommented the main testing function call for potential execution.
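
For readers unfamiliar with the pattern, the dictionary-based `get_configs` approach described in this commit can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the code from the commit: the parameter names (`block_M`, `block_N`, `num_stages`) and their candidate values are hypothetical and chosen only for illustration.

```python
import itertools


def get_configs():
    """Build every combination of tunable parameters as a list of dicts.

    The parameter names and candidate values below are illustrative
    assumptions, not the ones used in the actual benchmark scripts.
    """
    params = {
        "block_M": [64, 128],
        "block_N": [64, 128],
        "num_stages": [1, 2],
    }
    keys = list(params)
    # Cartesian product over the candidate lists, one dict per combination.
    return [dict(zip(keys, values)) for values in itertools.product(*params.values())]


configs = get_configs()
```

Keeping the search space in a single dictionary means adding a tunable parameter is a one-line change, and each resulting dict can be unpacked as keyword arguments into a kernel factory.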
- 05 Apr, 2025 1 commit
yeh-sudo authored
This pull request changes the `gemv.md` file: it adds a heading level to the document title so that the heading hierarchy is correct.

- 28 Mar, 2025 1 commit
botbw authored
* [doc/example] init gemv doc and examples
* [example] add vectorized read
* [example] use local register instead of smem
* [example] add bench
* [doc] update doc
* [doc] refine doc
* [lint] format code
* [doc] add tips
* [doc/example] fix typo
* [example] use tmv_all_reduce
* [doc] update doc accordingly
* [doc] add benchmark table
* [lint] format code

- 13 Feb, 2025 1 commit
Wenhao Xie authored
* [CI] Clean up target repository before publishing documentation.
* [Doc] Convert docs from rst format to Markdown format.

- 26 Jan, 2025 1 commit
Lei Wang authored
* implement jit test case
* [Dev] implement auto tune test case for matrix multiplication
* Implement test for legalize memory access and vectorized loop
* lint fix
* introduce run_once
* Refactor callback function names for consistency and improve code readability
* enhance documentations
* lint fix
* lint fix
* lint fix
* lint fix
* fix formatting issues in rt_mod_hip.cc
* add random seed initialization for deterministic testing