"examples/vscode:/vscode.git/clone" did not exist on "a7730272e4aeeed198b855b7f36ef7ac88cdd76b"
    [Bugfix] Support larger than 256 box size tma copy (#413) · bf824406
    Lei Wang authored
    * [New Feature] Add FP8 Flash Attention Implementation (#412)
    
    * Introduce a new example script for FP8 Flash Attention in `example_mla_decode_kv_fp8.py`, showcasing the use of tilelang for efficient attention computation.
    * Implement the `flashattn` function with optimized memory management and kernel execution.
    * Include a reference program for comparison and performance evaluation.
    * Add command-line argument parsing for batch size, number of heads, and dimensions to facilitate testing and experimentation (a sketch of both patterns follows below).
    * Enhance the overall structure and readability of the code.
    
    This addition aims to improve the performance of attention mechanisms in deep learning models by leveraging FP8 precision and optimized kernel execution.
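
    As a rough illustration of the reference-program and argument-parsing pattern the bullets describe, here is a minimal, hypothetical PyTorch sketch. It is not the actual `example_mla_decode_kv_fp8.py`; the parameter names, defaults, and shapes are assumptions, and the float baseline stands in for the FP8 tilelang kernel's comparison target:
    
    ```python
    # Hypothetical sketch of a reference attention program; not the actual
    # example_mla_decode_kv_fp8.py. Parameter names and defaults are assumed.
    import argparse
    import torch
    
    def ref_program(q, k, v):
        # Plain scaled-dot-product attention computed in float32, usable as
        # a numerical baseline for an FP8 kernel's output.
        scale = q.shape[-1] ** -0.5
        scores = torch.einsum("bhqd,bhkd->bhqk", q.float(), k.float()) * scale
        probs = torch.softmax(scores, dim=-1)
        return torch.einsum("bhqk,bhkd->bhqd", probs, v.float())
    
    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--batch", type=int, default=1, help="batch size")
        parser.add_argument("--heads", type=int, default=32, help="number of heads")
        parser.add_argument("--seq_len", type=int, default=4096, help="KV cache length")
        parser.add_argument("--dim", type=int, default=128, help="head dimension")
        args = parser.parse_args()
    
        # Decode setting: a single query position attends over the full KV cache.
        q = torch.randn(args.batch, args.heads, 1, args.dim, dtype=torch.float16)
        k = torch.randn(args.batch, args.heads, args.seq_len, args.dim, dtype=torch.float16)
        v = torch.randn(args.batch, args.heads, args.seq_len, args.dim, dtype=torch.float16)
    
        out = ref_program(q, k, v)
        print("reference output shape:", tuple(out.shape))
    
    if __name__ == "__main__":
        main()
    ```
    
    A kernel output would then be checked against this baseline with something like `torch.testing.assert_close` at a loosened tolerance, since FP8 precision widens the acceptable error.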
    
    * lint fix
    
    * optimize quick start
    
    * lint fix