[Example] Add sparse gqa decode example (#332)
* add example gqa decode wgmma pipelined
* add sparse gqa
* support num split
* support num split
* add if condition
* add heuristic num split
* clean code
* add ref
* fix bug
* add torch ref
* fix bug
* integrate to torch
* symbolic
* clean mask
* rm actual_num_blocks
* clean code
* get num_sm via torch
* add sparse gqa decode example
* format
* rm example_gqa_decode_wgmma_pipelined.py
* Add license headers to example scripts
* format
* Remove commented-out cache disabling lines
---------
Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>