Add more GPU architectures support (#76)
* Add more GPU architectures support
* Merge fmha and mla runner
* add varlen & non-varlen support, and add non-contiguous tensor support
* update readme
* add varlen api
---------
Co-authored-by: dianzhangc <dianzhangc@nvidia.com>
tests/test_fmha_sm100.py (new file, mode 0 → 100644)