Add support for more GPU architectures (#76)
* Add support for more GPU architectures
* Merge fmha and mla runner
* Add varlen and non-varlen support, and add non-contiguous tensor support
* Update README
* Add varlen API
---------
Co-authored-by: dianzhangc <dianzhangc@nvidia.com>
csrc/sm100/common/helper.h
0 → 100644
csrc/sm100/common/mask.cuh
0 → 100644
csrc/sm100/common/pow_2.hpp
0 → 100644
csrc/sm100/common/utils.hpp
0 → 100644
csrc/sm100/device/fmha.hpp
0 → 100644