Zeyu WANG authored
* Add support for more GPU architectures
* Merge fmha and mla runner
* Add varlen and non-varlen support, and add non-contiguous tensor support
* Update readme
* Add varlen API

---------

Co-authored-by: dianzhangc <dianzhangc@nvidia.com>
41b611f7