[GQA] Add varlen decoding kernel with logits saving (#1223)
* [Example] Add GQA varlen decoding kernel with logits return
* [Example] Support Sink for GQA varlen decoding
* [Example] Add for no-varlen support
* [Tune] Add high performance logits saving
* [Lint]
* [Lint]
* [Rename]