[GQA] Add varlen decoding kernel with logits saving (#1223)
* [Example] Add GQA varlen decoding kernel with logits return
* [Example] Support Sink for GQA varlen decoding
* [Example] Add for no-varlen support
* [Tune] Add high performance logits saving
* [Lint]
* [Lint]
* [Rename]