-
徐畅 authored
* [CI] Add flash_decoding example to CI * Add output of ref latency * format example_gqa_decode.py * [BugFix] Fix precision issue in GQA decode when block_N exceeds seqlen/num_split * format example_gqa_decode.py
67d0b677
* [CI] Add flash_decoding example to CI * Add output of ref latency * format example_gqa_decode.py * [BugFix] Fix precision issue in GQA decode when block_N exceeds seqlen/num_split * format example_gqa_decode.py