"docs/source/git@developer.sourcefind.cn:OpenDAS/nni.git" did not exist on "632df3ea2f99f3c8e4d2a16fab6ebe4303609da1"
[Enhancement] Keep max score attention across blocks in FlashAttention for better numerical stability (#1269)

* Implement max score retention across blocks in FlashAttention for improved stability
* fix manual pipeline parameters
* Update examples/flash_attention/example_gqa_fwd_varlen.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* fix typo
* more
* fix a previous typo

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
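
For readers unfamiliar with the idea in the title, here is a minimal sketch of why the running max score is carried across blocks. It is plain Python for a single query vector, not the kernel code touched by this commit. The names `blockwise_attention`, `reference_attention`, and `block_size` are hypothetical and chosen only for illustration. It assumes every block contains at least one finite score.

```python
import math

def reference_attention(q, keys, values):
    """Softmax attention for one query, computed over all keys at once."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)                          # global max, subtracted for stability
    w = [math.exp(s - m) for s in scores]
    total = sum(w)
    dim = len(values[0])
    return [sum(wj * v[d] for wj, v in zip(w, values)) / total for d in range(dim)]

def blockwise_attention(q, keys, values, block_size):
    """Same result, computed block by block with a running max (online softmax)."""
    m = float("-inf")                        # max score over all blocks seen so far
    l = 0.0                                  # softmax denominator, relative to m
    acc = [0.0] * len(values[0])             # weighted value sum, relative to m
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in k_blk]
        # Keep the max across blocks: the new max never drops below the old one,
        # so every exponent below is <= 0 and exp() cannot overflow.
        m_new = max(m, max(scores))
        scale = math.exp(m - m_new)          # rescales earlier partial results (0.0 on first block)
        p = [math.exp(s - m_new) for s in scores]
        l = l * scale + sum(p)
        acc = [a * scale + sum(pj * v[d] for pj, v in zip(p, v_blk))
               for d, a in enumerate(acc)]
        m = m_new
    return [a / l for a in acc]
```

If `m_new` were taken from the current block alone, a later block with smaller scores would make `m - m_new` positive, and the rescaling factor could grow without bound. Taking the max with the previous value avoids that.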