"src/vscode:/vscode.git/clone" did not exist on "d8d14fbfbaa49749cc0302e7dc1e01ae247c479f"
[Enhancement] Keep max score attention across blocks in FlashAttention for better numerical stability (#1269)

* Implement max score retention across blocks in FlashAttention for improved stability
* fix manual pipeline parameters
* Update examples/flash_attention/example_gqa_fwd_varlen.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* fix typo
* more
* fix a previous typo

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
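
For readers unfamiliar with the technique named in the title, the following is a minimal pure-Python sketch of a blockwise softmax that carries its maximum score from one block to the next. It is not the code from this change: the function names, the single-query-row setup, and the scalar values are all assumptions made for illustration, and it assumes every score is finite (no fully masked blocks).

```python
import math


def blockwise_softmax_weighted_sum(scores, values, block_size):
    """Return sum_j softmax(scores)[j] * values[j], processed block by block.

    Hypothetical helper for illustration; assumes all scores are finite.
    """
    running_max = -math.inf  # largest score seen in any block so far
    running_sum = 0.0        # softmax denominator, relative to running_max
    acc = 0.0                # weighted sum of values, relative to running_max

    for start in range(0, len(scores), block_size):
        block_scores = scores[start:start + block_size]
        block_values = values[start:start + block_size]

        # Keep the maximum across blocks: the reference point never drops
        # below the maximum of the blocks already processed.
        new_max = max(running_max, max(block_scores))

        # Rescale what was accumulated under the old reference maximum.
        # The exponent is <= 0, so the factor is in [0, 1] and cannot overflow.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc *= correction

        for s, v in zip(block_scores, block_values):
            p = math.exp(s - new_max)  # s <= new_max, so p <= 1
            running_sum += p
            acc += p * v

        running_max = new_max

    return acc / running_sum


def reference_softmax_weighted_sum(scores, values):
    """Single-pass reference that uses the global maximum directly."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

Because every exponent is taken relative to a maximum that only grows, each call to `math.exp` receives a non-positive argument, which is what keeps large scores from overflowing. Comparing the two functions on the same inputs, including inputs with very large scores, is a quick way to check the blockwise version.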