"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "4b3e55bdccdd87c4e283dd2261b9b57242597c9a"
Fix sliding window attention used in Gemma2FlashAttention2 (#32522)
* fix sliding window attention (flash2) in gemma2 model
* [run-slow] gemma
* fix slicing attention_mask for flash_attn2
* fix slicing attention_mask when flash_attn is used
* add missing comment
* slice the last seq_len tokens in the key, value states
* revert code of slicing key, value states
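
The attention_mask slicing named in the bullets above can be illustrated with a minimal sketch. It is not the code from this commit: it assumes the flash-attention path receives a 2D padding mask of shape (batch, seq_len) and that only the last `sliding_window` columns are kept. The helper `slice_mask_to_window` and the example values are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List


def slice_mask_to_window(
    attention_mask: List[List[int]], sliding_window: int
) -> List[List[int]]:
    """Keep only the last `sliding_window` columns of a 2D padding mask.

    Hypothetical helper for illustration; with tensors the equivalent
    slice would be attention_mask[:, -sliding_window:].
    """
    if sliding_window <= 0:
        raise ValueError("sliding_window must be positive")
    # Negative slicing keeps the most recent positions; rows shorter than
    # the window are returned unchanged.
    return [row[-sliding_window:] for row in attention_mask]


# Batch of two sequences of length 6; the first is left-padded by two tokens.
mask = [
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
sliced = slice_mask_to_window(mask, sliding_window=4)
```

Slicing from the end rather than the start keeps the mask aligned with the most recent tokens, which are the ones a sliding window attends to.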