MixtralFlashAttention2: put "plus 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. (#31500)

* Mixtral: remove unnecessary plus 1 when calculating rotary_seq_len, allowing position_ids=None (no auto position_ids generation could be unsafe)
* fix typo [:-1] to [:, -1]
* to meet formatting requirement
* to meet formatting requirement
* remove white space
* MixtralFlashAttention2: put "+ 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. Fix format/style issue.
* propagate to starcoder2, phi3, mixtral and qwen2
* update qwen2_moe
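
The calculation described in the commit title can be illustrated with a minimal sketch. This is not the code from the commit: the helper name `rotary_seq_len_for` and the use of plain nested lists instead of tensors are assumptions made for illustration. It shows the "+ 1" applied to the largest last-column position (the `[:, -1]` column) inside the `max(...)`, and the fallback to `kv_seq_len` when `position_ids` is None.

```python
from typing import Optional, Sequence


def rotary_seq_len_for(
    kv_seq_len: int,
    position_ids: Optional[Sequence[Sequence[int]]],
) -> int:
    """Hypothetical helper: length the rotary embedding cache must cover."""
    # With no position_ids, fall back to the key/value sequence length.
    if position_ids is None:
        return kv_seq_len
    # Take the last position of every row, i.e. the [:, -1] column.
    last_positions = [row[-1] for row in position_ids]
    # Positions are 0-based, so index p needs a length of p + 1.
    # The "+ 1" sits inside max(...), so it only affects the position term.
    return max(kv_seq_len, max(last_positions) + 1)


if __name__ == "__main__":
    print(rotary_seq_len_for(4, None))
    print(rotary_seq_len_for(4, [[0, 1, 2, 3]]))
    print(rotary_seq_len_for(4, [[6, 7, 8, 9], [0, 1, 2, 3]]))
```

With the "+ 1" outside the parentheses, as in `max(kv_seq_len, last) + 1`, the second call would yield 5 rather than 4, one more than the sequence requires.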