"...git@developer.sourcefind.cn:OpenDAS/torch-spline-conv.git" did not exist on "0a67b0f7d1e97661033c8599d07c6809ddee3d93"
MixtralFlashAttention2: put "plus 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. (#31500)

* Mixtral: remove unnecessary plus 1 when calculating rotary_seq_len, allowing position_ids=None (no auto position_ids generation could be unsafe)
* fix typo [:-1] to [:, -1]
* to meet formatting requirement
* to meet formatting requirement
* remove white space
* MixtralFlashAttention2: put "+ 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. Fix format/style issue.
* propagate to startcoder2, phi3, mixtral and qwen2
* update qwen2_moe
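
The calculation described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the helper name `rotary_seq_len` as a function, its signature, and the use of nested Python lists in place of a `(batch, seq)` tensor are all assumptions made to keep the example self-contained.

```python
from typing import Optional, Sequence


def rotary_seq_len(
    kv_seq_len: int,
    position_ids: Optional[Sequence[Sequence[int]]],
) -> int:
    """Hypothetical helper: length the rotary embedding cache must cover."""
    if position_ids is None:
        # No explicit positions were passed, so fall back to the key/value length.
        return kv_seq_len
    # Take the last column of a (batch, seq) layout. On a tensor this is the
    # [:, -1] slice; [:-1] would instead drop the last batch row.
    last_positions = [row[-1] for row in position_ids]
    # Positions are zero-based, so the largest position p needs p + 1 slots.
    # The "+ 1" sits inside max(), so it is never added to kv_seq_len itself.
    return max(kv_seq_len, max(last_positions) + 1)
```

With the `+ 1` outside the parentheses, a call such as `rotary_seq_len(4, [[0, 1, 2, 3]])` would yield 5 instead of 4 in this sketch, which is the over-count the commit title refers to.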