[`Mistral`] Add Flash Attention-2 support for `mistral` (#26464)
* add FA-2 support for mistral
* fixup
* add sliding windows
* fixing few nits
* v1 slicing cache - logits do not match
* add comment
* fix bugs
* more mem efficient
* add warning once
* add warning once
* oops
* fixup
* more comments
* copy
* add safety checker
* fixup
* Update src/transformers/models/mistral/modeling_mistral.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* copied from
* up
* raise when padding side is right
* fixup
* add doc + few minor changes
* fixup

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>