chenpangpang / transformers
Commit 924c46d4 (Unverified)
Authored Jun 03, 2024 by Younes Belkada; committed by GitHub on Jun 03, 2024
Cohere: Fix copied from (#31213)
Update modeling_cohere.py
Parent: 98dd8423
Changes: 1 changed file with 2 additions and 1 deletion
src/transformers/models/cohere/modeling_cohere.py (+2 -1)
@@ -310,7 +310,7 @@ class CohereAttention(nn.Module):
         return attn_output, attn_weights, past_key_value
 
 
-# Copied from transformers.models.llama.modeling_llama.LlamaFlashAttention2 Llama->Cohere
+# Copied from transformers.models.llama.modeling_llama.LlamaFlashAttention2 with Llama->Cohere
 class CohereFlashAttention2(CohereAttention):
     """
     Cohere flash attention module. This module inherits from `CohereAttention` as the weights of the module stays
...
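The hunk above only inserts the `with` keyword between the source path and the `Llama->Cohere` rename. As a rough illustration of why that keyword can matter, here is a minimal sketch of a parser for such comments. The regular expression and the `parse_copied_from` helper are hypothetical and are not the pattern used by the repository's own copy-checking tooling; they only assume a grammar of the form `# Copied from <path> with <Old>-><New>`.

```python
import re

# Hypothetical grammar for illustration only; NOT the repository's actual checker.
COPIED_FROM = re.compile(
    r"^\s*#\s*Copied from\s+(?P<source>[\w.]+)"
    r"(?:\s+with\s+(?P<old>\w+)->(?P<new>\w+))?\s*$"
)


def parse_copied_from(line):
    """Return (source, old, new), or None if the line does not fit the assumed grammar."""
    match = COPIED_FROM.match(line)
    if match is None:
        return None
    return match.group("source"), match.group("old"), match.group("new")


# The two versions of the comment from the diff.
old_line = "# Copied from transformers.models.llama.modeling_llama.LlamaFlashAttention2 Llama->Cohere"
new_line = "# Copied from transformers.models.llama.modeling_llama.LlamaFlashAttention2 with Llama->Cohere"

print(parse_copied_from(old_line))  # the rename is not introduced by `with`
print(parse_copied_from(new_line))
```

Under this assumed grammar, the old comment is rejected outright, while the new one yields the source path together with the `Llama` to `Cohere` substitution.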
@@ -326,6 +326,7 @@ class CohereFlashAttention2(CohereAttention):
         # Beware that with flash_attn<2.1, using q_seqlen != k_seqlen (except for the case q_seqlen == 1) produces a wrong mask (top-left).
         self._flash_attn_uses_top_left_mask = not is_flash_attn_greater_or_equal_2_10()
 
+    # Ignore copy
     def forward(
         self,
         hidden_states: torch.Tensor,
...
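The context comment in this hunk mentions a "top-left" causal mask. A minimal sketch of the difference between top-left and bottom-right alignment follows, using plain Python lists rather than the flash-attn kernels. The `causal_mask` and `render` helpers are hypothetical names introduced here for illustration; the only assumption is that bottom-right alignment shifts the diagonal by `k_len - q_len`.

```python
def causal_mask(q_len, k_len, top_left):
    """Build a boolean mask where True means query i may attend to key j.

    top_left=True anchors the causal diagonal at the top-left corner;
    top_left=False anchors it at the bottom-right corner (illustrative assumption).
    """
    offset = 0 if top_left else k_len - q_len
    return [[j <= i + offset for j in range(k_len)] for i in range(q_len)]


def render(mask):
    """Render each mask row as a string of 1s (visible) and 0s (masked)."""
    return ["".join("1" if allowed else "0" for allowed in row) for row in mask]


# Two new queries attending over four keys (e.g. two cached tokens plus two new ones).
print(render(causal_mask(2, 4, top_left=True)))   # ['1000', '1100']
print(render(causal_mask(2, 4, top_left=False)))  # ['1110', '1111']
```

With equal query and key lengths the two alignments coincide; they diverge only when `q_len != k_len`, which is the situation the comment warns about.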