Unverified commit 1d1b1efa, authored by OlivierDehaene, committed by GitHub

fix(server): fix cohere (#2249)

parent da82c63a
@@ -259,8 +259,8 @@ class FlashCohereAttention(torch.nn.Module):
         cu_seqlen_prefill,
         kv_cache,
         block_tables,
-        input_lengths,
         slots,
+        input_lengths,
         max_s,
     ):
         qkv = self.query_key_value(hidden_states)
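
The hunk above only reorders two parameters, `input_lengths` and `slots`, in a signature inside `FlashCohereAttention`. A plausible reading, which the commit page itself does not confirm, is that the caller passes these arguments positionally, so a mismatched parameter order silently binds each value to the wrong name. A minimal sketch of that failure mode follows. The functions `attention_old` and `attention_new` and the placeholder values are hypothetical stand-ins, not the real model code:

```python
# Hypothetical stand-ins for the signature before and after the fix.
# Only the parameter order differs; the bodies just report what was bound.

def attention_old(block_tables, input_lengths, slots, max_s):
    """Parameter order before the fix (input_lengths ahead of slots)."""
    return {"slots": slots, "input_lengths": input_lengths}


def attention_new(block_tables, slots, input_lengths, max_s):
    """Parameter order after the fix (slots ahead of input_lengths)."""
    return {"slots": slots, "input_lengths": input_lengths}


# Assumed caller: passes arguments positionally in the order
# (block_tables, slots, input_lengths, max_s).
call_args = ("TABLES", "SLOTS", "LENGTHS", 16)

old_binding = attention_old(*call_args)
new_binding = attention_new(*call_args)

print(old_binding)  # slots and input_lengths are swapped
print(new_binding)  # each value is bound to the matching name
```

Passing these arguments by keyword at the call site would make the binding independent of parameter order, at the cost of a more verbose call.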