Commit 1210e4d9 authored by Alexander Matveev, committed by GitHub

[Bugfix] [B200] cutlass_mla - ensure kv_split == 1 for batch size > 1 (#25509)


Signed-off-by: Alexander Matveev <amatveev@redhat.com>
parent e0b24ea0
@@ -135,10 +135,10 @@ public:
     max_splits = min(16, max_splits);
     // TODO: This avoids a hang when the batch size larger than 1 and
-    // there is more than 4 kv_splits.
+    // there is more than 1 kv_splits.
     // Discuss with NVIDIA how this can be fixed.
     if (B > 1) {
-      max_splits = min(2, max_splits);
+      max_splits = min(1, max_splits);
     }
     // printf(" max_splits = %d\n", max_splits);
......
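
The effect of the change can be illustrated with a small standalone sketch. This is not code from the patched file: `clamp_max_splits` is a hypothetical helper, `batch_size` stands in for `B`, and the only logic taken from the commit is the cap of 16 followed by the cap of 1 when the batch size exceeds 1.

```cpp
#include <algorithm>

// Hypothetical helper, not part of the patched file. It isolates the
// clamping from the hunk so the post-commit behavior is easy to check.
// `batch_size` stands in for `B` in the original code.
inline int clamp_max_splits(int batch_size, int max_splits) {
  // General upper bound, unchanged by the commit.
  max_splits = std::min(16, max_splits);
  // Workaround from the commit: with more than one sequence in the batch,
  // allow at most a single KV split to avoid the reported hang.
  if (batch_size > 1) {
    max_splits = std::min(1, max_splits);
  }
  return max_splits;
}
```

Under these assumptions, a batch size of 1 still allows up to 16 splits, while any larger batch is limited to one split. Before the commit, the second cap was 2 rather than 1.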