OpenDAS / vllm_cscc
Commit 195d1ca3 (Unverified)
Authored Mar 10, 2026 by Woosuk Kwon
Committed by GitHub, Mar 10, 2026
[Minor] Enhance error message for TRTLLM decode uniformity check (#36609)
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
parent 8d983d7c
Showing 1 changed file with 2 additions and 1 deletion

vllm/v1/attention/backends/flashinfer.py (+2 -1)
@@ -1110,7 +1110,8 @@ class FlashInferMetadataBuilder(AttentionMetadataBuilder[FlashInferMetadata]):
             if num_decodes > 0:
                 if decode_use_trtllm:
                     assert num_decode_tokens % num_decodes == 0, (
-                        "TRTLLM decode requires uniform query lengths per request."
+                        "TRTLLM decode requires uniform query lengths per request. "
+                        f"Got {num_decode_tokens=} and {num_decodes=}."
                     )
                     attn_metadata.decode = TRTLLMDecode(
                         block_tables=block_table_tensor[:num_decodes],
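To illustrate what the extended message looks like when the check fails, here is a minimal standalone sketch. It is not part of the commit: `check_uniform_decode` and `failure_message` are hypothetical helpers that only mirror the assertion in the diff, and the input values are made up. The `{name=}` form inside an f-string (Python 3.8+) renders both the variable name and its value.

```python
def check_uniform_decode(num_decode_tokens: int, num_decodes: int) -> int:
    """Hypothetical helper mirroring the assertion from the diff.

    Returns the per-request query length when the token count divides evenly.
    """
    assert num_decode_tokens % num_decodes == 0, (
        "TRTLLM decode requires uniform query lengths per request. "
        # The "=" specifier expands to "name=value" in the message.
        f"Got {num_decode_tokens=} and {num_decodes=}."
    )
    return num_decode_tokens // num_decodes


def failure_message(num_decode_tokens: int, num_decodes: int) -> str:
    """Return the assertion message, or an empty string if the check passes."""
    try:
        check_uniform_decode(num_decode_tokens, num_decodes)
    except AssertionError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    # 7 tokens cannot be split evenly across 3 decode requests (made-up values).
    print(failure_message(7, 3))
```

With the made-up inputs above, the message ends in `Got num_decode_tokens=7 and num_decodes=3.`, so the offending values are visible directly in the traceback rather than needing a debugger. Note that the sketch relies on assertions being enabled (that is, Python is not run with `-O`).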