OpenDAS / vllm_cscc

Commit 88fcf00d
Authored Apr 29, 2025 by Huy Do
Committed by GitHub, Apr 29, 2025

Fix some speculative decode tests with tl.dot (#17371)

Signed-off-by: Huy Do <huydhn@gmail.com>

parent d1f569b1

Showing 1 changed file with 3 additions and 6 deletions

tests/spec_decode/e2e/test_multistep_correctness.py (+3 -6)

@@ -456,7 +456,7 @@ def test_spec_decode_e2e_greedy_correctness_real_model_large_bs(
 @pytest.mark.parametrize(
     "common_llm_kwargs",
     [{
-        "block_size": 8,
+        "block_size": 16,
         # 2 for small prompt, 256//8 for generated.
         "num_gpu_blocks_override": 2 + 256 // 8,
         "max_model_len": (2 + 256 // 8) * 8,
...
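
The arithmetic in the unchanged context lines of this hunk can be checked with a short sketch. The variable names below are illustrative and do not appear in the test file; only the numbers are copied from the hunk. The final line assumes that each KV-cache block holds `block_size` tokens, which the diff itself does not state.

```python
# Illustrative names only; the numbers are copied from the parametrize block.
PROMPT_BLOCKS = 2         # "2 for small prompt"
GENERATED_TOKENS = 256    # "256//8 for generated"
DIVISOR = 8               # the divisor the context lines still use

num_gpu_blocks_override = PROMPT_BLOCKS + GENERATED_TOKENS // DIVISOR
max_model_len = num_gpu_blocks_override * DIVISOR

print(num_gpu_blocks_override)  # 34
print(max_model_len)            # 272

# Assumption: one block stores block_size tokens. With the new block_size
# of 16, the same 34 blocks would then cover 544 token slots.
NEW_BLOCK_SIZE = 16
print(num_gpu_blocks_override * NEW_BLOCK_SIZE)  # 544
```

Under that assumption, raising `block_size` to 16 while leaving the other two values untouched keeps the block budget above `max_model_len`, which would explain why the commit changes only one line here.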
@@ -526,11 +526,8 @@ def test_spec_decode_e2e_greedy_correctness_with_preemption(
 @pytest.mark.parametrize(
     "per_test_common_llm_kwargs",
     [
-        # As of this writing, vLLM only compiles with these 3 block sizes by
-        # default.
-        {
-            "block_size": 8,
-        },
+        # https://github.com/triton-lang/triton/issues/2266 tl.dot
+        # doesn't support embedding < 16
         {
             "block_size": 16,
         },
...
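
The constraint cited in the added comment (that `tl.dot` does not support a dimension below 16, per the linked Triton issue) could also be written as a guard when building the parameter list. This is a hypothetical sketch, not what the commit does: `MIN_TL_DOT_DIM` and `supported_block_sizes` are made-up names, not vLLM or Triton APIs, and the candidate list contains only the two block sizes visible in this hunk.

```python
# Hypothetical helper; not part of vLLM or Triton.
MIN_TL_DOT_DIM = 16  # assumption taken from the comment added in this commit


def supported_block_sizes(candidates):
    """Keep only block sizes that meet the assumed tl.dot minimum."""
    return [size for size in candidates if size >= MIN_TL_DOT_DIM]


# Only the block sizes that appear in the hunk are used as candidates.
per_test_common_llm_kwargs = [
    {"block_size": size} for size in supported_block_sizes([8, 16])
]
print(per_test_common_llm_kwargs)  # [{'block_size': 16}]
```

The commit instead deletes the `block_size: 8` entry outright, which is simpler and keeps the parametrization readable in the test file.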