OpenDAS / vllm_cscc
Commit 8a74c68b (Unverified)
Authored Jul 17, 2024 by Cody Yu; committed by GitHub Jul 18, 2024
[Misc] Minor patch for draft model runner (#6523)
Parent: 61e59274
Showing 1 changed file with 6 additions and 2 deletions
vllm/spec_decode/draft_model_runner.py (+6 -2)
@@ -15,8 +15,12 @@ from vllm.worker.model_runner import (ModelInputForGPUWithSamplingMetadata,
 
 logger = init_logger(__name__)
 
+# A flag to enable debug prints for the updated input tensors
+# before each step.
 debug_advance_input = False
-enable_gpu_advance_step = True
+# A flag to allow GPU advance step for draft model runner.
+# Set to False for debugging.
+allow_gpu_advance_step = True
 
 
 class TP1DraftModelRunner(ModelRunner):
...
@@ -196,7 +200,7 @@ class TP1DraftModelRunner(ModelRunner):
         3. No LORA
         4. No prompt_adapter_config
         """
-        if not enable_gpu_advance_step:
+        if not allow_gpu_advance_step:
             return False
 
         # We allow multi-step GPU only in decode mode
...
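The gating pattern touched by this patch can be sketched in isolation. The following is a minimal illustration, not the actual vLLM code: StepRequest and can_use_gpu_advance_step are hypothetical names introduced here, and only the conditions visible in the diff (the module-level flag, decode-only mode, no LoRA, no prompt adapter) are modeled. The docstring's items 1 and 2 are elided in the diff and are not reproduced.

```python
from dataclasses import dataclass

# Module-level switch, mirroring the flag renamed in this commit.
# Set to False to force the fallback path while debugging.
allow_gpu_advance_step = True


@dataclass
class StepRequest:
    """Hypothetical stand-in for the real request/model-input objects."""
    is_decode_only: bool = True
    uses_lora: bool = False
    uses_prompt_adapter: bool = False


def can_use_gpu_advance_step(req: StepRequest) -> bool:
    """Return True only if the GPU advance step may be used (sketch)."""
    # The global flag is checked first so debugging can disable the path.
    if not allow_gpu_advance_step:
        return False
    # Multi-step GPU is allowed only in decode mode.
    if not req.is_decode_only:
        return False
    # No LoRA and no prompt adapter, per the docstring shown in the diff.
    if req.uses_lora or req.uses_prompt_adapter:
        return False
    return True
```

Because the flag is read at call time rather than captured at import, flipping it on the module (for example from a debugger or a test) takes effect on the next call without reloading anything.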