SIYIXNI / vllm
Commit 6f41f0e3
Authored Dec 17, 2023 by Woosuk Kwon; committed by GitHub, Dec 17, 2023
Disable CUDA graph for SqueezeLLM (#2161)
Parent: 2c9b6380
Showing 1 changed file with 4 additions and 3 deletions
vllm/config.py (+4 -3)
@@ -185,10 +185,11 @@ class ModelConfig:
             self.max_context_len_to_capture = self.max_model_len
         self.max_context_len_to_capture = min(self.max_context_len_to_capture,
                                               self.max_model_len)
-        if self.quantization == "gptq" and not self.enforce_eager:
+        if (self.quantization in ["gptq", "squeezellm"]
+                and not self.enforce_eager):
             # Related issue: https://github.com/vllm-project/vllm/issues/2147
-            logger.warning("GPTQ does not support CUDA graph yet. Disabling "
-                           "CUDA graph.")
+            logger.warning(f"{self.quantization} does not support CUDA graph "
+                           "yet. Disabling CUDA graph.")
             self.enforce_eager = True
 
     def verify_with_parallel_config(
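The behavior this change produces can be sketched outside vLLM. The block below is a minimal illustration, not the project's code: `MiniModelConfig` and `NO_CUDA_GRAPH_QUANT` are hypothetical names, and the `is None` default for `max_context_len_to_capture` is an assumption about the lines just above the hunk. It shows only the check from the diff: when the quantization method is "gptq" or "squeezellm" and eager mode was not requested, a warning is logged and `enforce_eager` is set to True.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)

# Quantization methods the commit treats as incompatible with CUDA graphs.
# Hypothetical constant; the diff uses an inline list instead.
NO_CUDA_GRAPH_QUANT = ("gptq", "squeezellm")


@dataclass
class MiniModelConfig:
    """Hypothetical stand-in for ModelConfig; not vLLM's real class."""
    max_model_len: int
    quantization: Optional[str] = None
    enforce_eager: bool = False
    max_context_len_to_capture: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumed default: capture up to the model's maximum length.
        if self.max_context_len_to_capture is None:
            self.max_context_len_to_capture = self.max_model_len
        # Never capture more context than the model supports.
        self.max_context_len_to_capture = min(
            self.max_context_len_to_capture, self.max_model_len)
        # Same condition as the diff: fall back to eager execution.
        if (self.quantization in NO_CUDA_GRAPH_QUANT
                and not self.enforce_eager):
            logger.warning(
                f"{self.quantization} does not support CUDA graph "
                "yet. Disabling CUDA graph.")
            self.enforce_eager = True


cfg = MiniModelConfig(max_model_len=2048, quantization="squeezellm")
other = MiniModelConfig(max_model_len=2048, quantization="awq",
                        max_context_len_to_capture=8192)
```

Here `cfg.enforce_eager` ends up True because of the fallback, while `other` keeps `enforce_eager` False and has its capture length clamped to 2048. Interpolating the method name into the message, as the commit does, lets one warning cover both methods.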