Commit 3a765bd5
Authored Dec 17, 2023 by Woosuk Kwon. Committed by GitHub on Dec 17, 2023.

Temporarily enforce eager mode for GPTQ models (#2154)

Parent: 26c52a5e

Showing 1 changed file with 5 additions and 0 deletions.

vllm/config.py (view file @ 3a765bd5)

...
@@ -185,6 +185,11 @@ class ModelConfig:
             self.max_context_len_to_capture = self.max_model_len
         self.max_context_len_to_capture = min(self.max_context_len_to_capture,
                                               self.max_model_len)
+        if self.quantization == "gptq" and not self.enforce_eager:
+            # Related issue: https://github.com/vllm-project/vllm/issues/2147
+            logger.warning("GPTQ does not support CUDA graph yet. Disabling "
+                           "CUDA graph.")
+            self.enforce_eager = True
 
     def verify_with_parallel_config(
         self,
...
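
The guard added in this commit can be tried in isolation. The following is a minimal sketch, not the actual vLLM class: `MiniModelConfig` is a hypothetical stand-in that carries only the fields the hunk touches, and the `is None` default for `max_context_len_to_capture` is an assumption, since the diff elides the surrounding lines.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("config_sketch")


@dataclass
class MiniModelConfig:
    """Hypothetical stand-in for ModelConfig (not the vLLM class)."""
    max_model_len: int
    quantization: Optional[str] = None
    enforce_eager: bool = False
    max_context_len_to_capture: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumed default: fall back to the model length when unset.
        if self.max_context_len_to_capture is None:
            self.max_context_len_to_capture = self.max_model_len
        # Context lines from the hunk: clamp to the model length.
        self.max_context_len_to_capture = min(self.max_context_len_to_capture,
                                              self.max_model_len)
        # Lines added by the commit: GPTQ forces eager mode, with a warning.
        if self.quantization == "gptq" and not self.enforce_eager:
            logger.warning("GPTQ does not support CUDA graph yet. Disabling "
                           "CUDA graph.")
            self.enforce_eager = True


gptq_cfg = MiniModelConfig(max_model_len=4096, quantization="gptq")
plain_cfg = MiniModelConfig(max_model_len=4096)
```

In this sketch, `gptq_cfg.enforce_eager` ends up `True` even though the caller left it at its default, while `plain_cfg.enforce_eager` stays `False`. Because of the `not self.enforce_eager` condition, the warning is logged only when the setting is actually overridden, so callers who already requested eager mode see no message.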