SIYIXNI / vllm · Commits · 21d5daa4

Commit 21d5daa4 (Unverified)
Authored Dec 18, 2023 by Woosuk Kwon; committed via GitHub on Dec 18, 2023

Add warning on CUDA graph memory usage (#2182)

parent 290e015c
Changes: 1 changed file, with 3 additions and 0 deletions

vllm/worker/model_runner.py (+3, -0)
@@ -395,6 +395,9 @@ class ModelRunner:
 ...
                     "unexpected consequences if the model is not static. To "
                     "run the model in eager mode, set 'enforce_eager=True' or "
                     "use '--enforce-eager' in the CLI.")
+        logger.info("CUDA graphs can take additional 1~3 GiB memory per GPU. "
+                    "If you are running out of memory, consider decreasing "
+                    "`gpu_memory_utilization` or enforcing eager mode.")
         start_time = time.perf_counter()

         # Prepare dummy inputs. These will be reused for all batch sizes.
 ...
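The change itself is a single `logger.info` call whose message is split across three adjacent string literals, which Python concatenates at parse time into one log record. A minimal sketch of the same pattern, using only the standard-library `logging` module rather than vLLM's own logger setup (the logger name below is assumed from the file path, not taken from vLLM's code):

```python
import logging

# Hypothetical stand-in for vLLM's module-level logger; the real project
# configures logging through its own helpers, not basicConfig.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("vllm.worker.model_runner")

# Adjacent string literals are concatenated by the parser, so the three
# source lines added in this commit produce a single warning message.
logger.info(
    "CUDA graphs can take additional 1~3 GiB memory per GPU. "
    "If you are running out of memory, consider decreasing "
    "`gpu_memory_utilization` or enforcing eager mode.")
```

As the surrounding log message in the diff notes, eager mode can be forced with `enforce_eager=True` in the engine arguments or `--enforce-eager` on the command line.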