OpenDAS / vllm_cscc
Commit 82c25151 (unverified)
Authored Oct 18, 2024 by Joe Runde; committed by GitHub on Oct 19, 2024
Parent: 1325872e

[Doc] update gpu-memory-utilization flag docs (#9507)

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

Showing 1 changed file with 5 additions and 1 deletion

--- a/vllm/engine/arg_utils.py
+++ b/vllm/engine/arg_utils.py
@@ -428,7 +428,11 @@ class EngineArgs:
             help='The fraction of GPU memory to be used for the model '
             'executor, which can range from 0 to 1. For example, a value of '
             '0.5 would imply 50%% GPU memory utilization. If unspecified, '
-            'will use the default value of 0.9.')
+            'will use the default value of 0.9. This is a global gpu memory '
+            'utilization limit, for example if 50%% of the gpu memory is '
+            'already used before vLLM starts and --gpu-memory-utilization is '
+            'set to 0.9, then only 40%% of the gpu memory will be allocated '
+            'to the model executor.')
         parser.add_argument(
             '--num-gpu-blocks-override',
             type=int,
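
The arithmetic in the new help text can be checked with a small sketch. This is not vLLM code: the function name `executor_share` and its parameters are hypothetical, and it only models the rule stated in the help string, namely that the flag is a global limit and memory already in use before startup counts against it. Clamping to zero when the limit is already exceeded is a choice made for this sketch; the commit does not say how vLLM handles that case.

```python
def executor_share(gpu_memory_utilization: float = 0.9,
                   used_before_start: float = 0.0) -> float:
    """Fraction of total GPU memory left for the model executor.

    Hypothetical helper, illustrating the help text only.
    Both arguments are fractions of total GPU memory in [0, 1].
    """
    if not 0.0 <= gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be between 0 and 1")
    if not 0.0 <= used_before_start <= 1.0:
        raise ValueError("used_before_start must be between 0 and 1")
    # The flag caps total usage on the device, so memory that other
    # processes already hold is subtracted from the cap.
    return max(0.0, gpu_memory_utilization - used_before_start)


if __name__ == "__main__":
    # The case from the help text: 50% already used, limit set to 0.9.
    share = executor_share(gpu_memory_utilization=0.9, used_before_start=0.5)
    print(f"executor share: {share:.0%}")
```

With the values from the help text (limit 0.9, 50% already in use), the sketch yields roughly 0.4, matching the 40% figure in the commit.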