OpenDAS / vllm_cscc
Commit 616e600e (Unverified)
Authored May 28, 2024 by Marut Pandya; committed by GitHub May 28, 2024

[Misc] add gpu_memory_utilization arg (#5079)

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
Parent: dfba529b

Showing 1 changed file with 8 additions and 1 deletion:
benchmarks/benchmark_latency.py (+8 -1)

--- a/benchmarks/benchmark_latency.py
+++ b/benchmarks/benchmark_latency.py
@@ -35,7 +35,8 @@ def main(args: argparse.Namespace):
               use_v2_block_manager=args.use_v2_block_manager,
               enable_chunked_prefill=args.enable_chunked_prefill,
               download_dir=args.download_dir,
-              block_size=args.block_size)
+              block_size=args.block_size,
+              gpu_memory_utilization=args.gpu_memory_utilization)
 
     sampling_params = SamplingParams(
         n=args.n,
@@ -214,5 +215,11 @@ if __name__ == '__main__':
         type=str,
         default=None,
         help='Path to save the latency results in JSON format.')
+    parser.add_argument('--gpu-memory-utilization',
+                        type=float,
+                        default=0.9,
+                        help='the fraction of GPU memory to be used for '
+                        'the model executor, which can range from 0 to 1.'
+                        'If unspecified, will use the default value of 0.9.')
     args = parser.parse_args()
     main(args)
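
The following is a minimal, standalone sketch of the pattern this commit uses: declare the flag with `argparse`, then forward the parsed value as a keyword argument. It does not import vLLM. `build_parser`, `engine_kwargs`, and the `--block-size` default of 16 are hypothetical names and values chosen for illustration; only `--gpu-memory-utilization`, its `float` type, its `0.9` default, and the help text come from the diff.

```python
import argparse

# Help text copied from the diff. Adjacent string literals are concatenated
# with no separator, so the result reads "...from 0 to 1.If unspecified...".
HELP = ('the fraction of GPU memory to be used for '
        'the model executor, which can range from 0 to 1.'
        'If unspecified, will use the default value of 0.9.')


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the benchmark script's argument parser."""
    parser = argparse.ArgumentParser(description='latency benchmark sketch')
    # Assumed default; the real script's default is not shown in the diff.
    parser.add_argument('--block-size', type=int, default=16)
    parser.add_argument('--gpu-memory-utilization',
                        type=float,
                        default=0.9,
                        help=HELP)
    return parser


def engine_kwargs(args: argparse.Namespace) -> dict:
    """Collect the keyword arguments that would be forwarded to the engine."""
    # argparse turns '--gpu-memory-utilization' into the attribute
    # 'gpu_memory_utilization' (dashes become underscores).
    return {
        'block_size': args.block_size,
        'gpu_memory_utilization': args.gpu_memory_utilization,
    }


if __name__ == '__main__':
    # An explicit argv list keeps the example independent of the real CLI.
    args = build_parser().parse_args(['--gpu-memory-utilization', '0.5'])
    print(engine_kwargs(args))
```

With the flag omitted, `args.gpu_memory_utilization` falls back to `0.9`, so existing invocations of the benchmark keep their previous behaviour. Note that `type=float` alone does not enforce the 0 to 1 range stated in the help text.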