OpenDAS / vllm_cscc
Commit 98a42e70 (Unverified)
Authored Mar 28, 2024 by Yile (Michael) Gu; committed by GitHub, Mar 28, 2024
[Benchmark] Change mii to use persistent deployment and support tensor parallel (#3628)
Parent: 0267fef5
Showing 1 changed file with 5 additions and 3 deletions

benchmarks/benchmark_throughput.py (+5, -3) @ 98a42e70
@@ -183,13 +183,15 @@ def run_mii(
     tensor_parallel_size: int,
     output_len: int,
 ) -> float:
-    from mii import pipeline
-    llm = pipeline(model, tensor_parallel=tensor_parallel_size)
+    from mii import client, serve
+    llm = serve(model, tensor_parallel=tensor_parallel_size)
     prompts = [prompt for prompt, _, _ in requests]
     start = time.perf_counter()
-    llm(prompts, max_new_tokens=output_len)
+    llm.generate(prompts, max_new_tokens=output_len)
     end = time.perf_counter()
+    client = client(model)
+    client.terminate_server()
     return end - start
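For readers who want to try the persistent-deployment timing pattern outside the benchmark script, here is a minimal sketch. It is not the code in this commit. The function name `time_persistent_generation` and the injected `serve` parameter are hypothetical, introduced so the sketch runs without MII installed. It assumes that the object returned by `serve` exposes both `generate` and `terminate_server`, and that `requests` holds 3-tuples as the list comprehension in the diff suggests. The `try`/`finally` is an addition, so that the server is shut down even if generation fails.

```python
import time
from typing import Any, Callable, List, Tuple


def time_persistent_generation(
    serve: Callable[..., Any],  # hypothetical injection point, e.g. mii.serve
    requests: List[Tuple[str, int, int]],  # assumed shape: (prompt, _, _)
    model: str,
    tensor_parallel_size: int,
    output_len: int,
) -> float:
    """Return the wall-clock seconds spent in one batched generate() call."""
    # Start the persistent deployment before the timed region so that
    # server startup is not counted as generation time.
    llm = serve(model, tensor_parallel=tensor_parallel_size)
    prompts = [prompt for prompt, _, _ in requests]
    try:
        start = time.perf_counter()
        llm.generate(prompts, max_new_tokens=output_len)
        end = time.perf_counter()
    finally:
        # Assumption: the handle returned by serve() can stop the server.
        # The commit instead obtains a fresh handle via client(model).
        llm.terminate_server()
    return end - start
```

With MII installed, the call would look like `time_persistent_generation(mii.serve, requests, model, 2, 128)`. Passing `serve` as an argument also lets a stub stand in for the server when checking the timing logic.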