OpenDAS / vllm_cscc
Commit c3af4472 (unverified)
Authored May 20, 2024 by Kuntai Du; committed by GitHub, May 20, 2024
[Doc]Add documentation to benchmarking script when running TGI (#4920)
Parent: 1937e298
Showing 2 changed files with 5 additions and 1 deletion:
benchmarks/benchmark_serving.py (+4, -0)
benchmarks/launch_tgi_server.sh (+1, -1)
benchmarks/benchmark_serving.py @ c3af4472
...
@@ -17,6 +17,10 @@ On the client side, run:
         --dataset-path <path to dataset> \
         --request-rate <request_rate> \ # By default <request_rate> is inf
         --num-prompts <num_prompts> # By default <num_prompts> is 1000
+
+    when using tgi backend, add
+        --endpoint /generate_stream
+    to the end of the command above.
 """
 import argparse
 import asyncio
...
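The docstring change above says to append `--endpoint /generate_stream` when benchmarking against TGI. A minimal sketch of that rule follows, assuming the client is launched as `python benchmarks/benchmark_serving.py` with a `--backend` flag (the start of the command is elided in this hunk, so both are assumptions). The helper name `build_benchmark_command` and the dataset path `data.json` are hypothetical and used for illustration only.

```python
import shlex


def build_benchmark_command(backend, dataset_path, request_rate=None, num_prompts=None):
    """Assemble an argv list for the benchmark client (illustrative helper)."""
    # Assumption: the script is invoked this way and accepts --backend;
    # the hunk only shows the flags from --dataset-path onward.
    cmd = [
        "python", "benchmarks/benchmark_serving.py",
        "--backend", backend,
        "--dataset-path", dataset_path,
    ]
    # Omitting these flags leaves the script defaults (inf and 1000) in effect.
    if request_rate is not None:
        cmd += ["--request-rate", str(request_rate)]
    if num_prompts is not None:
        cmd += ["--num-prompts", str(num_prompts)]
    # Per the added docstring lines: TGI needs the streaming endpoint,
    # appended to the end of the command.
    if backend == "tgi":
        cmd += ["--endpoint", "/generate_stream"]
    return cmd


# Print a shell-quoted command line for a TGI run (hypothetical dataset path).
print(shlex.join(build_benchmark_command("tgi", "data.json", num_prompts=100)))
```

Keeping the endpoint rule in one place avoids forgetting the extra flag when switching backends between runs.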
benchmarks/launch_tgi_server.sh @ c3af4472
...
@@ -4,7 +4,7 @@ PORT=8000
 MODEL=$1
 TOKENS=$2
 
-docker run --gpus all --shm-size 1g -p $PORT:80 \
+docker run -e HF_TOKEN=$HF_TOKEN --gpus all --shm-size 1g -p $PORT:80 \
            -v $PWD/data:/data \
            ghcr.io/huggingface/text-generation-inference:1.4.0 \
            --model-id $MODEL \
...
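The only change to the launch script is forwarding `HF_TOKEN` from the host environment into the container. As a rough sketch of the same idea in Python, the helper below builds the visible part of the `docker run` command and adds `-e HF_TOKEN=...` only when the variable is set. The function name `build_tgi_docker_command` and the model id in the usage line are hypothetical; the flags that follow `--model-id` in the real script are elided in the hunk and are not reproduced here.

```python
import os


def build_tgi_docker_command(model, port=8000, env=None):
    """Build the visible part of the TGI `docker run` command (illustrative helper)."""
    env = os.environ if env is None else env
    cmd = ["docker", "run"]
    token = env.get("HF_TOKEN")
    if token:
        # Mirrors the commit: forward the host's HF_TOKEN into the container.
        cmd += ["-e", f"HF_TOKEN={token}"]
    cmd += [
        "--gpus", "all",
        "--shm-size", "1g",
        "-p", f"{port}:80",
        "-v", f"{os.getcwd()}/data:/data",
        "ghcr.io/huggingface/text-generation-inference:1.4.0",
        "--model-id", model,
        # Remaining arguments of the original script are elided in the diff.
    ]
    return cmd


# Hypothetical model id and token value, for illustration only.
example = build_tgi_docker_command("some-org/some-model", env={"HF_TOKEN": "hf_example"})
```

Skipping the `-e` flag when the variable is unset avoids passing an empty `HF_TOKEN=` into the container, which the shell version would do.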