OpenDAS / vllm_cscc
Commit 70e13224 (Unverified)
Authored Mar 28, 2025 by Woosuk Kwon
Committed by GitHub, Mar 28, 2025
[Minor] Remove TGI launching script (#15646)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
parent 47e9038d
Showing 3 changed files with 0 additions and 22 deletions
benchmarks/benchmark_serving.py                     +0 -3
benchmarks/benchmark_serving_structured_output.py   +0 -3
benchmarks/launch_tgi_server.sh                     +0 -16
benchmarks/benchmark_serving.py
@@ -7,9 +7,6 @@ On the server side, run one of the following commands:
         --swap-space 16 \
         --disable-log-requests
-    (TGI backend)
-    ./launch_tgi_server.sh <your_model> <max_batch_total_tokens>
 On the client side, run:
     python benchmarks/benchmark_serving.py \
         --backend <backend> \
benchmarks/benchmark_serving_structured_output.py
@@ -5,9 +5,6 @@ On the server side, run one of the following commands:
     (vLLM OpenAI API server)
     vllm serve <your_model> --disable-log-requests
-    (TGI backend)
-    ./launch_tgi_server.sh <your_model> <max_batch_total_tokens>
 On the client side, run:
     python benchmarks/benchmark_serving_structured_output.py \
         --backend <backend> \
benchmarks/launch_tgi_server.sh
deleted 100755 → 0 (contents as of parent 47e9038d)
-#!/bin/bash
-PORT=8000
-MODEL=$1
-TOKENS=$2
-docker run -e "HF_TOKEN=$HF_TOKEN" --gpus all --shm-size 1g -p $PORT:80 \
-           -v "$PWD/data:/data" \
-           ghcr.io/huggingface/text-generation-inference:2.2.0 \
-           --model-id "$MODEL" \
-           --sharded false \
-           --max-input-length 1024 \
-           --max-total-tokens 2048 \
-           --max-best-of 5 \
-           --max-concurrent-requests 5000 \
-           --max-batch-total-tokens "$TOKENS"
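For anyone who still needs to start a TGI server for comparison runs after this removal, the deleted benchmarks/launch_tgi_server.sh can be approximated in Python. This is a minimal sketch, not part of the commit: the function name build_tgi_command and its parameters are hypothetical, and the image tag and flag values are copied unchanged from the deleted script rather than checked against current TGI releases.

```python
import os

# Image tag taken verbatim from the deleted launch_tgi_server.sh.
TGI_IMAGE = "ghcr.io/huggingface/text-generation-inference:2.2.0"


def build_tgi_command(model, max_batch_total_tokens, port=8000):
    """Build the docker argv that the deleted script would have run.

    `model` and `max_batch_total_tokens` correspond to the script's
    positional arguments $1 and $2; `port` corresponds to PORT=8000.
    (Hypothetical helper: the name and signature are illustrative.)
    """
    return [
        "docker", "run",
        # Forward the Hugging Face token from the calling environment.
        "-e", f"HF_TOKEN={os.environ.get('HF_TOKEN', '')}",
        "--gpus", "all",
        "--shm-size", "1g",
        # Host port -> container port 80, as in the script.
        "-p", f"{port}:80",
        # Mount ./data from the current directory, as "$PWD/data:/data" did.
        "-v", f"{os.getcwd()}/data:/data",
        TGI_IMAGE,
        "--model-id", str(model),
        "--sharded", "false",
        "--max-input-length", "1024",
        "--max-total-tokens", "2048",
        "--max-best-of", "5",
        "--max-concurrent-requests", "5000",
        "--max-batch-total-tokens", str(max_batch_total_tokens),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True). Building an argv list instead of a shell string avoids quoting problems with model names, which the original script handled by quoting "$MODEL".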