Commit 5156d5a4 (unverified)
Authored Apr 20, 2025 by Baizhou Zhang; committed by GitHub on Apr 20, 2025

Add test config yamls for Deepseek v3 (#5433)

Parent: c951d312

Showing 3 changed files with 57 additions and 0 deletions:
test/srt/configs/deepseek_v3.yaml (+28 -0)
test/srt/configs/deepseek_v3_long_context.yaml (+28 -0)
test/srt/parse_results.py (+1 -0)

test/srt/configs/deepseek_v3.yaml (new file, 0 → 100644) @ 5156d5a4

tasks:
  - name: sglang-8192-1024-concurrency1
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 1 --num-prompts 5 --output-file deepseek_v3_results.jsonl

  - name: sglang-8192-1024-concurrency2
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 2 --num-prompts 10 --output-file deepseek_v3_results.jsonl

  - name: sglang-8192-1024-concurrency4
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 4 --num-prompts 20 --output-file deepseek_v3_results.jsonl

  - name: sglang-8192-1024-concurrency8
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 8 --num-prompts 32 --output-file deepseek_v3_results.jsonl

  - name: sglang-8192-1024-concurrency16
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 16 --num-prompts 48 --output-file deepseek_v3_results.jsonl

  - name: sglang-8192-1024-concurrency24
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 24 --num-prompts 72 --output-file deepseek_v3_results.jsonl

  - name: sglang-8192-1024-concurrency32
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 32 --num-prompts 96 --output-file deepseek_v3_results.jsonl

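The seven tasks in this file differ only in `--max-concurrency` and `--num-prompts`, so the sweep can be pictured as a small table of pairs expanded into commands. The following is a minimal sketch of that structure, not part of the commit: `build_tasks`, `SWEEP`, and `SERVER_CMD` are hypothetical names introduced here for illustration, and the command strings are copied from the config.

```python
# Hypothetical helper (not in the commit): rebuilds the task list from the
# (max_concurrency, num_prompts) pairs that appear in the config.
SERVER_CMD = (
    "python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 "
    "--tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768"
)

# (max_concurrency, num_prompts) pairs copied from the YAML tasks.
SWEEP = [(1, 5), (2, 10), (4, 20), (8, 32), (16, 48), (24, 72), (32, 96)]


def build_tasks(input_len, output_len, output_file):
    """Return a list of task dicts with the same keys as the YAML entries."""
    tasks = []
    for concurrency, num_prompts in SWEEP:
        client_cmd = (
            "python3 -m sglang.bench_serving --dataset-name random "
            f"--random-range-ratio 1 --random-input-len {input_len} "
            f"--random-output-len {output_len} --max-concurrency {concurrency} "
            f"--num-prompts {num_prompts} --output-file {output_file}"
        )
        tasks.append(
            {
                "name": f"sglang-{input_len}-{output_len}-concurrency{concurrency}",
                "server_cmd": SERVER_CMD,
                "client_cmd": client_cmd,
            }
        )
    return tasks


if __name__ == "__main__":
    for task in build_tasks(8192, 1024, "deepseek_v3_results.jsonl"):
        print(task["name"])
```

Calling `build_tasks(32000, 100, "deepseek_v3_long_context_results.jsonl")` would yield the task names and client commands of the long-context config in the same way.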
test/srt/configs/deepseek_v3_long_context.yaml (new file, 0 → 100644) @ 5156d5a4

tasks:
  - name: sglang-32000-100-concurrency1
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 32000 --random-output-len 100 --max-concurrency 1 --num-prompts 5 --output-file deepseek_v3_long_context_results.jsonl

  - name: sglang-32000-100-concurrency2
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 32000 --random-output-len 100 --max-concurrency 2 --num-prompts 10 --output-file deepseek_v3_long_context_results.jsonl

  - name: sglang-32000-100-concurrency4
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 32000 --random-output-len 100 --max-concurrency 4 --num-prompts 20 --output-file deepseek_v3_long_context_results.jsonl

  - name: sglang-32000-100-concurrency8
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 32000 --random-output-len 100 --max-concurrency 8 --num-prompts 32 --output-file deepseek_v3_long_context_results.jsonl

  - name: sglang-32000-100-concurrency16
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 32000 --random-output-len 100 --max-concurrency 16 --num-prompts 48 --output-file deepseek_v3_long_context_results.jsonl

  - name: sglang-32000-100-concurrency24
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 32000 --random-output-len 100 --max-concurrency 24 --num-prompts 72 --output-file deepseek_v3_long_context_results.jsonl

  - name: sglang-32000-100-concurrency32
    server_cmd: python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3-0324 --tp 8 --trust-remote-code --disable-radix-cache --max-prefill-tokens 32768
    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 32000 --random-output-len 100 --max-concurrency 32 --num-prompts 96 --output-file deepseek_v3_long_context_results.jsonl

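Each `client_cmd` packs its workload parameters into one long command line, which makes the two configs hard to compare by eye. As a rough sketch (not part of the commit), the flags can be pulled out with the standard library; `parse_flags` is a hypothetical helper, and it assumes every option is written as `--flag value` or as a bare `--flag`, as in the commands above.

```python
import shlex


def parse_flags(cmd):
    """Return {flag: value} for every --flag in cmd (True for bare flags)."""
    tokens = shlex.split(cmd)
    flags = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("--"):
            # A following token that is not itself an option is this flag's value.
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
            flags[tok[2:]] = tokens[i + 1] if has_value else True
            i += 2 if has_value else 1
        else:
            # Skips "python3", "-m", and the module name.
            i += 1
    return flags


# Client command of the first long-context task, copied from the config.
LONG_CONTEXT_CLIENT_CMD = (
    "python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 "
    "--random-input-len 32000 --random-output-len 100 --max-concurrency 1 "
    "--num-prompts 5 --output-file deepseek_v3_long_context_results.jsonl"
)

if __name__ == "__main__":
    flags = parse_flags(LONG_CONTEXT_CLIENT_CMD)
    tokens_per_request = int(flags["random-input-len"]) + int(flags["random-output-len"])
    print(flags["max-concurrency"], tokens_per_request)
```

The same helper applied to a `server_cmd` maps `--trust-remote-code` and `--disable-radix-cache` to `True`, since neither takes a value.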
test/srt/parse_results.py @ 5156d5a4

...
@@ -16,6 +16,7 @@ output_file = f"{base_name}_summary.csv"
fields = [
    "max_concurrency",
    "input_throughput",
    "output_throughput",
    "mean_ttft_ms",
    "median_ttft_ms",
...
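
The hunk shows only part of `parse_results.py`: a `fields` list and an `output_file` named `{base_name}_summary.csv`. A plausible reading is that the script reduces the `.jsonl` files written by the benchmark client to a CSV with one row per run. The sketch below illustrates that idea only; it is not the script's actual code. `summarize` is a hypothetical function, it assumes each JSONL line is one JSON object containing the listed keys, and `FIELDS` holds only the entries visible in the excerpt (the real list continues past it).

```python
import csv
import json

# Only the fields visible in the hunk; the real list continues beyond the excerpt.
FIELDS = [
    "max_concurrency",
    "input_throughput",
    "output_throughput",
    "mean_ttft_ms",
    "median_ttft_ms",
]


def summarize(jsonl_path, csv_path, fields=FIELDS):
    """Copy the selected fields of every JSONL record into a CSV file."""
    rows = []
    with open(jsonl_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            # Missing keys become empty cells rather than raising KeyError.
            rows.append({key: record.get(key, "") for key in fields})
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Under these assumptions, `summarize("deepseek_v3_results.jsonl", "deepseek_v3_results_summary.csv")` would produce one CSV row per concurrency level in the sweep.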