Commit ac5b78ba
Authored Apr 14, 2025 by Yineng Zhang
Committed by GitHub, Apr 14, 2025
fix: update test config (#5392)
Parent: 38076dea
Showing 1 changed file with 7 additions and 7 deletions
test/srt/configs/llama_405b.yaml (+7 -7)
 tasks:
   - name: sglang-8192-1024-concurrency1
     server_cmd: python3 -m sglang.launch_server --model nvidia/Llama-3.1-405B-Instruct-FP8 --tp 8
-    client_cmd: python3 -m sglang.bench_serving --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 1 --num-prompts 5 --output-file llama_405b_results.jsonl
+    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 1 --num-prompts 5 --output-file llama_405b_results.jsonl
   - name: sglang-8192-1024-concurrency2
     server_cmd: python3 -m sglang.launch_server --model nvidia/Llama-3.1-405B-Instruct-FP8 --tp 8
-    client_cmd: python3 -m sglang.bench_serving --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 2 --num-prompts 10 --output-file llama_405b_results.jsonl
+    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 2 --num-prompts 10 --output-file llama_405b_results.jsonl
   - name: sglang-8192-1024-concurrency4
     server_cmd: python3 -m sglang.launch_server --model nvidia/Llama-3.1-405B-Instruct-FP8 --tp 8
-    client_cmd: python3 -m sglang.bench_serving --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 4 --num-prompts 20 --output-file llama_405b_results.jsonl
+    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 4 --num-prompts 20 --output-file llama_405b_results.jsonl
   - name: sglang-8192-1024-concurrency8
     server_cmd: python3 -m sglang.launch_server --model nvidia/Llama-3.1-405B-Instruct-FP8 --tp 8
-    client_cmd: python3 -m sglang.bench_serving --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 8 --num-prompts 32 --output-file llama_405b_results.jsonl
+    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 8 --num-prompts 32 --output-file llama_405b_results.jsonl
   - name: sglang-8192-1024-concurrency16
     server_cmd: python3 -m sglang.launch_server --model nvidia/Llama-3.1-405B-Instruct-FP8 --tp 8
-    client_cmd: python3 -m sglang.bench_serving --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 16 --num-prompts 48 --output-file llama_405b_results.jsonl
+    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 16 --num-prompts 48 --output-file llama_405b_results.jsonl
   - name: sglang-8192-1024-concurrency24
     server_cmd: python3 -m sglang.launch_server --model nvidia/Llama-3.1-405B-Instruct-FP8 --tp 8
-    client_cmd: python3 -m sglang.bench_serving --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 24 --num-prompts 72 --output-file llama_405b_results.jsonl
+    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 24 --num-prompts 72 --output-file llama_405b_results.jsonl
   - name: sglang-8192-1024-concurrency32
     server_cmd: python3 -m sglang.launch_server --model nvidia/Llama-3.1-405B-Instruct-FP8 --tp 8
-    client_cmd: python3 -m sglang.bench_serving --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 32 --num-prompts 96 --output-file llama_405b_results.jsonl
+    client_cmd: python3 -m sglang.bench_serving --dataset-name random --random-range-ratio 1 --random-input-len 8192 --random-output-len 1024 --max-concurrency 32 --num-prompts 96 --output-file llama_405b_results.jsonl
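The seven tasks in this config differ only in `--max-concurrency` and `--num-prompts`, and this commit adds `--dataset-name random` to each client command. As a rough illustration of that pattern, the sketch below regenerates the updated task entries from a table of (concurrency, prompts) pairs. The names `build_task`, `build_tasks`, and `SWEEP` are hypothetical and are not part of the sglang repository; the command strings and values are copied from the config above.

```python
# Hypothetical helper, not part of sglang: rebuilds the task entries
# of test/srt/configs/llama_405b.yaml as they appear after this commit.

MODEL = "nvidia/Llama-3.1-405B-Instruct-FP8"
OUTPUT_FILE = "llama_405b_results.jsonl"

# (max_concurrency, num_prompts) pairs copied from the config.
SWEEP = [(1, 5), (2, 10), (4, 20), (8, 32), (16, 48), (24, 72), (32, 96)]


def build_task(concurrency: int, num_prompts: int) -> dict:
    """Return one task entry with the same three keys the YAML uses."""
    server_cmd = f"python3 -m sglang.launch_server --model {MODEL} --tp 8"
    client_cmd = (
        "python3 -m sglang.bench_serving "
        "--dataset-name random "  # the flag added by this commit
        "--random-range-ratio 1 --random-input-len 8192 "
        "--random-output-len 1024 "
        f"--max-concurrency {concurrency} --num-prompts {num_prompts} "
        f"--output-file {OUTPUT_FILE}"
    )
    return {
        "name": f"sglang-8192-1024-concurrency{concurrency}",
        "server_cmd": server_cmd,
        "client_cmd": client_cmd,
    }


def build_tasks() -> list:
    """Return all seven task entries in the order they appear in the file."""
    return [build_task(c, n) for c, n in SWEEP]


if __name__ == "__main__":
    for task in build_tasks():
        print(task["name"])
        print("  " + task["client_cmd"])
```

Generating the entries from one table would keep a flag change like this one to a single line, at the cost of an extra step before the YAML can be read directly. Whether that trade-off suits this test suite is a judgment call for the maintainers.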