jerrrrry / vllm_test_tools

Commit 36816c8e, authored Jun 17, 2025 by jerrrrry

Update README.md

Parent: 6cd6b13d
Showing 1 changed file with 58 additions and 0 deletions.

README.md
# 0.8.5
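1. Offline inference: customize the parameters as needed

   `benchmark_throughput_0.8.5.py`

The script below avoids reloading the model for each new parameter combination.
`batch`, `prompt_tokens`, and `completion_tokens` can each be passed as a space-separated string.
All other parameters are the same as in the standard script.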
<pre>
export HIP_VISIBLE_DEVICES=1
tp=1
model_path=/llm-models/qwen1.5/Qwen1.5-0.5B-Chat
batch="1 2"
prompt_tokens="16 64"
completion_tokens="128 256"
python benchmark_throughput_0.8.5.py --model ${model_path} --tensor-parallel-size ${tp} --num-prompts ${batch} --input-len ${prompt_tokens} --output-len ${completion_tokens} \
    --dtype float16 --trust-remote-code --max-model-len 32768 --output-json ./test_0.5B-0.7.2.txt
</pre>
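With the arguments above, the following scenarios are run:

| bs | input | output |
|----|-------|--------|
| 1  | 16    | 128    |
| 1  | 64    | 256    |
| 2  | 16    | 128    |
| 2  | 64    | 256    |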
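
The scenario list above suggests that each batch size is combined with every (input, output) pair, while input and output lengths are paired position by position rather than fully crossed. The following is a minimal sketch of that expansion, not the script's actual code: the function names `parse_ints` and `enumerate_scenarios` are hypothetical, and the length check is an assumption about how mismatched lists should be handled.

```python
from itertools import product


def parse_ints(spec: str) -> list[int]:
    """Split a space-separated string such as "16 64" into [16, 64]."""
    return [int(tok) for tok in spec.split()]


def enumerate_scenarios(batch: str, prompt_tokens: str, completion_tokens: str):
    """Return (bs, input_len, output_len) tuples in the order shown in the README.

    Hypothetical helper: input and output lengths are zipped pairwise,
    then every batch size is combined with every pair.
    """
    batches = parse_ints(batch)
    inputs = parse_ints(prompt_tokens)
    outputs = parse_ints(completion_tokens)
    if len(inputs) != len(outputs):
        # Assumption: pairwise matching only makes sense for equal-length lists.
        raise ValueError("prompt_tokens and completion_tokens must have the same length")
    pairs = list(zip(inputs, outputs))
    return [(bs, n_in, n_out) for bs, (n_in, n_out) in product(batches, pairs)]


for bs, n_in, n_out in enumerate_scenarios("1 2", "16 64", "128 256"):
    print(bs, n_in, n_out)
```

With the README's values this prints the four rows listed above, in the same order.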
The results are collected in the file given by `--output-json` (`./test_0.5B-0.7.2.txt` here). Example:
<pre>
bs_in_out,elapsed_time,Throughput,total_tokens,output_tokens,ttft_mean,ttft_median,ttft_p99,tpop_mean,tpop_median,tpop_p99,output_token_throughput_mean,output_token_throughput_median,output_token_throughput_p99,inout_token_throughput_mean,inout_token_throughput_median,inout_token_throughput_p99
1_16_128,3.49,0.29,41.26,36.68,0.03801,0.03801,0.03801,0.0269,0.02691,0.02691,37.04,37.04,37.04,41.66,41.66,41.66
1_64_256,7.14,0.14,44.82,35.85,0.0291,0.0291,0.0291,0.0278,0.02776,0.02776,36.01,36.01,36.01,45.01,45.01,45.01
2_16_128,3.62,0.55,79.56,70.72,0.04829,0.04829,0.04893,0.028,0.02801,0.02801,35.51,35.51,35.51,39.94,39.94,39.95
2_64_256,7.31,0.27,87.55,70.04,0.04697,0.04697,0.04764,0.0284,0.02836,0.02836,35.17,35.17,35.18,43.97,43.97,43.97
</pre>
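
Despite the `--output-json` flag name, the example is comma-separated text, so it can be read with Python's `csv` module. The sketch below parses the `bs_in_out` key and recomputes three of the columns from `elapsed_time`. The interpretation is inferred from the sample rows only, not from the script: `Throughput` appears to be requests per second, and `total_tokens` and `output_tokens` appear to be tokens per second. The names `load_rows` and `recompute` are hypothetical, and the sample is cut to the first five columns for brevity.

```python
import csv
import io

# First five columns of two rows from the README example.
SAMPLE = """bs_in_out,elapsed_time,Throughput,total_tokens,output_tokens
1_16_128,3.49,0.29,41.26,36.68
2_64_256,7.31,0.27,87.55,70.04
"""


def load_rows(text: str) -> list[dict]:
    """Parse the result text and split bs_in_out into three integers."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        bs, n_in, n_out = (int(x) for x in rec["bs_in_out"].split("_"))
        rows.append({
            "bs": bs,
            "input_len": n_in,
            "output_len": n_out,
            "elapsed_time": float(rec["elapsed_time"]),
            "Throughput": float(rec["Throughput"]),
            "total_tokens": float(rec["total_tokens"]),
            "output_tokens": float(rec["output_tokens"]),
        })
    return rows


def recompute(row: dict) -> dict:
    """Recompute the rate columns under the inferred (unconfirmed) definitions."""
    t = row["elapsed_time"]
    return {
        "Throughput": row["bs"] / t,  # requests per second
        "total_tokens": row["bs"] * (row["input_len"] + row["output_len"]) / t,
        "output_tokens": row["bs"] * row["output_len"] / t,
    }


for row in load_rows(SAMPLE):
    print(row["bs"], row["input_len"], row["output_len"], recompute(row))
```

If the recomputed values drift from the reported ones by more than rounding error, the inferred definitions do not hold for that run and the script itself should be consulted.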
2. Server inference

First run `bash server.sh` and wait for the service to come up, then run `bash test.sh`. Adjust the test parameters as needed.
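
Waiting for the service by hand can be replaced with a small readiness check. The sketch below polls a TCP port until it accepts connections. It is an illustration only: the host, port, and timeout values are assumptions, since the contents of `server.sh` are not shown here, and an open port does not by itself prove that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 600.0,
                  interval_s: float = 2.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)
    return False
```

A typical use would be to start `bash server.sh` in the background, call `wait_for_port("127.0.0.1", port)` with the port that `server.sh` configures, and run `bash test.sh` only if it returns `True`.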
# 0.7.2
1. Offline inference

   `benchmark_throughput_0.7.2.py`
...
...