Update README.md

be3814d9 · jerrrrry · 3f591513 · be3814d9
Commit be3814d9 authored May 07, 2025 by jerrrrry

Showing with 4 additions and 1 deletion

README.md +4 -1

--- a/README.md
+++ b/README.md
+# Environment Setup
+1. Pull the image, create the container, and install the base dependency packages
 vllm test
 0.7.2
 Offline inference
@@ -5,7 +7,7 @@ benchmark_throughput_0.7.2.py
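 The following script avoids reloading the model repeatedly when running inference with different parameters
 batch, prompt_tokens, and completion_tokens can each be passed as a space-separated string
 Other parameters are the same as in the standard script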
-bash
+<pre>
 export HIP_VISIBLE_DEVICES=1
 tp=1
 model_path=/llm-models/qwen1.5/Qwen1.5-0.5B-Chat
@@ -15,6 +17,7 @@ prompt_tokens="16 64"
 completion_tokens="128 256"
 python benchmark_throughput_0.7.2.py --model ${model_path} --tensor-parallel-size ${tp} --num-prompts ${batch} --input-len ${prompt_tokens} --output-len ${completion_tokens} \
    --dtype float16  --trust-remote-code --max-model-len 32768 --output-json ./test_0.5B-0.7.2.txt
+</pre>

 With the arguments passed as above, the following scenarios are computed:
 bs    input    output