Update README.md

Commit 96767828 authored Jan 02, 2025 by laibao

Showing 1 changed file with 3 additions and 3 deletions

README.md +3 -3

--- a/README.md
+++ b/README.md
@@ -100,7 +100,7 @@ python examples/offline_inference.py
 1. Specify the input and output lengths

 ```bash
-python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --output-len 128 --model mixtral/Mixtral-8x7B-Instruct-v0.1 -tp 1 --trust-remote-code --enforce-eager --dtype float16
+python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --output-len 128 --model mixtral/Mixtral-8x7B-Instruct-v0.1 -tp 4 --trust-remote-code --enforce-eager --dtype float16
 ```

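 Here `--num-prompts` is the batch size, `--input-len` is the input sequence length, `--output-len` is the number of output tokens, `--model` is the model path, `-tp` is the number of cards used, and `--dtype float16` sets the inference data type; if the model weights are bfloat16, they must be switched to float16 for inference. Setting `--output-len 1` measures first-token latency. `-q gptq` runs inference with a GPTQ-quantized model.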
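
The flag descriptions above can be sketched as a small helper that assembles the benchmark command. This is only an illustration: the function name `build_throughput_cmd` and its defaults are assumptions made here, not part of the repository, and it uses only the flags that appear in the commands in this diff.

```python
import shlex


def build_throughput_cmd(model, num_prompts=1, tp=4, input_len=None,
                         output_len=None, dataset=None, dtype="float16"):
    """Assemble the benchmark_throughput.py command as an argument list.

    Hypothetical helper: either pass a dataset file, or pass both
    input_len and output_len for fixed-length prompts.
    """
    cmd = ["python", "benchmarks/benchmark_throughput.py",
           "--num-prompts", str(num_prompts), "--model", model]
    if dataset is not None:
        # Dataset mode: prompt lengths come from the dataset itself.
        cmd += ["--dataset", dataset]
    elif input_len is not None and output_len is not None:
        # Fixed-length mode: output_len=1 measures first-token latency.
        cmd += ["--input-len", str(input_len), "--output-len", str(output_len)]
    else:
        raise ValueError("pass dataset, or both input_len and output_len")
    # -tp is the number of cards used; dtype is the inference data type.
    cmd += ["-tp", str(tp), "--trust-remote-code", "--enforce-eager",
            "--dtype", dtype]
    return cmd


if __name__ == "__main__":
    cmd = build_throughput_cmd("mixtral/Mixtral-8x7B-Instruct-v0.1",
                               input_len=32, output_len=1)
    # Print a copy-pasteable shell command instead of running it.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` later, while `shlex.join` gives a readable form for logs.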
@@ -113,7 +113,7 @@ wget http://113.200.138.88:18080/aidatasets/vllm_data/-/raw/main/ShareGPT_V3_unf
 ```

 ```bash
-python benchmarks/benchmark_throughput.py --num-prompts 1 --model mixtral/Mixtral-8x7B-Instruct-v0.1 --dataset ShareGPT_V3_unfiltered_cleaned_split.json -tp 1 --trust-remote-code --enforce-eager --dtype float16
+python benchmarks/benchmark_throughput.py --num-prompts 1 --model mixtral/Mixtral-8x7B-Instruct-v0.1 --dataset ShareGPT_V3_unfiltered_cleaned_split.json -tp 4 --trust-remote-code --enforce-eager --dtype float16
 ```

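 Here `--num-prompts` is the batch size, `--model` is the model path, `--dataset` is the dataset to use, `-tp` is the number of cards used, and `--dtype float16` sets the inference data type; if the model weights are bfloat16, they must be switched to float16 for inference. `-q gptq` runs inference with a GPTQ-quantized model.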
@@ -123,7 +123,7 @@ python benchmarks/benchmark_throughput.py --num-prompts 1 --model mixtral/Mixtra
 1. Start the server:

 ```bash
-python -m vllm.entrypoints.openai.api_server  --model mixtral/Mixtral-8x7B-Instruct-v0.1  --dtype float16 --enforce-eager -tp 1 
+python -m vllm.entrypoints.openai.api_server  --model mixtral/Mixtral-8x7B-Instruct-v0.1  --dtype float16 --enforce-eager -tp 4 
 ```

 2. Start the client: