Commit c0bc033b authored by zhuwenwen

add env

parent b4fa15d2
......@@ -95,6 +95,7 @@ cd dist && pip install vllm*
| [Qwen2-72B](https://huggingface.co/Qwen/Qwen2-72B) | [Qwen2-72B-Instruct](https://huggingface.co/Qwen/Qwen2-72B-Instruct) | [Qwen2-72B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2-72B-Instruct-GPTQ-Int4) |
### Offline batch inference
```bash
python vllm/examples/offline_inference.py
......@@ -104,6 +105,7 @@ python vllm/examples/offline_inference.py
### Offline batch inference performance test
`Tips: when testing qwen1.5-7b/qwen1.5-72b/qwen1.5-72b, set the environment variable LLAMA_NN=1`
1. Specify the input and output lengths
```bash
python vllm/benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --output-len 128 --model Qwen/Qwen1.5-7B-Chat -tp 1 --trust-remote-code --enforce-eager --dtype float16
......