Commit 6cedbed7 authored by laibao

Update README.md

parent 96767828
......@@ -139,7 +139,7 @@ python benchmarks/benchmark_serving.py --model mixtral/Mixtral-8x7B-Instruct-v0.
Start the service:
```bash
-vllm serve mixtral/Mixtral-8x7B-Instruct-v0.1 --enforce-eager --dtype float16 --trust-remote-code --port 8000
+vllm serve mixtral/Mixtral-8x7B-Instruct-v0.1 --enforce-eager --dtype float16 --trust-remote-code --port 8000 -tp4
```
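Here, the argument after `serve` is the path of the model to load, and `--dtype` sets the data type (float16). By default the chat template predefined in the tokenizer is used; `--chat-template` supplies a new template that overrides the default. `-q gptq` runs inference with a GPTQ-quantized model.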
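The optional flags described above can be combined as in the following minimal sketch. It only assembles and prints a `vllm serve` command so the result can be inspected before running it. The helper name `build_serve_cmd`, the model path `/path/to/model`, and the template file `./template.jinja` are hypothetical placeholders, not values taken from this commit.

```bash
#!/usr/bin/env bash
# Sketch: assemble a `vllm serve` command from the flags described above.
# build_serve_cmd and all argument values are illustrative assumptions.

build_serve_cmd() {
  local model="$1"      # model path or name passed right after `serve`
  local template="$2"   # optional chat template file; empty keeps the tokenizer default
  local quant="$3"      # optional quantization method such as gptq; empty omits -q

  local cmd=(vllm serve "$model" --dtype float16 --port 8000)
  if [ -n "$template" ]; then
    cmd+=(--chat-template "$template")
  fi
  if [ -n "$quant" ]; then
    cmd+=(-q "$quant")
  fi
  # Print the command instead of executing it.
  printf '%s\n' "${cmd[*]}"
}

# Example call with placeholder values.
build_serve_cmd "/path/to/model" "./template.jinja" "gptq"
```

Passing an empty string for the second or third argument leaves the corresponding flag out, so the tokenizer's default chat template and the unquantized weights are used.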
......