Commit 78b09731 authored by chenzk's avatar chenzk

v1.0.4

parent 670bcfcb
@@ -125,11 +125,11 @@ llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
# Method 2: vLLM inference
# First install the new vLLM wheel
-pip install whl/vllm-0.6.2+das.opt1.85def94.dtk24042-cp310-cp310-linux_x86_64.whl
+pip install whl/vllm-0.6.2+das.opt1.ac9aae1.dtk24042-cp310-cp310-linux_x86_64.whl
pip install whl/flash_attn-2.6.1+das.opt2.08f8827.dtk24042-cp310-cp310-linux_x86_64.whl
export LM_NN=0
# Inference
python infer_vllm.py # A vLLM build with better performance optimizations can be downloaded later from the 光合开发者社区 (Guanghe developer community).
# If vLLM cannot be invoked successfully, run this in the terminal: export LM_NN=0
```
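The diff does not show the contents of `infer_vllm.py`, but the surrounding commands suggest its shape: set `LM_NN=0` in the environment, then run offline generation through vLLM's `LLM`/`SamplingParams` API. Below is a minimal, hedged sketch of such a script; the model path and sampling settings are placeholders, not values from this repository, and the meaning of `LM_NN` is assumed from the README's workaround note.

```python
import os

# The README advises `export LM_NN=0` when vLLM fails to start on this
# DTK build; set it before importing vllm so the extension picks it up.
# (What LM_NN controls internally is an assumption, not documented here.)
os.environ.setdefault("LM_NN", "0")

try:
    from vllm import LLM, SamplingParams

    # Hypothetical merged-LoRA model path; substitute your own checkpoint.
    llm = LLM(model="saves/llama3-8b-merged")
    params = SamplingParams(temperature=0.7, max_tokens=128)

    # Batch offline generation; each output carries the generated text.
    for out in llm.generate(["Hello, who are you?"], params):
        print(out.outputs[0].text)
except ImportError:
    # The DTK vLLM wheel from the pip commands above is not installed.
    print("vllm is not installed; install the wheel first")
```

The `ImportError` guard keeps the script from crashing on machines where the DTK-specific wheel has not been installed yet; in normal use the `try` body is what runs.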
## result