Commit 59edee96 authored by weishb

Add transformers inference instructions

parent 997d1621
@@ -109,6 +109,8 @@ response = tokenizer.decode(
print(response)
```
**To run inference on Spark-Scilit-X1-13B with transformers, you also need to edit the model's config.json file, changing `"_attn_implementation": "flash_attention_2"` to `"_attn_implementation": "eager"`.**
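
The config.json change described above can also be scripted instead of edited by hand. The following is a minimal sketch, not part of the original instructions: the helper name `set_eager_attention` is hypothetical, and it assumes `_attn_implementation` is a top-level key in config.json.

```python
import json
from pathlib import Path


def set_eager_attention(config_path):
    """Set `_attn_implementation` to "eager" in a model config.json.

    Hypothetical helper; assumes the key lives at the top level of the file.
    Returns the previous value (or None if the key was absent).
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Remember the old value so the caller can confirm what was changed.
    previous = config.get("_attn_implementation")
    config["_attn_implementation"] = "eager"

    # Write the file back, leaving all other keys untouched.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return previous
```

Call it with the path to your local copy of the model's config.json, for example `set_eager_attention("path/to/Spark-Scilit-X1-13B/config.json")`, where the path is a placeholder. Consider backing up the file first, since the flash_attention_2 setting may still be wanted for other inference paths.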
### vLLM
#### Single-node inference
```bash
......