Commit 7a751c0d authored by luopl

add K100 AI

parent d6e82889
# Qwen3.5_vllm
# Qwen3.5
## Paper
[Qwen3.5](https://qwen.ai/blog?id=qwen3.5)
......@@ -58,6 +58,30 @@ pip install numpy==1.25.0
## Inference
### vllm
#### Single-node inference
**Note**: When starting the service on a `K100 AI` cluster, add the `--disable-custom-all-reduce` flag.
```bash
## Start the server
vllm serve Qwen/Qwen3.5-35B-A3B \
--port 8001 \
--tensor-parallel-size 16 \
--max-model-len 262144 \
--reasoning-parser qwen3
## Query the server from a client
curl http://localhost:8001/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen3.5-35B-A3B",
"messages": [
{"role": "user", "content": "Type \"I love Qwen3.5\" backwards"}
],
"temperature": 0.6
}'
```
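
The `K100 AI` note above can be sketched as a small shell helper that appends `--disable-custom-all-reduce` only when needed. This is a minimal sketch, not part of the original README: the function name `serve_cmd` and the `ON_K100_AI` variable are hypothetical names introduced for illustration, and the helper only prints the command so it can be reviewed before anything is launched.

```bash
# Hypothetical helper (not from the README): print the `vllm serve` command line,
# adding --disable-custom-all-reduce when ON_K100_AI=1 (an assumed variable name).
serve_cmd() {
  # Base arguments copied from the single-node example.
  set -- vllm serve Qwen/Qwen3.5-35B-A3B \
    --port 8001 \
    --tensor-parallel-size 16 \
    --max-model-len 262144 \
    --reasoning-parser qwen3
  # Append the extra flag only for K100 AI clusters.
  if [ "${ON_K100_AI:-0}" = "1" ]; then
    set -- "$@" --disable-custom-all-reduce
  fi
  echo "$@"
}
```

Running `ON_K100_AI=1 serve_cmd` prints the full command with the extra flag; on other hardware, calling `serve_cmd` without the variable prints the command unchanged.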
#### Multi-node inference
1. Set the environment variables
> Note:
......