Commit bfb39f34 authored by raojy

Update README.md

parent 09c6cf7c
......@@ -182,7 +182,7 @@ curl http://localhost:8001/v1/chat/completions \
### SGLang
#### Single-node inference
##### BF16
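-1. Launch via `sglang serve`, using `Qwen/Qwen3.6-27B` as an example (this command applies to chips other than K100AI)
+1. Launch via `sglang serve`, using `Qwen3.5-35B-A3B` as an example (this command applies to chips other than K100AI)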
```bash
export SGLANG_ENABLE_SPEC_V2=1
export SGLANG_USE_FUSED_TOPK_SOFTMAX=1
......@@ -202,7 +202,7 @@ sglang serve --model-path Qwen/Qwen3.5-35B-A3B \
--trust-remote-code \
--chunked-prefill-size -1 --context-length 8192
```
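-2. Launch via `sglang serve`, using `Qwen/Qwen3.6-27B` as an example (this command applies to K100AI chips)
+2. Launch via `sglang serve`, using `Qwen3.5-35B-A3B` as an example (this command applies to K100AI chips)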
```bash
export SGLANG_ENABLE_SPEC_V2=1
export SGLANG_USE_FUSED_TOPK_SOFTMAX=1
......@@ -230,7 +230,7 @@ sglang serve --model-path Qwen/Qwen3.5-35B-A3B \
curl http://localhost:8001/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
-"model": "Qwen/Qwen3.6-27B-FP8",
+"model": "Qwen/Qwen3.5-35B-A3B",
"messages": [
{"role": "user", "content": "Type \"I love Qwen3.5\" backwards"}
],
......