Commit bfb39f34 authored by raojy

Update README.md

parent 09c6cf7c
@@ -182,7 +182,7 @@ curl http://localhost:8001/v1/chat/completions \
 ### SGLang
 #### Single-node inference
 ##### BF16
-1. Start with serve, using `Qwen/Qwen3.6-27B` as an example (this command applies to chips other than K100AI)
+1. Start with serve, using `Qwen3.5-35B-A3B` as an example (this command applies to chips other than K100AI)
 ```bash
 export SGLANG_ENABLE_SPEC_V2=1
 export SGLANG_USE_FUSED_TOPK_SOFTMAX=1
@@ -202,7 +202,7 @@ sglang serve --model-path Qwen/Qwen3.5-35B-A3B \
 --trust-remote-code \
 --chunked-prefill-size -1 --context-length 8192
 ```
-2. Start with serve, using `Qwen/Qwen3.6-27B` as an example (this command applies to K100AI chips)
+2. Start with serve, using `Qwen3.5-35B-A3B` as an example (this command applies to K100AI chips)
 ```bash
 export SGLANG_ENABLE_SPEC_V2=1
 export SGLANG_USE_FUSED_TOPK_SOFTMAX=1
@@ -230,7 +230,7 @@ sglang serve --model-path Qwen/Qwen3.5-35B-A3B \
 curl http://localhost:8001/v1/chat/completions \
 -H "Content-Type: application/json" \
 -d '{
-"model": "Qwen/Qwen3.6-27B-FP8",
+"model": "Qwen/Qwen3.5-35B-A3B",
 "messages": [
 {"role": "user", "content": "Type \"I love Qwen3.5\" backwards"}
 ],
......