Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
ModelZoo
Qwen3-Guard_vllm
Commits
ffdbcc0d
"git@developer.sourcefind.cn:change/sglang.git" did not exist on "eb38c7d1cae1c616de3c1c0ce40353c720f7e3c7"
Commit
ffdbcc0d
authored
Dec 05, 2025
by
dengjb
Browse files
update
parent
d2d98a18
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
6 additions
and
7 deletions
+6
-7
README.md
README.md
+6
-7
No files found.
README.md
View file @
ffdbcc0d
...
@@ -30,7 +30,7 @@ Qwen3Guard-Gen,它提供了以下主要优势:
...
@@ -30,7 +30,7 @@ Qwen3Guard-Gen,它提供了以下主要优势:
| flash_attn | 2.6.1+das.opt1.dtk2504 |
| flash_attn | 2.6.1+das.opt1.dtk2504 |
| flash_mla | 1.0.0+das.opt1.dtk25042 |
| flash_mla | 1.0.0+das.opt1.dtk25042 |
当前仅支持
镜像:
推荐使用
镜像:
-
挂载地址
`-v`
根据实际模型情况修改
-
挂载地址
`-v`
根据实际模型情况修改
```
bash
```
bash
...
@@ -50,20 +50,19 @@ docker run -it --shm-size 60g --network=host --name qwen3-guard --privileged --d
...
@@ -50,20 +50,19 @@ docker run -it --shm-size 60g --network=host --name qwen3-guard --privileged --d
### vllm
### vllm
#### 单机推理
#### 单机推理
可参考vllm_serve.sh脚本
```
bash
```
bash
## serve启动
## serve启动
## 可参考vllm_serve.sh脚本
vllm serve /path/of/Qwen/Qwen3Guard-Gen-8B/
\
vllm serve /path/of/Qwen/Qwen3Guard-Gen-8B/
\
--trust-remote-code
\
--trust-remote-code
\
--max-model-len
32768
\
--max-model-len
32768
\
--served-model-name
qwen3-guard
\
--served-model-name
qwen3-guard
\
--dtype
bfloat16
\
--dtype
bfloat16
\
-tp
2
-tp
1
## client访问
## client访问
可参考vllm_cilent.sh
##
可参考vllm_cilent.sh
curl http://localhost:8000/v1/chat/completions
-H
"Content-Type: application/json"
-d
'{
curl http://localhost:8000/v1/chat/completions
-H
"Content-Type: application/json"
-d
'{
"model": "qwen3-guard",
"model": "qwen3-guard",
"messages": [
"messages": [
...
@@ -93,7 +92,7 @@ DCU与GPU精度一致,推理框架:vllm。
...
@@ -93,7 +92,7 @@ DCU与GPU精度一致,推理框架:vllm。
## 预训练权重
## 预训练权重
| 模型名称 | 权重大小 | DCU型号 | 最低卡数需求 |下载地址|
| 模型名称 | 权重大小 | DCU型号 | 最低卡数需求 |下载地址|
|:-----:|:----------:|:----------:|:---------------------:|:----------:|
|:-----:|:----------:|:----------:|:---------------------:|:----------:|
| Qwen3Guard-Gen-8B | 8B | BW1000 | 1 |
[
下载地址
](
https://modelscope.cn/models/Qwen/Qwen3Guard-Gen-8B
)
|
| Qwen3Guard-Gen-8B | 8B | BW1000 | 1 |
[
modelscope
](
https://modelscope.cn/models/Qwen/Qwen3Guard-Gen-8B
)
|
## 源码仓库及问题反馈
## 源码仓库及问题反馈
-
https://developer.sourcefind.cn/codes/modelzoo/qwen3-guard_vllm
-
https://developer.sourcefind.cn/codes/modelzoo/qwen3-guard_vllm
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment