ModelZoo / Qwen3-Omni_vllm
Commit 5817b427, authored Apr 02, 2026 by raojy
Update README.md
Parent: 256bf4a6
Showing 1 changed file with 2 additions and 4 deletions
README.md (+2 -4)
...
...
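@@ -9,7 +9,7 @@ Qwen3-Omni is a native end-to-end omni-modal foundation model whose capabilities span text,
In terms of architecture, Qwen3-Omni adopts a MoE (Mixture-of-Experts) based "Thinker–Talker" design which, combined with AuT pretraining and multi-codebook techniques, significantly reduces inference latency. This design lets it support very-low-latency streaming audio and video interaction in real time, with natural conversational turn-taking and immediate feedback. In addition, the model provides a flexible system-prompt control mechanism, and a high-accuracy 30B-class audio captioner (Captioner) is open-sourced alongside it, giving the open-source community strong support for omni-modal real-time perception and complex task handling.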
<div align=center>
-    <img src="./doc/qwen3.5_397b_a17b_infra.jpg"/>
+    <img src="./doc/arc2.png"/>
</div>
## Environment Dependencies
...
...
@@ -62,8 +62,6 @@ pip install numpy==1.25.0
### vllm
#### Single-node inference
**Note**: when starting the service on `K100 AI`, add the `--disable-custom-all-reduce` argument; when starting the service with a W8A8 model, add `-cc.mode=3` and `-cc.inductor_compile_config='{"combo_kernels": false, "benchmark_combo_kernel": false}'`
```bash
## Start with serve
...
...
@@ -104,7 +102,7 @@ curl http://localhost:8000/v1/chat/completions \
## Results
<div align=center>
-    <img src="./doc/result-dcu.jpg"/>
+    <img src="./doc/1.png"/>
</div>
### Accuracy
...
...