ModelZoo / Qwen3-VL-Reranker_vllm
Commit 2d334ddf, authored Apr 29, 2026 by weishb
Update README version
Parent: 679dff1b
Pipeline #3549 failed with stages in 0 seconds
Showing 3 changed files with 14 additions and 12 deletions:
LICENSE +0 -0
README.md +13 -12
requirements.txt +1 -0
LICENSE.txt → LICENSE
File moved
README.md
...
...
@@ -44,11 +44,18 @@ docker run -it \
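 The special deep-learning libraries required by the DCU accelerator cards used in this project can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.
-Other environment configuration inside the image
+Install the remaining packages listed in requirements.txt:
 ```
-pip install pycountry
+pip install -r requirements.txt
 ```
+## Pretrained weights
+**Choose the model to download according to `Supported DCU models`. FP8 models are supported only on BW1100/BW1101; do not use them on other models!**
+| Model | Weight size | Data type | Supported DCU models | Minimum number of cards | Download link |
+|:-----:|:----------:|:----------:|:----------:|:---------------------:|:----------:|
+| Qwen3-VL-Reranker-8B | 8B | BF16 | K100AI | 1 | [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-VL-Reranker-8B) |
+| Qwen3-VL-Reranker-2B | 2B | BF16 | K100AI | 1 | [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-VL-Reranker-2B) |
 ## Dataset
 None yet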
...
...
@@ -63,7 +70,7 @@ pip install pycountry
 ```bash
 export VLLM_USE_FUSED_RMS_ROPE=0
-## start the server
+# start the server
 vllm serve Qwen/Qwen3-VL-Reranker-8B \
     --runner pooling \
     --hf-overrides '{"architectures": ["Qwen3VLForSequenceClassification"],"classifier_from_token":["no","yes"],"is_original_qwen3_reranker":true}' \
...
...
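Review note: the `--hf-overrides` value in the hunk above is a JSON object wrapped in single quotes, which is easy to break when edited by hand. A minimal sketch, assuming one prefers to generate the argument from Python rather than type it: the keys and values are copied from the command in the diff, while the helper name `build_hf_overrides_arg` is hypothetical and not part of vLLM.

```python
import json
import shlex


def build_hf_overrides_arg() -> str:
    """Return the JSON string passed to --hf-overrides in the README command."""
    overrides = {
        "architectures": ["Qwen3VLForSequenceClassification"],
        "classifier_from_token": ["no", "yes"],
        "is_original_qwen3_reranker": True,  # serialized as JSON `true`
    }
    # Compact separators produce a single token without embedded spaces.
    return json.dumps(overrides, separators=(",", ":"))


if __name__ == "__main__":
    # shlex.quote wraps the JSON in single quotes so a POSIX shell
    # passes it to vllm as one argument.
    print("--hf-overrides", shlex.quote(build_hf_overrides_arg()))
```

Generating the string this way guarantees valid JSON and correct shell quoting; the printed fragment can be pasted into the `vllm serve` command in place of the hand-written value.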
@@ -71,14 +78,14 @@ vllm serve Qwen/Qwen3-VL-Reranker-8B \
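     --max-model-len 4096 \
     --served-model-name qwen3-vl-reranker
-## client request
+# client request
 curl -s http://127.0.0.1:8000/rerank \
   -H "Content-Type: application/json" \
   -d '{
     "model": "qwen3-vl-reranker",
     "query": "How do I deploy a vLLM reranker service?",
     "documents": [
-      "Install the dependencies first, then start the service with vllm serve and call the /score endpoint.",
+      "Install the dependencies first, then start the service with vllm serve and call the /rerank endpoint.",
       "The weather is nice today, a good day for a walk."
     ]
   }'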
...
...
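Review note: the curl request in the hunk above can also be issued from Python, which makes it easier to sort the returned scores. The sketch below is a hedged illustration using only the standard library. The URL, model name, and payload fields come from the curl example; the response shape (a `results` list whose entries carry `index` and `relevance_score`) is an assumption about the `/rerank` endpoint and should be checked against the vLLM version in use. The function names are hypothetical.

```python
import json
import urllib.request

RERANK_URL = "http://127.0.0.1:8000/rerank"  # endpoint from the curl example


def build_payload(query, documents, model="qwen3-vl-reranker"):
    """Build the JSON body sent to /rerank (same fields as the curl example)."""
    return {"model": model, "query": query, "documents": documents}


def rank_documents(response, documents):
    """Return (score, document) pairs sorted by descending relevance.

    Assumption: each entry in response["results"] has "index" and
    "relevance_score" fields.
    """
    pairs = [
        (item["relevance_score"], documents[item["index"]])
        for item in response["results"]
    ]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


def rerank(query, documents, url=RERANK_URL):
    """POST the request to a running server and return the ranked pairs."""
    body = json.dumps(build_payload(query, documents)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return rank_documents(json.load(resp), documents)
```

With the server from the README running, `rerank(query, documents)` returns the documents ordered from most to least relevant; `rank_documents` can be exercised on a saved response without any server.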
@@ -92,14 +99,8 @@ curl -s http://127.0.0.1:8000/rerank \
 ### Accuracy
 `Accuracy on DCU matches GPU; inference framework: vllm`
-## Pretrained weights
-| Model | Weight size | DCU model | Minimum number of cards | Download link |
-|:-----:|:----------:|:----------:|:---------------------:|:----------:|
-| Qwen3-VL-Reranker-8B | 8B | K100AI | 1 | [Modelscope](https://www.modelscope.cn/models/Qwen/Qwen3-VL-Reranker-8B) |
-| Qwen3-VL-Reranker-2B | 2B | K100AI | 1 | [Modelscope](https://www.modelscope.cn/models/Qwen/Qwen3-VL-Reranker-2B) |
 ## Source repository and issue feedback
 - https://developer.sourcefind.cn/codes/modelzoo/qwen3-vl-reranker_vllm
 ## References
-- https://github.com/QwenLM/Qwen3-VL-Embedding
\ No newline at end of file
+- https://github.com/QwenLM/Qwen3-VL-Embedding
requirements.txt
0 → 100644
+pycountry