ModelZoo / Qwen3-Reranker_pytorch
Commit 9d6c8867, authored Jun 24, 2025 by chenych
Add vllm serve
Parent: 8011b5aa
Showing 1 changed file with 32 additions and 0 deletions.
README.md (+32 -0)
@@ -61,6 +61,7 @@ pip install transformers>=4.51.0
## Inference
### vLLM inference
#### offline
```bash
## The HF_ENDPOINT environment variable must be set
export HF_ENDPOINT=https://hf-mirror.com
python infer_vllm.py --model_name_or_path /path/your_model_path/
```
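If the offline script is launched from Python rather than from a shell, the same variable can be passed through the child process environment. The following is a minimal sketch, assuming the script name and placeholder model path from the command above; `mirror_env` and `offline_command` are hypothetical helper names, not part of this repository.

```python
import os
import sys

# Mirror endpoint taken from the shell example above.
HF_MIRROR = "https://hf-mirror.com"


def mirror_env(base=None):
    """Return a copy of the environment with HF_ENDPOINT set to the mirror."""
    env = dict(os.environ if base is None else base)
    env["HF_ENDPOINT"] = HF_MIRROR
    return env


def offline_command(model_path):
    """Build the argument list for the offline inference script."""
    return [sys.executable, "infer_vllm.py", "--model_name_or_path", model_path]
```

A call such as `subprocess.run(offline_command("/path/your_model_path/"), env=mirror_env(), check=True)` would then run the script with the variable set; the path is the README's placeholder and has to be replaced with a real model directory.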
#### server
For the Qwen3 reranker models, the official support adds the `hf-overrides` setting. For the detailed reasons, see [[New Model]: Support Qwen3 Embedding & Reranker by noooop · Pull Request #19260 · vllm-project/vllm](https://github.com/vllm-project/vllm/pull/19260).
1. Start the server with the following command:
```bash
vllm serve Qwen/Qwen3-Reranker-0.6B \
    --host 0.0.0.0 --port 8080 --block-size 16 \
    --api-key 123456 --dtype auto \
    --trust-remote-code \
    --served-model-name Qwen3-reranker \
    --enable-prefix-caching \
    --gpu-memory-utilization 0.9 \
    --task score --disable-log-requests \
    --hf_overrides '{"architectures":["Qwen3ForSequenceClassification"],"classifier_from_token": ["no", "yes"],"is_original_qwen3_reranker": true}'
```
2. Send a request with the following command:
```
curl -X 'POST' 'http://127.0.0.1:8080/score' \
-H 'accept: application/json' \
-H 'Authorization: Bearer 123456' \
-H 'Content-Type: application/json' \
-d '{
"model": "Qwen3-reranker",
"encoding_format": "float",
"text_1": "What is the capital of France?",
"text_2": "The capital of France is Paris."
}'
```
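The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above: the URL, API key, model name, and payload fields are copied from the two commands, while `build_score_request` and `score` are hypothetical helper names introduced for illustration.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # host and port from the serve command
API_KEY = "123456"                  # must match the --api-key value


def build_score_request(text_1, text_2, model="Qwen3-reranker"):
    """Build the POST request for the /score endpoint without sending it."""
    payload = {
        "model": model,  # must match --served-model-name
        "encoding_format": "float",
        "text_1": text_1,
        "text_2": text_2,
    }
    return urllib.request.Request(
        f"{BASE_URL}/score",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def score(text_1, text_2, timeout=30):
    """Send the request and return the parsed JSON body (needs a running server)."""
    request = build_score_request(text_1, text_2)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `score("What is the capital of France?", "The capital of France is Paris.")` returns the decoded response. The exact response layout is not shown in this README, so inspect the returned dictionary before relying on a particular field.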
## result
<div align=center>
    <img src="./doc/results-dcu.png"/>
...
...