Commit 1802b9a6 authored by chenych

update readme

parent 63aaaabc
@@ -37,7 +37,7 @@ cd /your_code_path/qwen3-reranker_pytorch
 ### Anaconda (Method 3)
 The special deep learning libraries that this project requires for DCU cards can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.
 ```bash
-DTK: 25.04
+DTK: 25.04.1
 python: 3.10
 vllm: 0.9.2+das.opt1.beta.dtk25041
 torch: 2.5.1+das.opt1.dtk25041
@@ -69,11 +69,11 @@ python infer_vllm.py --model_name_or_path /path/your_model_path/
 ```
 #### serve
+```bash
 export HF_ENDPOINT=https://hf-mirror.com
 export VLLM_USE_NN=0
 export ALLREDUCE_STREAM_WITH_COMPUTE=1
-```bash
 vllm serve Qwen/Qwen3-Reranker-0.6B --max-model-len 4096 --trust-remote-code --enforce-eager --enable-prefix-caching --served-model-name Qwen3-reranker --task score --disable-log-requests --hf_overrides '{"architectures":["Qwen3ForSequenceClassification"],"classifier_from_token": ["no", "yes"],"is_original_qwen3_reranker": true}'
 ```
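
Once the server from the `serve` section is running, it can be queried over HTTP. The following is a minimal sketch, not part of the README: it assumes the server listens on vLLM's default address `http://localhost:8000` and exposes vLLM's `/score` endpoint, which takes `text_1` and `text_2` fields. The helper names `build_score_request` and `score` are hypothetical, and the sketch sends raw text without the instruction template that Qwen3-Reranker may expect around the query and documents.

```python
import json
import urllib.request


def build_score_request(query, documents, model="Qwen3-reranker"):
    """Build the JSON body for scoring one query against several documents.

    `model` must match the value passed to --served-model-name.
    """
    return {"model": model, "text_1": query, "text_2": list(documents)}


def score(query, documents, base_url="http://localhost:8000"):
    """POST the request to the (assumed) /score endpoint and return parsed JSON."""
    body = json.dumps(build_score_request(query, documents)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/score",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up, calling `score("What is the capital of China?", ["The capital of China is Beijing.", "Gravity is a force."])` should return one relevance score per document; the exact response layout depends on the installed vLLM version.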