Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
ModelZoo
Qwen2.5_vllm
Commits
97488638
Commit
97488638
authored
Apr 16, 2025
by
chenzk
Browse files
Update url.md
parent
62de6a5c
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
11 additions
and
15 deletions
+11
-15
README.md
README.md
+11
-15
No files found.
README.md
View file @
97488638
...
@@ -93,16 +93,16 @@ export VLLM_RANK7_NUMA=7
...
@@ -93,16 +93,16 @@ export VLLM_RANK7_NUMA=7
| 基座模型 | chat模型 | GPTQ模型 | AWQ模型 |
| 基座模型 | chat模型 | GPTQ模型 | AWQ模型 |
| -------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
| -------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
|
[
Qwen2.5 3B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-3B
)
|
[
Qwen2.5 3B Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen2.5-3
b-i
nstruct
)
|
[
Qwen2.5-3B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/
q
wen2.5-3
b-i
nstruct-
gptq-i
nt4
)
|
[
Qwen2.5-3B-Instruct-AWQ
](
http://
113.200.138.88:18080/aimodels/q
wen/
q
wen2.5-3
b-i
nstruct-
awq
)
|
|
[
Qwen2.5 3B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-3B
)
|
[
Qwen2.5 3B Instruct
](
http
s
://
huggingface.co/Qwen/Q
wen2.5-3
B-I
nstruct
)
|
[
Qwen2.5-3B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/
Q
wen2.5-3
B-I
nstruct-
GPTQ-I
nt4
)
|
[
Qwen2.5-3B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/
Q
wen2.5-3
B-I
nstruct-
AWQ
)
|
|
[
Qwen2.5-7B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-7B
)
|
[
Qwen2.5 7B Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-7B-Instruct
)
|
[
Qwen2.5-7B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/
q
wen2.5-7
b-i
nstruct-
gptq-i
nt4
)
|
[
Qwen2.5-7B-Instruct-AWQ
](
http://
113.200.138.88:18080/aimodels/q
wen/
q
wen2.5-7
b-i
nstruct-
awq
)
|
|
[
Qwen2.5-7B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-7B
)
|
[
Qwen2.5 7B Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-7B-Instruct
)
|
[
Qwen2.5-7B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/
Q
wen2.5-7
B-I
nstruct-
GPTQ-I
nt4
)
|
[
Qwen2.5-7B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/
Q
wen2.5-7
B-I
nstruct-
AWQ
)
|
|
[
Qwen2.5-14B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-14B
)
|
[
Qwen2.5-14B-Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-14B-Instruct
)
|
[
Qwen2.5-14B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-14B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-14B-Instruct-AWQ
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-14B-Instruct-AWQ
)
|
|
[
Qwen2.5-14B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-14B
)
|
[
Qwen2.5-14B-Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-14B-Instruct
)
|
[
Qwen2.5-14B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-14B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-14B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-14B-Instruct-AWQ
)
|
|
[
Qwen2.5-32B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-32B
)
|
[
Qwen2.5-32B-Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-32B-Instruct
)
|
[
Qwen2.5-32B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-32B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-32B-Instruct-AWQ
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-32B-Instruct-AWQ
)
|
|
[
Qwen2.5-32B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-32B
)
|
[
Qwen2.5-32B-Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-32B-Instruct
)
|
[
Qwen2.5-32B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-32B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-32B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-32B-Instruct-AWQ
)
|
|
[
Qwen2.5-72B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-72B
)
|
[
Qwen2.5-72B-Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-72B-Instruct
)
|
[
Qwen2.5-72B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-72B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-72B-Instruct-AWQ
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-72B-Instruct-AWQ
)
|
|
[
Qwen2.5-72B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-72B
)
|
[
Qwen2.5-72B-Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-72B-Instruct
)
|
[
Qwen2.5-72B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-72B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-72B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-72B-Instruct-AWQ
)
|
|
[
Qwen2.5 Coder 1.5B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-1.5B
)
|
[
Qwen2.5-Coder-1.5B-Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-1.5B-Instruct
)
|
[
Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-Coder-1.5B-Instruct-AWQ
](
http://
113.200.138.88:18080/aimodels/q
wen/
q
wen2.5-
c
oder-1.5
b-i
nstruct-
awq
)
|
|
[
Qwen2.5 Coder 1.5B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-1.5B
)
|
[
Qwen2.5-Coder-1.5B-Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-1.5B-Instruct
)
|
[
Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5-Coder-1.5B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/
Q
wen2.5-
C
oder-1.5
B-I
nstruct-
AWQ
)
|
|
[
Qwen2.5 Coder 7B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-7B
)
|
[
Qwen2.5 Coder 7B Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-7B-Instruct
)
|
[
Qwen2.5 Coder 7B Instruct GPTQ Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5 Coder 7B Instruct AWQ
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-7B-Instruct-AWQ
)
|
|
[
Qwen2.5 Coder 7B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-7B
)
|
[
Qwen2.5 Coder 7B Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-7B-Instruct
)
|
[
Qwen2.5 Coder 7B Instruct GPTQ Int4
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5 Coder 7B Instruct AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-7B-Instruct-AWQ
)
|
|
[
Qwen2.5 Coder 32B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-32B
)
|
[
Qwen2.5 Coder 32B Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Coder-32B-Instruct
)
|
[
Qwen2.5 Coder 32B Instruct GPTQ Int4
](
https://
modelscope.cn/models
/Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5 Coder 32B Instruct AWQ
](
https://
modelscope.cn/models
/Qwen/Qwen2.5-Coder-32B-Instruct-AWQ
)
|
|
[
Qwen2.5 Coder 32B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-32B
)
|
[
Qwen2.5 Coder 32B Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Coder-32B-Instruct
)
|
[
Qwen2.5 Coder 32B Instruct GPTQ Int4
](
https://
huggingface.co
/Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4
)
|
[
Qwen2.5 Coder 32B Instruct AWQ
](
https://
huggingface.co
/Qwen/Qwen2.5-Coder-32B-Instruct-AWQ
)
|
|
[
Qwen2.5 Math 1.5B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Math-1.5B
)
|
[
Qwen2.5 Math 1.5B Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Math-1.5B-Instruct
)
| | |
|
[
Qwen2.5 Math 1.5B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Math-1.5B
)
|
[
Qwen2.5 Math 1.5B Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Math-1.5B-Instruct
)
| | |
|
[
Qwen2.5 Math 7B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Math-7B
)
|
[
Qwen2.5-Math-7B-Instruct
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2.5-Math-7B-Instruct
)
| | |
|
[
Qwen2.5 Math 7B
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Math-7B
)
|
[
Qwen2.5-Math-7B-Instruct
](
http
s
://
huggingface.co/Q
wen/Qwen2.5-Math-7B-Instruct
)
| | |
### 离线批量推理
### 离线批量推理
...
@@ -125,11 +125,7 @@ python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --outpu
...
@@ -125,11 +125,7 @@ python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --outpu
2、使用数据集
2、使用数据集
下载数据集:
下载数据集:
[
sharegpt_v3_unfiltered_cleaned_split
](
https://huggingface.co/datasets/learnanything/sharegpt_v3_unfiltered_cleaned_split
)
```
bash
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
wget http://113.200.138.88:18080/aidatasets/vllm_data/-/raw/main/ShareGPT_V3_unfiltered_cleaned_split.json
(
SCnet快速下载链接
)
```
```
bash
```
bash
python benchmarks/benchmark_throughput.py
--num-prompts
1
--model
Qwen/Qwen2.5-7B-instruct
--dataset
ShareGPT_V3_unfiltered_cleaned_split.json
-tp
1
--trust-remote-code
--enforce-eager
--dtype
float16
python benchmarks/benchmark_throughput.py
--num-prompts
1
--model
Qwen/Qwen2.5-7B-instruct
--dataset
ShareGPT_V3_unfiltered_cleaned_split.json
-tp
1
--trust-remote-code
--enforce-eager
--dtype
float16
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment