Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
ModelZoo
Qwen1.5_vllm
Commits
343d74b0
Commit
343d74b0
authored
Apr 16, 2025
by
chenzk
Browse files
Update url.md
parent
810a79cb
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
11 additions
and
13 deletions
+11
-13
README.md
README.md
+11
-13
No files found.
README.md
View file @
343d74b0
...
@@ -93,16 +93,16 @@ export VLLM_RANK7_NUMA=7
...
@@ -93,16 +93,16 @@ export VLLM_RANK7_NUMA=7
| 基座模型 | chat模型 | GPTQ模型 | AWQ模型 |
| 基座模型 | chat模型 | GPTQ模型 | AWQ模型 |
| --------------------------------------------------------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| --------------------------------------------------------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
|
[
Qwen-7B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen-7B
.git
)
|
[
Qwen-7B-Chat
](
http://
113.200.138.88:18080/aimodels
/Qwen-7B-Chat
)
|
[
Qwen-7B-Chat-GPTQ-Int4
](
https://huggingface.co/Qwen/Qwen-7B-Chat-Int4
)
| |
|
[
Qwen-7B
](
http
s
://
huggingface.co/Q
wen/Qwen
1.5
-7B
)
|
[
Qwen-7B-Chat
](
http
s
://
huggingface.co/Qwen
/Qwen-7B-Chat
)
|
[
Qwen-7B-Chat-GPTQ-Int4
](
https://huggingface.co/Qwen/Qwen-7B-Chat-Int4
)
| |
|
[
Qwen-14B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen-14B
)
|
[
Qwen-14B-Chat
](
http://
113.200.138.88:18080/aimodels
/Qwen-14B-Chat
)
|
[
Qwen-14B-Chat-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen-14B-Chat-Int4
.git
)
| |
|
[
Qwen-14B
](
http
s
://
huggingface.co/Q
wen/Qwen
1.5
-14B
)
|
[
Qwen-14B-Chat
](
http
s
://
huggingface.co/Qwen
/Qwen-14B-Chat
)
|
[
Qwen-14B-Chat-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen
1.5
-14B-Chat-
GPTQ-
Int4
)
| |
|
[
Qwen-72B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen-72B
)
|
[
Qwen-72B-Chat
](
http://
113.200.138.88:18080/aimodels
/Qwen-72B-Chat
)
|
[
Qwen-72B-Chat-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen-72B-Chat-Int4
.git
)
| |
|
[
Qwen-72B
](
http
s
://
huggingface.co/Q
wen/Qwen
1.5
-72B
)
|
[
Qwen-72B-Chat
](
http
s
://
huggingface.co/Qwen
/Qwen
1.5
-72B-Chat
)
|
[
Qwen-72B-Chat-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen
1.5
-72B-Chat-
GPTQ-
Int4
)
| |
|
[
Qwen1.5-7B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-7B
.git
)
|
[
Qwen1.5-7B-Chat
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-7B-Chat
.git
)
|
[
Qwen1.5-7B-Chat-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-7B-Chat-GPTQ-Int4
.git
)
|
[
Qwen1.5-7B-Chat-AWQ
-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-7B-Chat-AWQ
)
|
|
[
Qwen1.5-7B
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-7B
)
|
[
Qwen1.5-7B-Chat
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-7B-Chat
)
|
[
Qwen1.5-7B-Chat-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-7B-Chat-GPTQ-Int4
)
|
[
Qwen1.5-7B-Chat-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-7B-Chat-AWQ
)
|
|
[
Qwen1.5-14B
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-14B
.git
)
|
[
Qwen1.5-14B-Chat
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-14B-Chat
)
|
[
Qwen1.5-14B-Chat-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-14B-Chat-GPTQ-Int4
.git
)
|
[
Qwen1.5-14B-Chat-AWQ
-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-14B-Chat-AWQ
)
|
|
[
Qwen1.5-14B
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-14B
)
|
[
Qwen1.5-14B-Chat
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-14B-Chat
)
|
[
Qwen1.5-14B-Chat-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-14B-Chat-GPTQ-Int4
)
|
[
Qwen1.5-14B-Chat-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-14B-Chat-AWQ
)
|
|
[
Qwen1.5-32B
](
http://
113.200.138.88:18080/aimodels
/Qwen1.5-32B
)
|
[
Qwen1.5-32B-Chat
](
http://
113.200.138.88:18080/aimodels
/Qwen1.5-32B-Chat
)
|
[
Qwen1.5-32B-Chat-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels
/Qwen1.5-32B-Chat-GPTQ-Int4
)
|
[
Qwen1.5-32B-Chat-AWQ
-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-32B-Chat-AWQ
.git
)
|
|
[
Qwen1.5-32B
](
http
s
://
huggingface.co/Qwen
/Qwen1.5-32B
)
|
[
Qwen1.5-32B-Chat
](
http
s
://
huggingface.co/Qwen
/Qwen1.5-32B-Chat
)
|
[
Qwen1.5-32B-Chat-GPTQ-Int4
](
http
s
://
huggingface.co/Qwen
/Qwen1.5-32B-Chat-GPTQ-Int4
)
|
[
Qwen1.5-32B-Chat-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-32B-Chat-AWQ
)
|
|
[
Qwen1.5-72B
](
http://
113.200.138.88:18080/aimodels
/Qwen1.5-72B
)
|
[
Qwen1.5-72B-Chat
](
http://
113.200.138.88:18080/aimodels
/Qwen1.5-72B-Chat
)
|
[
Qwen1.5-72B-Chat-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-72B-Chat-GPTQ-Int4
.git
)
|
[
Qwen1.5-72B-Chat-AWQ
-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-72B-Chat-AWQ
)
|
|
[
Qwen1.5-72B
](
http
s
://
huggingface.co/Qwen
/Qwen1.5-72B
)
|
[
Qwen1.5-72B-Chat
](
http
s
://
huggingface.co/Qwen
/Qwen1.5-72B-Chat
)
|
[
Qwen1.5-72B-Chat-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-72B-Chat-GPTQ-Int4
)
|
[
Qwen1.5-72B-Chat-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-72B-Chat-AWQ
)
|
|
[
Qwen1.5-110B
](
http://
113.200.138.88:18080/aimodels
/Qwen1.5-110B
)
|
[
Qwen1.5-110B-Chat
](
http://
113.200.138.88:18080/aimodels
/Qwen1.5-110B-Chat
)
|
[
Qwen1.5-110B-Chat-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-110B-Chat-GPTQ-Int4
.git
)
|
[
Qwen1.5-110B-Chat-AWQ
-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen1.5-110B-Chat-AWQ
)
|
|
[
Qwen1.5-110B
](
http
s
://
huggingface.co/Qwen
/Qwen1.5-110B
)
|
[
Qwen1.5-110B-Chat
](
http
s
://
huggingface.co/Qwen
/Qwen1.5-110B-Chat
)
|
[
Qwen1.5-110B-Chat-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-110B-Chat-GPTQ-Int4
)
|
[
Qwen1.5-110B-Chat-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen1.5-110B-Chat-AWQ
)
|
|
[
Qwen2-7B
](
http://
113.200.138.88:18080/aimodels
/Qwen2-7B
)
|
[
Qwen2-7B-Instruct
](
http://
113.200.138.88:18080/aimodels
/Qwen2-7B-Instruct
)
|
[
Qwen2-7B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2-7B-Instruct-GPTQ-Int4
.git
)
|
[
Qwen2-7B-Instruct-AWQ
-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2-7B-Instruct-AWQ
)
|
|
[
Qwen2-7B
](
http
s
://
huggingface.co/unsloth
/Qwen2-7B
)
|
[
Qwen2-7B-Instruct
](
http
s
://
huggingface.co/Qwen
/Qwen2-7B-Instruct
)
|
[
Qwen2-7B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen2-7B-Instruct-GPTQ-Int4
)
|
[
Qwen2-7B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen2-7B-Instruct-AWQ
)
|
|
[
Qwen2-72B
](
http://
113.200.138.88:18080/aimodels
/Qwen2-72B
)
|
[
Qwen2-72B-Instruct
](
http://
113.200.138.88:18080/aimodels
/Qwen2-72B-Instruct
)
|
[
Qwen2-72B-Instruct-GPTQ-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2-72B-Instruct-GPTQ-Int4
.git
)
|
[
Qwen2-72B-Instruct-AWQ
-Int4
](
http://
113.200.138.88:18080/aimodels/q
wen/Qwen2-72B-Instruct-AWQ
)
|
|
[
Qwen2-72B
](
http
s
://
huggingface.co/Qwen
/Qwen2-72B
)
|
[
Qwen2-72B-Instruct
](
http
s
://
huggingface.co/Qwen
/Qwen2-72B-Instruct
)
|
[
Qwen2-72B-Instruct-GPTQ-Int4
](
http
s
://
huggingface.co/Q
wen/Qwen2-72B-Instruct-GPTQ-Int4
)
|
[
Qwen2-72B-Instruct-AWQ
](
http
s
://
huggingface.co/Q
wen/Qwen2-72B-Instruct-AWQ
)
|
### 离线批量推理
### 离线批量推理
...
@@ -126,9 +126,7 @@ python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --outpu
...
@@ -126,9 +126,7 @@ python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --outpu
2、使用数据集
2、使用数据集
下载数据集:
下载数据集:
```
bash
[
sharegpt_v3_unfiltered_cleaned_split
](
https://huggingface.co/datasets/learnanything/sharegpt_v3_unfiltered_cleaned_split
)
wget http://113.200.138.88:18080/aidatasets/vllm_data/-/raw/main/ShareGPT_V3_unfiltered_cleaned_split.json
```
```
bash
```
bash
python benchmarks/benchmark_throughput.py
--num-prompts
1
--model
Qwen/Qwen1.5-7B-Chat
--dataset
ShareGPT_V3_unfiltered_cleaned_split.json
-tp
1
--trust-remote-code
--enforce-eager
--dtype
float16
python benchmarks/benchmark_throughput.py
--num-prompts
1
--model
Qwen/Qwen1.5-7B-Chat
--dataset
ShareGPT_V3_unfiltered_cleaned_split.json
-tp
1
--trust-remote-code
--enforce-eager
--dtype
float16
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment