Commit 97488638 authored by chenzk

Update url.md

parent 62de6a5c
@@ -93,16 +93,16 @@ export VLLM_RANK7_NUMA=7
| Base model | Chat model | GPTQ model | AWQ model |
| -------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
| [Qwen2.5-3B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-3B) | [Qwen2.5-3B-Instruct](http://113.200.138.88:18080/aimodels/qwen2.5-3b-instruct) | [Qwen2.5-3B-Instruct-GPTQ-Int4](http://113.200.138.88:18080/aimodels/qwen/qwen2.5-3b-instruct-gptq-int4) | [Qwen2.5-3B-Instruct-AWQ](http://113.200.138.88:18080/aimodels/qwen/qwen2.5-3b-instruct-awq) |
| [Qwen2.5-7B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-7B) | [Qwen2.5-7B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-7B-Instruct) | [Qwen2.5-7B-Instruct-GPTQ-Int4](http://113.200.138.88:18080/aimodels/qwen/qwen2.5-7b-instruct-gptq-int4) | [Qwen2.5-7B-Instruct-AWQ](http://113.200.138.88:18080/aimodels/qwen/qwen2.5-7b-instruct-awq) |
| [Qwen2.5-14B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-14B) | [Qwen2.5-14B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-14B-Instruct) | [Qwen2.5-14B-Instruct-GPTQ-Int4](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-14B-Instruct-GPTQ-Int4) | [Qwen2.5-14B-Instruct-AWQ](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-14B-Instruct-AWQ) |
| [Qwen2.5-32B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-32B) | [Qwen2.5-32B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-32B-Instruct) | [Qwen2.5-32B-Instruct-GPTQ-Int4](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-32B-Instruct-GPTQ-Int4) | [Qwen2.5-32B-Instruct-AWQ](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-32B-Instruct-AWQ) |
| [Qwen2.5-72B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-72B) | [Qwen2.5-72B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-72B-Instruct) | [Qwen2.5-72B-Instruct-GPTQ-Int4](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-72B-Instruct-GPTQ-Int4) | [Qwen2.5-72B-Instruct-AWQ](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-72B-Instruct-AWQ) |
| [Qwen2.5-Coder-1.5B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-1.5B) | [Qwen2.5-Coder-1.5B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-1.5B-Instruct) | [Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4) | [Qwen2.5-Coder-1.5B-Instruct-AWQ](http://113.200.138.88:18080/aimodels/qwen/qwen2.5-coder-1.5b-instruct-awq) |
| [Qwen2.5-Coder-7B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-7B) | [Qwen2.5-Coder-7B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-7B-Instruct) | [Qwen2.5-Coder-7B-Instruct-GPTQ-Int4](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4) | [Qwen2.5-Coder-7B-Instruct-AWQ](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-7B-Instruct-AWQ) |
| [Qwen2.5-Coder-32B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-32B) | [Qwen2.5-Coder-32B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Coder-32B-Instruct) | [Qwen2.5-Coder-32B-Instruct-GPTQ-Int4](https://modelscope.cn/models/Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4) | [Qwen2.5-Coder-32B-Instruct-AWQ](https://modelscope.cn/models/Qwen/Qwen2.5-Coder-32B-Instruct-AWQ) |
| [Qwen2.5-Math-1.5B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Math-1.5B) | [Qwen2.5-Math-1.5B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Math-1.5B-Instruct) | | |
| [Qwen2.5-Math-7B](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Math-7B) | [Qwen2.5-Math-7B-Instruct](http://113.200.138.88:18080/aimodels/qwen/Qwen2.5-Math-7B-Instruct) | | |
| [Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) | [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) | [Qwen2.5-3B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct-GPTQ-Int4) | [Qwen2.5-3B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct-AWQ) |
| [Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [Qwen2.5-7B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4) | [Qwen2.5-7B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-AWQ) |
| [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) | [Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) | [Qwen2.5-14B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4) | [Qwen2.5-14B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-AWQ) |
| [Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) | [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) | [Qwen2.5-32B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4) | [Qwen2.5-32B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct-AWQ) |
| [Qwen2.5-72B](https://huggingface.co/Qwen/Qwen2.5-72B) | [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct) | [Qwen2.5-72B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct-GPTQ-Int4) | [Qwen2.5-72B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct-AWQ) |
| [Qwen2.5-Coder-1.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B) | [Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) | [Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4) | [Qwen2.5-Coder-1.5B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ) |
| [Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B) | [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) | [Qwen2.5-Coder-7B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4) | [Qwen2.5-Coder-7B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-AWQ) |
| [Qwen2.5-Coder-32B](https://huggingface.co/Qwen/Qwen2.5-Coder-32B) | [Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) | [Qwen2.5-Coder-32B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4) | [Qwen2.5-Coder-32B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct-AWQ) |
| [Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B) | [Qwen2.5-Math-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct) | | |
| [Qwen2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B) | [Qwen2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct) | | |
### Offline Batch Inference
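For reference, here is a minimal sketch of offline batch inference with vLLM's Python API. The prompts, sampling settings, and the `Qwen/Qwen2.5-7B-Instruct` model id are illustrative assumptions, not taken from this repository; the engine flags mirror those used in the benchmark commands below.

```python
# Minimal offline batch inference sketch with vLLM (illustrative; the model id,
# prompts, and sampling settings are assumptions, not part of this document).
from vllm import LLM, SamplingParams

prompts = [
    "Give me a short introduction to large language models.",
    "Explain the difference between GPTQ and AWQ quantization in one paragraph.",
]

# Modest sampling settings and output budget for a quick smoke test.
sampling_params = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=256)

# Engine flags chosen to match the benchmark commands in this section.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",  # or a local path downloaded from the mirror above
    tensor_parallel_size=1,
    dtype="float16",
    trust_remote_code=True,
    enforce_eager=True,
)

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print("Prompt:", output.prompt)
    print("Completion:", output.outputs[0].text)
```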
@@ -125,11 +125,7 @@ python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --outpu
2. Using a dataset
Download the dataset:
```bash
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
wget http://113.200.138.88:18080/aidatasets/vllm_data/-/raw/main/ShareGPT_V3_unfiltered_cleaned_split.json  # SCNet fast-download mirror
```
[sharegpt_v3_unfiltered_cleaned_split](https://huggingface.co/datasets/learnanything/sharegpt_v3_unfiltered_cleaned_split)
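Optionally, a quick sanity check that the download is intact before benchmarking. This sketch assumes the common ShareGPT layout (a JSON list of records, each carrying a `conversations` list of turns); the filename matches the wget command above.

```python
# Sanity-check the downloaded ShareGPT file (assumes the usual ShareGPT layout:
# a JSON list of records, each with a "conversations" list of turns).
import json

with open("ShareGPT_V3_unfiltered_cleaned_split.json", encoding="utf-8") as f:
    data = json.load(f)

print(f"records: {len(data)}")
sample = data[0]
print(f"first record keys: {sorted(sample.keys())}")
print(f"turns in first conversation: {len(sample.get('conversations', []))}")
```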
```bash
python benchmarks/benchmark_throughput.py --num-prompts 1 --model Qwen/Qwen2.5-7B-Instruct --dataset ShareGPT_V3_unfiltered_cleaned_split.json -tp 1 --trust-remote-code --enforce-eager --dtype float16
```