Commit 343d74b0 authored by chenzk

Update url.md

parent 810a79cb
...@@ -93,16 +93,16 @@ export VLLM_RANK7_NUMA=7
| Base model | Chat model | GPTQ model | AWQ model |
| --- | --- | --- | --- |
| [Qwen-7B](https://huggingface.co/Qwen/Qwen-7B) | [Qwen-7B-Chat](https://huggingface.co/Qwen/Qwen-7B-Chat) | [Qwen-7B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen-7B-Chat-Int4) | |
| [Qwen-14B](https://huggingface.co/Qwen/Qwen-14B) | [Qwen-14B-Chat](https://huggingface.co/Qwen/Qwen-14B-Chat) | [Qwen-14B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen-14B-Chat-Int4) | |
| [Qwen-72B](https://huggingface.co/Qwen/Qwen-72B) | [Qwen-72B-Chat](https://huggingface.co/Qwen/Qwen-72B-Chat) | [Qwen-72B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen-72B-Chat-Int4) | |
| [Qwen1.5-7B](https://huggingface.co/Qwen/Qwen1.5-7B) | [Qwen1.5-7B-Chat](https://huggingface.co/Qwen/Qwen1.5-7B-Chat) | [Qwen1.5-7B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen1.5-7B-Chat-GPTQ-Int4) | [Qwen1.5-7B-Chat-AWQ](https://huggingface.co/Qwen/Qwen1.5-7B-Chat-AWQ) |
| [Qwen1.5-14B](https://huggingface.co/Qwen/Qwen1.5-14B) | [Qwen1.5-14B-Chat](https://huggingface.co/Qwen/Qwen1.5-14B-Chat) | [Qwen1.5-14B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen1.5-14B-Chat-GPTQ-Int4) | [Qwen1.5-14B-Chat-AWQ](https://huggingface.co/Qwen/Qwen1.5-14B-Chat-AWQ) |
| [Qwen1.5-32B](https://huggingface.co/Qwen/Qwen1.5-32B) | [Qwen1.5-32B-Chat](https://huggingface.co/Qwen/Qwen1.5-32B-Chat) | [Qwen1.5-32B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen1.5-32B-Chat-GPTQ-Int4) | [Qwen1.5-32B-Chat-AWQ](https://huggingface.co/Qwen/Qwen1.5-32B-Chat-AWQ) |
| [Qwen1.5-72B](https://huggingface.co/Qwen/Qwen1.5-72B) | [Qwen1.5-72B-Chat](https://huggingface.co/Qwen/Qwen1.5-72B-Chat) | [Qwen1.5-72B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen1.5-72B-Chat-GPTQ-Int4) | [Qwen1.5-72B-Chat-AWQ](https://huggingface.co/Qwen/Qwen1.5-72B-Chat-AWQ) |
| [Qwen1.5-110B](https://huggingface.co/Qwen/Qwen1.5-110B) | [Qwen1.5-110B-Chat](https://huggingface.co/Qwen/Qwen1.5-110B-Chat) | [Qwen1.5-110B-Chat-GPTQ-Int4](https://huggingface.co/Qwen/Qwen1.5-110B-Chat-GPTQ-Int4) | [Qwen1.5-110B-Chat-AWQ](https://huggingface.co/Qwen/Qwen1.5-110B-Chat-AWQ) |
| [Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B) | [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) | [Qwen2-7B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2-7B-Instruct-GPTQ-Int4) | [Qwen2-7B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2-7B-Instruct-AWQ) |
| [Qwen2-72B](https://huggingface.co/Qwen/Qwen2-72B) | [Qwen2-72B-Instruct](https://huggingface.co/Qwen/Qwen2-72B-Instruct) | [Qwen2-72B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2-72B-Instruct-GPTQ-Int4) | [Qwen2-72B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2-72B-Instruct-AWQ) |
### Offline Batch Inference
...@@ -126,9 +126,7 @@ python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --outpu
2. Using a dataset

Download the dataset:
[sharegpt_v3_unfiltered_cleaned_split](https://huggingface.co/datasets/learnanything/sharegpt_v3_unfiltered_cleaned_split)
```bash
python benchmarks/benchmark_throughput.py --num-prompts 1 --model Qwen/Qwen1.5-7B-Chat --dataset ShareGPT_V3_unfiltered_cleaned_split.json -tp 1 --trust-remote-code --enforce-eager --dtype float16
```
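The throughput benchmark samples its prompts from the downloaded JSON file. As a rough sketch of how that ShareGPT_V3 data is laid out (field names are assumed from the public release, not taken from this repo), each record holds a list of `"from"`/`"value"` turns, and the first human turn of a conversation can serve as a prompt:

```python
import json

# Assumed ShareGPT_V3 record layout: a list of conversations,
# each with alternating "human"/"gpt" turns.
sample = [
    {
        "id": "example-0",
        "conversations": [
            {"from": "human", "value": "What is vLLM?"},
            {"from": "gpt", "value": "vLLM is a high-throughput LLM inference engine."},
        ],
    }
]

def extract_prompts(records):
    """Collect the first human turn of each conversation as a benchmark prompt."""
    prompts = []
    for rec in records:
        for turn in rec["conversations"]:
            if turn["from"] == "human":
                prompts.append(turn["value"])
                break
    return prompts

# With the real file: extract_prompts(json.load(open("ShareGPT_V3_unfiltered_cleaned_split.json")))
print(extract_prompts(sample))  # ['What is vLLM?']
```

This is only an illustration of the data format; the benchmark script does its own sampling and tokenization internally.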