No commit message

091528b0 · laibao · ea58ee75 · 091528b0
Commit 091528b0 authored Oct 16, 2024 by laibao

Showing with 5 additions and 42 deletions

README.md README.md +5 -42

--- a/README.md
+++ b/README.md
@@ -98,7 +98,7 @@ python examples/llava_example.py
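 To make sure the source code runs correctly, the following adjustments are also required:
-* **Removed the AWS CLI download logic**:
+* **Remove the AWS CLI download logic**:
 * **Remove part of the dependency on the `subprocess` and `os` modules**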
 ### result
@@ -109,7 +109,6 @@ python examples/llava_example.py
    images:
 <div align="center">
    <img src="./doc/images.png" width="300" height="200"/>
 </div>
@@ -120,52 +119,16 @@ python examples/llava_example.py
    output:               The image features a close-up view of a stop sign on a city street
-```bash
-python benchmarks/benchmark_throughput.py --num-prompts 1 --input-len 32 --output-len 128 --model Qwen/Qwen1.5-7B-Chat -tp 1 --trust-remote-code --enforce-eager --dtype float16
-```
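-Here `--num-prompts` is the batch size, `--input-len` is the input sequence length, `--output-len` is the number of output tokens, `--model` is the model path, `-tp` is the number of GPUs to use, and `--dtype float16` sets the inference data type; if the model weights are bfloat16, switch to float16 for inference. Specifying `--output-len 1` measures first-token latency. `-q gptq` runs inference with a GPTQ-quantized model.
-2. Using a dataset
-Download the dataset: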
-```bash
-wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
-```
-```bash
-python benchmarks/benchmark_throughput.py --num-prompts 1 --model Qwen/Qwen1.5-7B-Chat --dataset ShareGPT_V3_unfiltered_cleaned_split.json -tp 1 --trust-remote-code --enforce-eager --dtype float16
-```
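-Here `--num-prompts` is the batch size, `--model` is the model path, `--dataset` is the dataset to use, `-tp` is the number of GPUs to use, and `--dtype float16` sets the inference data type; if the model weights are bfloat16, switch to float16 for inference. `-q gptq` runs inference with a GPTQ-quantized model.
-### API service inference performance test
-1. Start the server: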
-```bash
-python -m vllm.entrypoints.openai.api_server  --model Qwen/Qwen1.5-7B-Chat  --dtype float16 --enforce-eager -tp 1 
-```
-2. Start the client:
-```bash
-python benchmarks/benchmark_serving.py --model Qwen/Qwen1.5-7B-Chat --dataset ShareGPT_V3_unfiltered_cleaned_split.json  --num-prompts 1 --trust-remote-code
-```
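-The parameters are the same as in the dataset-based offline batch inference benchmark; for details, see [benchmarks/benchmark_serving.py](benchmarks/benchmark_serving.py)
 ### OpenAI-compatible service
 Start the service: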
 ```bash
-python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen1.5-7B-Chat --enforce-eager --dtype float16 --trust-remote-code
+python -m vllm.entrypoints.openai.api_server --model /llava/llava-1.5-7b-hf --image-input-type pixel_values --image-token-id 32000 --image-input-shape 1,3,336,336 --image-feature-size 576 --chat-template template_llava.jinja
 ```
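-Here `--model` is the path of the model to load and `--dtype` is the data type (float16). By default the chat template predefined in the tokenizer is used; `--chat-template` supplies a new template that overrides the default. `-q gptq` runs inference with a GPTQ-quantized model, and `-q awq` runs inference with an AWQ-quantized model.
+Here `--model` is the path of the model to load, `--image-input-type pixel_values` sets the image input type to pixel_values, `--image-token-id` specifies the special token ID that marks image input, `--image-input-shape` sets the shape of the image input, `--image-feature-size` specifies the size of the image features, and `--chat-template` supplies a new template that overrides the default.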
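The values `1,3,336,336` and `576` in the new command appear to be related: LLaVA-1.5 uses a CLIP ViT-L/14 vision encoder at 336 px, so a 336×336 image splits into 24×24 = 576 patches. The sketch below is not part of the README or of vLLM; it is a hypothetical helper (the function name and the default `patch_size=14` are assumptions) that checks whether an `--image-input-shape` value is consistent with an `--image-feature-size` value for a ViT-style encoder.

```python
def expected_image_feature_size(image_input_shape: str, patch_size: int = 14) -> int:
    """Return the number of patch tokens a ViT-style encoder would produce.

    image_input_shape: comma-separated "batch,channels,height,width",
                       in the same form passed to --image-input-shape.
    patch_size:        assumed square patch edge in pixels (14 for CLIP ViT-L/14).
    """
    parts = [int(x) for x in image_input_shape.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated integers: batch,channels,height,width")
    _batch, _channels, height, width = parts
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be multiples of the patch size")
    # One feature vector per non-overlapping patch.
    return (height // patch_size) * (width // patch_size)


if __name__ == "__main__":
    # Values taken from the command in the diff above.
    print(expected_image_feature_size("1,3,336,336"))  # 576
```

If the computed value differs from the one passed to `--image-feature-size`, one of the two flags probably does not match the vision encoder in use.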
 List the models: