update read me

545d4982 · huangwb · 7f4f25e3 · 545d4982
Commit 545d4982 authored Jun 03, 2024 by huangwb

Showing with 5 additions and 4 deletions

README.md +5 -4

--- a/README.md
+++ b/README.md
@@ -57,7 +57,7 @@ docker run -it --name llama_tgi --privileged --shm-size=64G  --device=/dev/kfd -
 ### Deploy TGI
-1. Start the TGI server
+#### 1. Start the TGI service
 ```
 HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/to/Llama-2-7b-chat-hf --port 3001
 ```
@@ -65,7 +65,7 @@ HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/
 ```
 text-generation-launcher --help
 ```
-2. Verify the service
+#### 2. Send requests to the service
 Using curl:
 ```
@@ -74,7 +74,7 @@ curl 127.0.0.1:3001/generate \
    -d '{"inputs":"What is deep learning?","parameters":{"max_new_tokens":100,"temperature":0.7}}' \
    -H 'Content-Type: application/json'
 ```
-Calling from Python
+Calling from Python:
 ```
 import requests
@@ -94,11 +94,12 @@ print(response.json())
 # {'generated_text': '\n\nDeep Learning is a subset of Machine Learning that is concerned with the development of algorithms that can'}
 ```
 For more on the API, see [https://huggingface.github.io/text-generation-inference](https://huggingface.github.io/text-generation-inference)
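The diff elides most of the README's Python example, so here is a minimal sketch of the same `/generate` call, mirroring the curl command shown earlier. It assumes a TGI server is listening on `127.0.0.1:3001` as in the README, and it uses the standard-library `urllib` rather than the `requests` package the README imports, so it has no third-party dependency. The names `TGI_URL`, `build_generate_request`, and `generate` are hypothetical helpers for illustration, not part of TGI or of this commit.

```python
import json
import urllib.request

# Assumed endpoint, taken from the curl example in this README diff.
TGI_URL = "http://127.0.0.1:3001/generate"


def build_generate_request(prompt, max_new_tokens=100, temperature=0.7, url=TGI_URL):
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=60, **kwargs):
    """Send the request to a running TGI server and return the parsed JSON."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `generate("What is deep learning?")` should return a dictionary with a `generated_text` key, matching the sample response in the README. Keeping request construction separate from sending makes the payload easy to inspect without a live server.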
-### TGI benchmar test
+#### 3. TGI benchmark
 example:
 ```
 text-generation-benchmark -s 32 -d 128 --runs 10 --tokenizer-name /path/to/Llama-2-7b-chat-hf
 ```
+Note: the TGI service must be running before you use TGI benchmark. In addition, `--tokenizer-name` must match the one used by the service.
 To see more parameters, run:
 ```
 text-generation-benchmark --help