Commit 545d4982 authored by huangwb

update read me

parent 7f4f25e3
......@@ -57,7 +57,7 @@ docker run -it --name llama_tgi --privileged --shm-size=64G --device=/dev/kfd -
### Deploy TGI
-1. Start the TGI service
+#### 1. Start the TGI service
```
HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/to/Llama-2-7b-chat-hf --port 3001
```
......@@ -65,7 +65,7 @@ HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/
```
text-generation-launcher --help
```
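Model loading can take a while after the launcher starts, so it may help to wait for the server before sending requests. The sketch below polls a health endpoint until it answers. It assumes the service listens on port 3001 as in the launch command above and that the server exposes `GET /health`; the `wait_until_ready` helper and its `probe` parameter are hypothetical names introduced for illustration.

```python
import time
import urllib.request


def wait_until_ready(base_url, timeout_s=300.0, interval_s=2.0, probe=None):
    """Poll `<base_url>/health` until it answers 200 or the timeout expires.

    `probe` is an optional callable(url) -> bool, mainly for testing.
    Assumption: the TGI server exposes a `/health` route.
    """
    def default_probe(url):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.status == 200
        except OSError:
            # Covers connection refused, timeouts, and HTTP error statuses.
            return False

    probe = probe or default_probe
    health_url = base_url.rstrip("/") + "/health"
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(health_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A typical call would be `wait_until_ready("http://127.0.0.1:3001")`, which returns `True` once the server responds and `False` if the timeout passes first.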
-2. Verify the service
+#### 2. Send requests to the service
Using curl:
```
......@@ -74,7 +74,7 @@ curl 127.0.0.1:3001/generate \
-d '{"inputs":"What is deep learning?","parameters":{"max_new_tokens":100,"temperature":0.7}}' \
-H 'Content-Type: application/json'
```
-Python call
+Calling from Python:
```
import requests
......@@ -94,11 +94,12 @@ print(response.json())
# {'generated_text': '\n\nDeep Learning is a subset of Machine Learning that is concerned with the development of algorithms that can'}
```
For more API details, see [https://huggingface.github.io/text-generation-inference](https://huggingface.github.io/text-generation-inference)
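Because the diff elides the middle of the Python example, here is a minimal self-contained sketch of the same request. It reuses the URL, payload, and header from the curl command and reads the `generated_text` field shown in the sample output. It swaps in the standard-library `urllib` for `requests` so that it has no third-party dependency; `build_generate_request` and `generate` are hypothetical helper names, not part of TGI.

```python
import json
import urllib.request


def build_generate_request(prompt, max_new_tokens=100, temperature=0.7,
                           base_url="http://127.0.0.1:3001"):
    """Build a POST request for the /generate route (defaults taken from the curl example)."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": temperature},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text; requires a running service."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["generated_text"]
```

With the service running, `print(generate("What is deep learning?"))` would print the model's completion.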
-### TGI benchmar test
+#### 3. TGI benchmark
Example:
```
text-generation-benchmark -s 32 -d 128 --runs 10 --tokenizer-name /path/to/Llama-2-7b-chat-hf
```
Note: the TGI service must be running before you can use TGI benchmark. In addition, `--tokenizer-name` must match the model used by the service.
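One way to keep the tokenizer consistent is to build the benchmark command from the same model path that was passed to the launcher. The sketch below is illustrative only: it uses just the flags shown in the example above, and the parameter names `sequence_length` and `decode_length` reflect an assumed reading of `-s` and `-d`. `benchmark_command` and `run_benchmark` are hypothetical helpers.

```python
import shlex
import subprocess

# Assumption: the same path that was given to text-generation-launcher via --model-id.
MODEL_PATH = "/path/to/Llama-2-7b-chat-hf"


def benchmark_command(tokenizer_name, sequence_length=32, decode_length=128, runs=10):
    """Return the benchmark command as an argument list (flags copied from the example)."""
    return [
        "text-generation-benchmark",
        "-s", str(sequence_length),
        "-d", str(decode_length),
        "--runs", str(runs),
        "--tokenizer-name", tokenizer_name,
    ]


def run_benchmark(tokenizer_name=MODEL_PATH, **kwargs):
    """Print and run the benchmark; the TGI service must already be running."""
    command = benchmark_command(tokenizer_name, **kwargs)
    print(shlex.join(command))
    return subprocess.run(command, check=True)
```

Calling `run_benchmark()` with no arguments prints and runs the same command as the example above.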
More parameters can be listed as follows:
```
text-generation-benchmark --help
......