Commit 545d4982 authored by huangwb

update read me

parent 7f4f25e3
...@@ -57,7 +57,7 @@ docker run -it --name llama_tgi --privileged --shm-size=64G --device=/dev/kfd - ...@@ -57,7 +57,7 @@ docker run -it --name llama_tgi --privileged --shm-size=64G --device=/dev/kfd -
### Deploy TGI ### Deploy TGI
1. Start the TGI service #### 1. Start the TGI service
``` ```
HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/to/Llama-2-7b-chat-hf --port 3001 HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/to/Llama-2-7b-chat-hf --port 3001
``` ```
...@@ -65,7 +65,7 @@ HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/ ...@@ -65,7 +65,7 @@ HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/
``` ```
text-generation-launcher --help text-generation-launcher --help
``` ```
2. Verify the service #### 2. Send requests to the service
Using curl: Using curl:
``` ```
...@@ -74,7 +74,7 @@ curl 127.0.0.1:3001/generate \ ...@@ -74,7 +74,7 @@ curl 127.0.0.1:3001/generate \
-d '{"inputs":"What is deep learning?","parameters":{"max_new_tokens":100,"temperature":0.7}}' \ -d '{"inputs":"What is deep learning?","parameters":{"max_new_tokens":100,"temperature":0.7}}' \
-H 'Content-Type: application/json' -H 'Content-Type: application/json'
``` ```
Python call Calling the service from Python:
``` ```
import requests import requests
...@@ -94,11 +94,12 @@ print(response.json()) ...@@ -94,11 +94,12 @@ print(response.json())
# {'generated_text': '\n\nDeep Learning is a subset of Machine Learning that is concerned with the development of algorithms that can'} # {'generated_text': '\n\nDeep Learning is a subset of Machine Learning that is concerned with the development of algorithms that can'}
``` ```
For more details on the API, see [https://huggingface.github.io/text-generation-inference](https://huggingface.github.io/text-generation-inference) For more details on the API, see [https://huggingface.github.io/text-generation-inference](https://huggingface.github.io/text-generation-inference)
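The curl request and the Python call above send the same JSON body to `/generate`. As a minimal sketch of that request, the helper below builds the body and posts it using only the Python standard library. The function names `build_generate_payload` and `generate` are hypothetical, and the base URL assumes the service was started on port 3001 as in the launch command.

```python
import json
import urllib.request


def build_generate_payload(prompt, max_new_tokens=100, temperature=0.7):
    """Build the JSON body used by the curl example (hypothetical helper)."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def generate(prompt, base_url="http://127.0.0.1:3001", timeout=60.0, **parameters):
    """POST the prompt to <base_url>/generate and return the generated text.

    Assumes a TGI service is already listening at base_url and that the
    response is a JSON object with a 'generated_text' field, as in the
    sample output shown in the README.
    """
    payload = build_generate_payload(prompt, **parameters)
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["generated_text"]
```

With the service running, `generate("What is deep learning?", max_new_tokens=100)` would return the text that the curl example prints under `generated_text`.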
### TGI benchmar test #### 3. TGI benchmark
Example: Example:
``` ```
text-generation-benchmark -s 32 -d 128 --runs 10 --tokenizer-name /path/to/Llama-2-7b-chat-hf text-generation-benchmark -s 32 -d 128 --runs 10 --tokenizer-name /path/to/Llama-2-7b-chat-hf
``` ```
Note: the TGI service must be running before the TGI benchmark can be used. In addition, `--tokenizer-name` must match the model path used by the service.
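Because the benchmark depends on a running service, a small wrapper can check for it first. The sketch below is one possible approach, not part of TGI: it only tests whether something accepts TCP connections on the service port, then runs the benchmark command from the example. The names `service_is_listening`, `benchmark_command`, and `run_benchmark` are hypothetical, the default port 3001 is taken from the launch command, and the values for `-s` and `-d` are copied from the example as-is.

```python
import socket
import subprocess


def service_is_listening(host="127.0.0.1", port=3001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    This only shows that something is listening on the port; it does not
    prove that the listener is a TGI service.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def benchmark_command(tokenizer_name, s=32, d=128, runs=10):
    """Build the benchmark command line shown in the README example."""
    return [
        "text-generation-benchmark",
        "-s", str(s),
        "-d", str(d),
        "--runs", str(runs),
        "--tokenizer-name", tokenizer_name,
    ]


def run_benchmark(tokenizer_name, host="127.0.0.1", port=3001):
    """Run the benchmark only if the service port is reachable."""
    if not service_is_listening(host, port):
        raise RuntimeError(
            f"Nothing is listening on {host}:{port}; start the TGI service first."
        )
    # tokenizer_name should be the same model path that was passed to
    # text-generation-launcher via --model-id.
    return subprocess.run(benchmark_command(tokenizer_name), check=True)
```

Calling `run_benchmark("/path/to/Llama-2-7b-chat-hf")` would then either start the benchmark or fail early with a clear message when the service is not up.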
To see more options, run: To see more options, run:
``` ```
text-generation-benchmark --help text-generation-benchmark --help
......