Improve README

7ccc04b8 · zhouxiang · eafad7d3 · 7ccc04b8
Commit 7ccc04b8 authored Jan 31, 2024 by zhouxiang

Showing with 32 additions and 1 deletion

README.md +32 -1

--- a/README.md
+++ b/README.md
@@ -41,7 +41,6 @@ docker run -it --name qwen --shm-size=1024G  --device=/dev/kfd --device=/dev/dri
 ### Build and install from source
 ```
 # If you use an image from 光源, lmdeploy is already installed in it, so you can skip building from source
-pip uninstall lmdeploy
 # Fetch the source, then build and install
 git clone http://developer.hpccube.com/codes/modelzoo/Qwen_lmdeploy.git
 cd qwen_lmdeploy
@@ -124,6 +123,38 @@ lmdeploy serve gradio --model_path_or_server ./workspace_qwen72b --server_name {
 Enter {ip}:{port} in a browser to start chatting
 ```
+### Running an instance with api-server
+Start the server:
+```shell
+# --instance_num: number of turbomind inference instances; roughly the maximum number of concurrent requests supported
+# --tp: number of GPUs used for tensor parallelism
+lmdeploy serve api_server ./workspace_qwen72b --server_name ${server_ip} --server_port ${server_port} --tp 8
+```
+Open `http://{server_ip}:{server_port}` in a browser to reach the Swagger UI and browse the details of the RESTful API.
+You can also talk to the server from the command line (run this in a new terminal):
+```shell
+# restful_api_url is the URL served by api_server, i.e. the http://{server_ip}:{server_port} used when starting the server above
+lmdeploy serve api_client restful_api_url
+```
+Alternatively, start gradio and chat with the service in the web UI's chat box:
+```shell
+# restful_api_url is the URL served by api_server, e.g. http://localhost:23333
+# server_ip and server_port here are the address at which the gradio UI is served
+# Example: lmdeploy serve gradio http://localhost:23333 --server_name localhost --server_port 6006 --restful_api True
+lmdeploy serve gradio restful_api_url --server_name ${server_ip} --server_port ${server_port} --restful_api True
+```
+**Make sure '{server_ip}:{server_port}' is reachable from an external browser.**
+For a detailed introduction to the RESTful API, see [this document](https://developer.hpccube.com/codes/aicomponent/lmdeploy/-/blob/dtk23.10-v0.1.0/docs/zh_cn/restful_api.md).
 ## result
 ![qwen inference](docs/qwen推理.gif)
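
Note on the `restful_api_url` placeholder used in the added section: it is simply the api_server address composed from the host and port. A minimal sketch of that composition follows. The helper `make_restful_api_url` is hypothetical and not part of lmdeploy, and the host and port values are example assumptions (23333 is the port that appears in the README's gradio example).

```shell
# Hypothetical helper (not part of lmdeploy): compose the api_server URL
# from a host and a port.
make_restful_api_url() {
    printf 'http://%s:%s' "$1" "$2"
}

# Example values only; substitute the address your api_server listens on.
server_ip="localhost"
server_port="23333"

restful_api_url="$(make_restful_api_url "$server_ip" "$server_port")"
echo "restful_api_url=${restful_api_url}"

# With the server running, the README's client commands would then read:
#   lmdeploy serve api_client "${restful_api_url}"
#   lmdeploy serve gradio "${restful_api_url}" --server_name "${server_ip}" --server_port 6006 --restful_api True
```

The client commands are left as comments so the snippet can be run without a live server. Note that the gradio `--server_port` must differ from the api_server port when both run on the same host.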