[Doc] Update restful api doc (#662)

* update restful_api.md
* add a hint
* repeat 3 time

Commit c02e281f authored Nov 19, 2023 by AllentDan; committed by GitHub Nov 19, 2023
Showing with 17 additions and 7 deletions

docs/en/restful_api.md +8 -3

docs/zh_cn/restful_api.md +6 -4

lmdeploy/serve/openai/api_server.py +3 -0

--- a/docs/en/restful_api.md
+++ b/docs/en/restful_api.md
@@ -2,11 +2,16 @@

 ### Launch Service

+The user can open the http url print by the following command in a browser.
+
+- **Please check the http url for the detailed api usage!!!**
+- **Please check the http url for the detailed api usage!!!**
+- **Please check the http url for the detailed api usage!!!**
+
 ```shell
 lmdeploy serve api_server ./workspace --server_name 0.0.0.0 --server_port ${server_port} --instance_num 32 --tp 1
 ```

-Then, the user can open the swagger UI: `http://{server_ip}:{server_port}` for the detailed api usage.
 We provide four restful api in total. Three of them are in OpenAI format.

 - /v1/chat/completions
@@ -145,8 +150,8 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port

 ### FAQ

-1. When user got `"finish_reason":"length"` which means the session is too long to be continued.
-   Please add `"renew_session": true` into the next request.
+1. When user got `"finish_reason":"length"`, it means the session is too long to be continued. The session length can be
+   modified by passing `--session_len` to api_server.

 2. When OOM appeared at the server side, please reduce the number of `instance_num` when lanching the service.


--- a/docs/zh_cn/restful_api.md
+++ b/docs/zh_cn/restful_api.md
@@ -2,13 +2,16 @@

 ### 启动服务

-运行脚本
+用户将下面命令输出的 http url 复制到浏览器打开，详细查看所有的 API 及其使用方法。
+请一定查看`http://{server_ip}:{server_port}`！！！
+请一定查看`http://{server_ip}:{server_port}`！！！
+请一定查看`http://{server_ip}:{server_port}`！！！
+重要的事情说三遍。

 ```shell
 lmdeploy serve api_server ./workspace 0.0.0.0 --server_port ${server_port} --instance_num 32 --tp 1
 ```

-然后用户可以打开 swagger UI: `http://{server_ip}:{server_port}` 详细查看所有的 API 及其使用方法。
 我们一共提供四个 restful api，其中三个仿照 OpenAI 的形式。

 - /v1/chat/completions
@@ -142,8 +145,7 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port

 ### FAQ

-1. 当返回结果结束原因为 `"finish_reason":"length"`，这表示回话长度超过最大值。
-   请添加 `"renew_session": true` 到下一次请求中。
+1. 当返回结果结束原因为 `"finish_reason":"length"`，这表示回话长度超过最大值。如需调整会话支持的最大长度，可以通过启动`api_server`时，设置`--session_len`参数大小。

 2. 当服务端显存 OOM 时，可以适当减小启动服务时的 `instance_num` 个数


--- a/lmdeploy/serve/openai/api_server.py
+++ b/lmdeploy/serve/openai/api_server.py
@@ -510,6 +510,9 @@ def main(model_path: str,
                                                 instance_num=instance_num,
                                                 tp=tp,
                                                 **kwargs)
+    for i in range(3):
+        print(f'HINT:    Please open \033[93m\033[1mhttp://{server_name}:'
+              f'{server_port}\033[0m in a browser for detailed api usage!!!')
    uvicorn.run(app=app, host=server_name, port=server_port, log_level='info')
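
The hint added to `api_server.py` relies on ANSI SGR escape sequences: `\033[93m` selects bright yellow, `\033[1m` selects bold, and `\033[0m` resets the attributes. The sketch below reproduces the same output as a standalone script so the formatting can be checked without starting the server. The helper names `format_hint` and `print_hint`, and the example port `23333`, are illustrative assumptions and are not part of this commit.

```python
# Standalone sketch of the startup hint from the patch above.
# `format_hint` and `print_hint` are hypothetical helpers, not lmdeploy APIs.

BRIGHT_YELLOW = '\033[93m'  # SGR 93: bright yellow foreground
BOLD = '\033[1m'            # SGR 1: bold
RESET = '\033[0m'           # SGR 0: reset all attributes


def format_hint(server_name: str, server_port: int) -> str:
    """Build one hint line with the URL highlighted in bold yellow."""
    url = f'http://{server_name}:{server_port}'
    return (f'HINT:    Please open {BRIGHT_YELLOW}{BOLD}{url}{RESET} '
            'in a browser for detailed api usage!!!')


def print_hint(server_name: str, server_port: int, repeat: int = 3) -> None:
    """Print the hint `repeat` times, as the patch does before uvicorn starts."""
    for _ in range(repeat):
        print(format_hint(server_name, server_port))


if __name__ == '__main__':
    # Example values only; the real values come from the CLI arguments.
    print_hint('0.0.0.0', 23333)
```

When the server is bound to `0.0.0.0`, the printed URL shows the wildcard bind address, so a reader on another machine would likely need to substitute the server's actual IP, which is what the docs mean by `http://{server_ip}:{server_port}`.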