Commit c02e281f authored by AllentDan, committed by GitHub

[Doc] Update restful api doc (#662)

* update restful_api.md

* add a hint

* repeat 3 time
parent 0fcc3034
......@@ -2,11 +2,16 @@
### Launch Service
The user can open the HTTP URL printed by the following command in a browser.
- **Please check the http url for the detailed api usage!!!**
- **Please check the http url for the detailed api usage!!!**
- **Please check the http url for the detailed api usage!!!**
```shell
lmdeploy serve api_server ./workspace --server_name 0.0.0.0 --server_port ${server_port} --instance_num 32 --tp 1
```
Then, the user can open the swagger UI: `http://{server_ip}:{server_port}` for the detailed api usage.
We provide four RESTful APIs in total. Three of them follow the OpenAI format.
- /v1/chat/completions
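
As a rough illustration of calling the OpenAI-style endpoint listed above, the sketch below only prepares a request and does not send it. The helper name `build_chat_request`, the placeholder model name, and the sample host and port are assumptions for illustration; the swagger UI remains the reference for the accepted fields.

```python
import json
import urllib.request


def build_chat_request(server_ip: str, server_port: int,
                       prompt: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /v1/chat/completions."""
    payload = {
        # Placeholder model name (assumption); check the swagger UI
        # for the values the server actually accepts.
        'model': 'placeholder-model',
        'messages': [{'role': 'user', 'content': prompt}],
    }
    return urllib.request.Request(
        url=f'http://{server_ip}:{server_port}/v1/chat/completions',
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


if __name__ == '__main__':
    # Sample host and port only; use the values passed to api_server.
    req = build_chat_request('127.0.0.1', 23333, 'Hello')
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing the client at a live server.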
......@@ -145,8 +150,8 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
### FAQ
1. If the response contains `"finish_reason":"length"`, the session is too long to be continued.
Please add `"renew_session": true` to the next request.
1. If the response contains `"finish_reason":"length"`, the session is too long to be continued. The session length can be
modified by passing `--session_len` to api_server.
2. If OOM occurs on the server side, please reduce `instance_num` when launching the service.
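
On the client side, the `length` case can be detected before the next turn is sent. This is a minimal sketch, assuming the response body follows the OpenAI chat-completion shape (`choices[i].finish_reason`); the helper name `hit_session_limit` is hypothetical and not part of lmdeploy.

```python
import json


def hit_session_limit(response_body: str) -> bool:
    """Return True if any choice stopped because the length limit was reached.

    Assumes an OpenAI-style body: {"choices": [{"finish_reason": ...}, ...]}.
    """
    data = json.loads(response_body)
    return any(choice.get('finish_reason') == 'length'
               for choice in data.get('choices', []))


if __name__ == '__main__':
    # Hand-written sample body, not captured from a real server.
    sample = '{"choices": [{"index": 0, "finish_reason": "length"}]}'
    if hit_session_limit(sample):
        print('Session too long: relaunch api_server with a larger --session_len.')
```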
......
......@@ -2,13 +2,16 @@
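### Launch Service
Run the script
Copy the HTTP URL printed by the following command into a browser to see all the APIs and their usage in detail.
Be sure to check `http://{server_ip}:{server_port}`!!!
Be sure to check `http://{server_ip}:{server_port}`!!!
Be sure to check `http://{server_ip}:{server_port}`!!!
Important things are worth saying three times.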
```shell
lmdeploy serve api_server ./workspace 0.0.0.0 --server_port ${server_port} --instance_num 32 --tp 1
```
Then the user can open the swagger UI: `http://{server_ip}:{server_port}` to see all the APIs and their usage in detail.
We provide four RESTful APIs in total, three of which follow the OpenAI format.
- /v1/chat/completions
......@@ -142,8 +145,7 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
### FAQ
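1. If the response ends with `"finish_reason":"length"`, the session length has exceeded the maximum.
Please add `"renew_session": true` to the next request.
1. If the response ends with `"finish_reason":"length"`, the session length has exceeded the maximum. To change the maximum supported session length, set the `--session_len` parameter when launching `api_server`.
2. If the server runs out of GPU memory (OOM), reduce `instance_num` when launching the service.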
......
......@@ -510,6 +510,9 @@ def main(model_path: str,
instance_num=instance_num,
tp=tp,
**kwargs)
    for i in range(3):
        print(f'HINT: Please open \033[93m\033[1mhttp://{server_name}:'
              f'{server_port}\033[0m in a browser for detailed api usage!!!')
    uvicorn.run(app=app, host=server_name, port=server_port, log_level='info')
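
The escape codes in the hint are standard ANSI SGR sequences: `\033[93m` selects a bright yellow foreground, `\033[1m` turns on bold, and `\033[0m` resets all attributes. The sketch below reproduces the same string as a standalone helper; the name `format_hint` and the sample host and port are illustrative and not part of the commit.

```python
# Illustrative helper; `format_hint` is a hypothetical name, not lmdeploy API.
BRIGHT_YELLOW = '\033[93m'
BOLD = '\033[1m'
RESET = '\033[0m'


def format_hint(server_name: str, server_port: int) -> str:
    """Build the hint line with the URL highlighted in bold bright yellow."""
    url = f'http://{server_name}:{server_port}'
    return (f'HINT: Please open {BRIGHT_YELLOW}{BOLD}{url}{RESET} '
            f'in a browser for detailed api usage!!!')


if __name__ == '__main__':
    # Sample values only; in the commit they come from the CLI arguments.
    for _ in range(3):
        print(format_hint('0.0.0.0', 23333))
```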
......