Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
OpenDAS
Lmdeploy
Commits
c02e281f
Unverified
Commit
c02e281f
authored
Nov 19, 2023
by
AllentDan
Committed by
GitHub
Nov 19, 2023
Browse files
[Doc] Update restful api doc (#662)
* update restful_api.md * add a hint * repeat 3 time
parent
0fcc3034
Changes
3
Hide whitespace changes
Inline
Side-by-side
Showing
3 changed files
with
17 additions
and
7 deletions
+17
-7
docs/en/restful_api.md
docs/en/restful_api.md
+8
-3
docs/zh_cn/restful_api.md
docs/zh_cn/restful_api.md
+6
-4
lmdeploy/serve/openai/api_server.py
lmdeploy/serve/openai/api_server.py
+3
-0
No files found.
docs/en/restful_api.md
View file @
c02e281f
...
...
@@ -2,11 +2,16 @@
### Launch Service
The user can open the http url print by the following command in a browser.
-
**Please check the http url for the detailed api usage!!!**
-
**Please check the http url for the detailed api usage!!!**
-
**Please check the http url for the detailed api usage!!!**
```
shell
lmdeploy serve api_server ./workspace
--server_name
0.0.0.0
--server_port
${
server_port
}
--instance_num
32
--tp
1
```
Then, the user can open the swagger UI:
`http://{server_ip}:{server_port}`
for the detailed api usage.
We provide four restful api in total. Three of them are in OpenAI format.
-
/v1/chat/completions
...
...
@@ -145,8 +150,8 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
### FAQ
1.
When user got
`"finish_reason":"length"`
which
means the session is too long to be continued.
Please add
`"renew_session": true`
into the next request
.
1.
When user got
`"finish_reason":"length"`
, it
means the session is too long to be continued.
The session length can be
modified by passing
`--session_len`
to api_server
.
2.
When OOM appeared at the server side, please reduce the number of
`instance_num`
when lanching the service.
...
...
docs/zh_cn/restful_api.md
View file @
c02e281f
...
...
@@ -2,13 +2,16 @@
### 启动服务
运行脚本
用户将下面命令输出的 http url 复制到浏览器打开,详细查看所有的 API 及其使用方法。
请一定查看
`http://{server_ip}:{server_port}`
!!!
请一定查看
`http://{server_ip}:{server_port}`
!!!
请一定查看
`http://{server_ip}:{server_port}`
!!!
重要的事情说三遍。
```
shell
lmdeploy serve api_server ./workspace 0.0.0.0
--server_port
${
server_port
}
--instance_num
32
--tp
1
```
然后用户可以打开 swagger UI:
`http://{server_ip}:{server_port}`
详细查看所有的 API 及其使用方法。
我们一共提供四个 restful api,其中三个仿照 OpenAI 的形式。
-
/v1/chat/completions
...
...
@@ -142,8 +145,7 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
### FAQ
1.
当返回结果结束原因为
`"finish_reason":"length"`
,这表示回话长度超过最大值。
请添加
`"renew_session": true`
到下一次请求中。
1.
当返回结果结束原因为
`"finish_reason":"length"`
,这表示回话长度超过最大值。如需调整会话支持的最大长度,可以通过启动
`api_server`
时,设置
`--session_len`
参数大小。
2.
当服务端显存 OOM 时,可以适当减小启动服务时的
`instance_num`
个数
...
...
lmdeploy/serve/openai/api_server.py
View file @
c02e281f
...
...
@@ -510,6 +510,9 @@ def main(model_path: str,
instance_num
=
instance_num
,
tp
=
tp
,
**
kwargs
)
for
i
in
range
(
3
):
print
(
f
'HINT: Please open
\033
[93m
\033
[1mhttp://
{
server_name
}
:'
f
'
{
server_port
}
\033
[0m in a browser for detailed api usage!!!'
)
uvicorn
.
run
(
app
=
app
,
host
=
server_name
,
port
=
server_port
,
log_level
=
'info'
)
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment