Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
OpenDAS
Lmdeploy
Commits
c02e281f
Unverified
Commit
c02e281f
authored
Nov 19, 2023
by
AllentDan
Committed by
GitHub
Nov 19, 2023
Browse files
[Doc] Update restful api doc (#662)
* update restful_api.md * add a hint * repeat 3 time
parent
0fcc3034
Changes
3
Hide whitespace changes
Inline
Side-by-side
Showing
3 changed files
with
17 additions
and
7 deletions
+17
-7
docs/en/restful_api.md
docs/en/restful_api.md
+8
-3
docs/zh_cn/restful_api.md
docs/zh_cn/restful_api.md
+6
-4
lmdeploy/serve/openai/api_server.py
lmdeploy/serve/openai/api_server.py
+3
-0
No files found.
docs/en/restful_api.md
View file @
c02e281f
...
@@ -2,11 +2,16 @@
...
@@ -2,11 +2,16 @@
### Launch Service
### Launch Service
The user can open the http url print by the following command in a browser.
-
**Please check the http url for the detailed api usage!!!**
-
**Please check the http url for the detailed api usage!!!**
-
**Please check the http url for the detailed api usage!!!**
```
shell
```
shell
lmdeploy serve api_server ./workspace
--server_name
0.0.0.0
--server_port
${
server_port
}
--instance_num
32
--tp
1
lmdeploy serve api_server ./workspace
--server_name
0.0.0.0
--server_port
${
server_port
}
--instance_num
32
--tp
1
```
```
Then, the user can open the swagger UI:
`http://{server_ip}:{server_port}`
for the detailed api usage.
We provide four restful api in total. Three of them are in OpenAI format.
We provide four restful api in total. Three of them are in OpenAI format.
-
/v1/chat/completions
-
/v1/chat/completions
...
@@ -145,8 +150,8 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
...
@@ -145,8 +150,8 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
### FAQ
### FAQ
1.
When user got
`"finish_reason":"length"`
which
means the session is too long to be continued.
1.
When user got
`"finish_reason":"length"`
, it
means the session is too long to be continued.
The session length can be
Please add
`"renew_session": true`
into the next request
.
modified by passing
`--session_len`
to api_server
.
2.
When OOM appeared at the server side, please reduce the number of
`instance_num`
when lanching the service.
2.
When OOM appeared at the server side, please reduce the number of
`instance_num`
when lanching the service.
...
...
docs/zh_cn/restful_api.md
View file @
c02e281f
...
@@ -2,13 +2,16 @@
...
@@ -2,13 +2,16 @@
### 启动服务
### 启动服务
运行脚本
用户将下面命令输出的 http url 复制到浏览器打开,详细查看所有的 API 及其使用方法。
请一定查看
`http://{server_ip}:{server_port}`
!!!
请一定查看
`http://{server_ip}:{server_port}`
!!!
请一定查看
`http://{server_ip}:{server_port}`
!!!
重要的事情说三遍。
```
shell
```
shell
lmdeploy serve api_server ./workspace 0.0.0.0
--server_port
${
server_port
}
--instance_num
32
--tp
1
lmdeploy serve api_server ./workspace 0.0.0.0
--server_port
${
server_port
}
--instance_num
32
--tp
1
```
```
然后用户可以打开 swagger UI:
`http://{server_ip}:{server_port}`
详细查看所有的 API 及其使用方法。
我们一共提供四个 restful api,其中三个仿照 OpenAI 的形式。
我们一共提供四个 restful api,其中三个仿照 OpenAI 的形式。
-
/v1/chat/completions
-
/v1/chat/completions
...
@@ -142,8 +145,7 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
...
@@ -142,8 +145,7 @@ lmdeploy serve gradio api_server_url --server_name ${gradio_ui_ip} --server_port
### FAQ
### FAQ
1.
当返回结果结束原因为
`"finish_reason":"length"`
,这表示回话长度超过最大值。
1.
当返回结果结束原因为
`"finish_reason":"length"`
,这表示回话长度超过最大值。如需调整会话支持的最大长度,可以通过启动
`api_server`
时,设置
`--session_len`
参数大小。
请添加
`"renew_session": true`
到下一次请求中。
2.
当服务端显存 OOM 时,可以适当减小启动服务时的
`instance_num`
个数
2.
当服务端显存 OOM 时,可以适当减小启动服务时的
`instance_num`
个数
...
...
lmdeploy/serve/openai/api_server.py
View file @
c02e281f
...
@@ -510,6 +510,9 @@ def main(model_path: str,
...
@@ -510,6 +510,9 @@ def main(model_path: str,
instance_num
=
instance_num
,
instance_num
=
instance_num
,
tp
=
tp
,
tp
=
tp
,
**
kwargs
)
**
kwargs
)
for
i
in
range
(
3
):
print
(
f
'HINT: Please open
\033
[93m
\033
[1mhttp://
{
server_name
}
:'
f
'
{
server_port
}
\033
[0m in a browser for detailed api usage!!!'
)
uvicorn
.
run
(
app
=
app
,
host
=
server_name
,
port
=
server_port
,
log_level
=
'info'
)
uvicorn
.
run
(
app
=
app
,
host
=
server_name
,
port
=
server_port
,
log_level
=
'info'
)
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment