ModelZoo / CodeLlama_lmdeploy · Commits

Commit 2c4a3749, authored Aug 28, 2024 by xuxzh1

update

parent 9bf1d303
Showing 3 changed files with 11 additions and 19 deletions.
.gitmodules  (+2, -2)
README.md    (+8, -16)
lmdeploy     (+1, -1)
.gitmodules

 [submodule "lmdeploy"]
 	path = lmdeploy
-	url = http://developer.hpccube.com/codes/aicomponent/lmdeploy.git
-	branch = dtk23.04-v0.0.13
+	url = https://developer.hpccube.com/codes/OpenDAS/lmdeploy
+	branch = dtk24.04-v0.2.6
\ No newline at end of file
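Since this commit repoints the `lmdeploy` submodule at a new URL and branch, existing clones will not pick up the change automatically. A minimal sketch of re-syncing a local checkout, using standard git commands (not part of this commit):

```shell
# Propagate the new URL from .gitmodules into the local git config
git submodule sync lmdeploy
# Fetch and check out the commit the superproject now pins
git submodule update --init --recursive lmdeploy
```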
README.md
@@ -26,13 +26,13 @@ docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dt
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:lmdeploy0.0.13_dtk23.04_torch1.13_py38
# <Host Path>: path on the host side
# <Container Path>: mapped path inside the container
-docker run -it --name codellama --shm-size=1024G --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> image.sourcefind.cn:5000/dcu/admin/base/custom:lmdeploy0.0.13_dtk23.04_torch1.13_py38 /bin/bash
+docker run -it --name codellama --shm-size=1024G --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> <Image ID> /bin/bash
# If you need to use dtk23.10, use the following base image
-docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:lmdeploy-dtk2310-torch1.13-py38
+docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
# <Host Path>: path on the host side
# <Container Path>: mapped path inside the container
-docker run -it --name codellama --shm-size=1024G -v /opt/hyhal:/opt/hyhal --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> image.sourcefind.cn:5000/dcu/admin/base/custom:lmdeploy-dtk23.10-torch1.13-py38 /bin/bash
+docker run -it --name codellama --shm-size=1024G -v /opt/hyhal:/opt/hyhal --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> <Image ID> /bin/bash
```
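The updated `docker run` lines substitute an `<Image ID>` placeholder for the full image tag. As a small usage note (standard docker, not part of this diff), the ID can be looked up after pulling:

```shell
# List local images; the IMAGE ID column gives the value to use for <Image ID>
docker images
```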
> [!NOTE]
@@ -89,14 +89,6 @@ cd .. && python3 setup.py install
| Instruction-tuned model | Y | Y(7B,13B), N(34B) | Y | N |
### Run
Based on the model/capability table above, download a model you are interested in, then run the following command to convert its weights into the format required by turbomind:
```shell
# Convert the model format; the converted model is written to the ./workspace directory
# --tp is the number of GPUs to use; it must be a power of two (2^n), and if tp is not 1 the same value must also be passed to the subsequent run commands for this model
lmdeploy convert codellama /path/of/codellama/model --tp 1
```
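Following the comment above, a hypothetical two-GPU setup would repeat the same `--tp` value on the later run command. This is a sketch of the README's own instruction; exact flag handling may vary across lmdeploy versions:

```shell
# Convert with tensor parallelism over 2 GPUs
lmdeploy convert codellama /path/of/codellama/model --tp 2
# Later run commands must carry the matching --tp value
lmdeploy chat turbomind ./workspace --tp 2
```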
Next, refer to the following sections to chat interactively with codellama from the console.
**Note**:
@@ -118,7 +110,7 @@ lmdeploy chat turbomind ./workspace --cap infilling
The input code block must contain `<FILL>`, for example:
-```
+```python
def remove_non_ascii(s: str) -> str:
    """ <FILL>
    return result
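For illustration only, a plausible infilled result (a hypothetical completion, not actual model output) might be:

```python
def remove_non_ascii(s: str) -> str:
    """ Remove non-ASCII characters from the input string. """
    result = "".join(c for c in s if ord(c) < 128)
    return result
```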
@@ -128,7 +120,7 @@ def remove_non_ascii(s: str) -> str:
### Chat
-```
+```bash
lmdeploy chat turbomind ./workspace --cap chat --sys-instruct "Provide answers in Python"
```
@@ -136,7 +128,7 @@ lmdeploy chat turbomind ./workspace --cap chat --sys-instruct "Provide answers i
### Python specialist
-```
+```bash
lmdeploy chat turbomind ./workspace --cap python
```
@@ -152,7 +144,7 @@ lmdeploy chat turbomind ./workspace --cap python
```shell
# --instance_num: number of turbomind inference instances; can be understood as the maximum supported concurrency
# --tp: number of GPUs used for tensor parallelism
-lmdeploy serve api_server ./workspace --server_name 0.0.0.0 --server_port ${server_port} --instance_num 32 --tp 1
+lmdeploy serve api_server ./workspace --server-name 0.0.0.0 --server-port ${server_port} --instance_num 32 --tp 1
```
Open `http://{server_ip}:{server_port}` to access the Swagger page, which documents the RESTful API in detail.
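As a quick usage example, the bundled interactive client shown in the next hunk can be attached to the running server, using the example URL from the README's own comments:

```shell
# Connect the interactive client to the api_server started above
lmdeploy serve api_client http://localhost:23333
```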
@@ -170,7 +162,7 @@ lmdeploy serve api_client restful_api_url
# restful_api_url is the address produced by api_server, e.g. http://localhost:23333
# server_ip and server_port are used to expose the gradio UI service
# Example: lmdeploy serve gradio http://localhost:23333 --server_name localhost --server_port 6006 --restful_api True
-lmdeploy serve gradio restful_api_url --server_name ${server_ip} --server_port ${server_port} --restful_api True
+lmdeploy serve gradio restful_api_url --server-name ${server_ip} --server-port ${server_port} --restful_api True
```
For a detailed introduction to the RESTful API, please refer to [this document](https://developer.hpccube.com/codes/aicomponent/lmdeploy/-/blob/dtk23.04-v0.0.13/docs/zh_cn/restful_api.md).
lmdeploy @ 858087a6

-Subproject commit e432dbb0e56caaf319b9c9d7b79eb8106852dc91
+Subproject commit 858087a625c1dc431ab8b174331dfc95210f6e3a