ModelZoo / CodeLlama_lmdeploy · Commit facfae87
authored Aug 31, 2024 by shantf
update README.md
parent 82e4fbb7

Showing 1 changed file with 26 additions and 32 deletions.

README.md (+26 −32)
@@ -22,28 +22,18 @@ Code Llama is a family of pretrained and fine-tuned generative text models, ranging in scale from 7 …
Docker images for inference are provided on SourceFind (光源) and can be pulled as follows:
```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10   # recommended
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:lmdeploy0.0.13_dtk23.04_torch1.13_py38

# <Host Path>: path on the host
# <Container Path>: mount path inside the container
docker run -it --name codellama --shm-size=1024G --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> <Image ID> /bin/bash

# If you need dtk23.10, use the following base image
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
# <Host Path>: path on the host
# <Container Path>: mount path inside the container
docker run -it --name codellama --shm-size=1024G -v /opt/hyhal:/opt/hyhal --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v <Host Path>:<Container Path> <Image ID> /bin/bash

docker run -it --name codellama --shm-size=1024G --device=/dev/kfd --device=/dev/dri/ --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --ulimit memlock=-1:-1 --ipc=host --network host --group-add video -v /opt/hyhal:/opt/hyhal:ro -v <Host Path>:<Container Path> <Image ID> /bin/bash

# step 1
cd lmdeploy
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
# step 2
source /opt/dtk/cuda/env.sh
```
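Before continuing, a minimal sanity check can help (optional sketch; it assumes the container was started with the device flags above and uses the PyTorch bundled in the base image):

```bash
# The DCU device nodes passed through by --device should be visible
ls /dev/kfd /dev/dri
# DTK/ROCm builds of PyTorch expose the device through the CUDA API,
# so this should print the torch version and True
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```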
> [!NOTE]
>
> With lmdeploy0.0.13_dtk23.04_torch1.13_py38, you may encounter `ImportError: libgemm_multiB_int4.so: cannot open shared object file: No such file or directory`.
>
> Fix:
>
> ```bash
> rm /usr/local/lib/python3.8/site-packages/_turbomind.cpython-38-x86_64-linux-gnu.so
> ```
## Inference
@@ -88,15 +78,7 @@ cd .. && python3 setup.py install
| Python fine-tuned models | Y | N | N | Y |
| Instruction fine-tuned models | Y | Y (7B, 13B), N (34B) | Y | N |
### Before running
```bash
# step 1
cd lmdeploy
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
# step 2
source /opt/dtk/cuda/env.sh
```
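As a quick check that the lmdeploy build from the previous section is usable (a minimal sketch; it relies only on the `lmdeploy` package and CLI already used throughout this README):

```bash
# The package should import without errors, and the CLI entry point should resolve
python -c "import lmdeploy"
lmdeploy --help
```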
### Running
@@ -111,12 +93,14 @@ source /opt/dtk/cuda/env.sh
```shell
lmdeploy chat turbomind ./workspace --cap completion
# ./workspace: model path
```
### Code infilling
```shell
lmdeploy chat turbomind ./workspace --cap infilling
# ./workspace: model path
```
The input code block must contain `<FILL>`, for example:
@@ -132,15 +116,15 @@ def remove_non_ascii(s: str) -> str:
### Chat
```bash
lmdeploy chat turbomind ./workspace --cap chat --sys-instruct "Provide answers in Python"
lmdeploy chat turbomind ./workspace --cap chat
# ./workspace: model path
```
You can replace the `--sys-instruct` instruction with any other programming language that codellama supports.
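For instance, the same command pointed at a different language (only the instruction string changes; the flag is the one documented above):

```bash
lmdeploy chat turbomind ./workspace --cap chat --sys-instruct "Provide answers in JavaScript"
```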
### Python specialist
```bash
lmdeploy chat turbomind ./workspace --cap python
# ./workspace: model path
```
Deploying the Python fine-tuned model is recommended here.
@@ -153,9 +137,8 @@ lmdeploy chat turbomind ./workspace --cap python
Start the server with:
```shell
# --instance_num: number of turbomind inference instances; effectively the maximum concurrency supported
# --tp: number of GPUs used for tensor parallelism
lmdeploy serve api_server ./workspace --server-name 0.0.0.0 --server-port ${server_port} --instance_num 32 --tp 1
lmdeploy serve api_server ./workspace --server-name 0.0.0.0 --server-port ${server_port} --tp 1
```
Open `http://{server_ip}:{server_port}` to access Swagger and view the details of the RESTful API.
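As a smoke test you can call the API from the command line. The exact routes and payload schema are listed on the Swagger page; the endpoint and fields below are assumptions for illustration, not the confirmed v0.0.13 API:

```bash
# Hypothetical request -- verify the real route and fields on the Swagger page first
curl -X POST "http://localhost:${server_port}/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "codellama", "messages": [{"role": "user", "content": "Write quicksort in Python"}]}'
```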
@@ -172,8 +155,19 @@ lmdeploy serve api_client restful_api_url
```shell
# restful_api_url is the address produced by api_server, e.g. http://localhost:23333
# server_ip and server_port are where the gradio UI is served
# example: lmdeploy serve gradio http://localhost:23333 --server_name localhost --server_port 6006 --restful_api True
lmdeploy serve gradio restful_api_url --server-name ${server_ip} --server-port ${server_port} --restful_api True
# example: lmdeploy serve gradio http://localhost:23333 --server-name localhost --server-port 6006
# --server_port must be different from the port in restful_api_url
lmdeploy serve gradio restful_api_url --server-name ${server_ip} --server-port ${server_port}
```
Note: if the web page does not open, apply the following workaround:
download the frpc_linux_amd64 file from https://github.com/bumblebeeMMa/DownLoad_frpc_linux_amd64;
rename it locally to frpc_linux_amd64_v0.2;
then place it under the gradio installation path. The gradio installation path can be found by entering the following in a terminal:
```bash
python -c "import gradio; print(gradio)"   # the printed module repr shows where gradio is installed
```
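Putting the workaround together, a sketch under the assumption that gradio expects the binary directly in its package directory (the exact destination subpath is not confirmed here):

```bash
# Locate the installed gradio package and drop the renamed frpc binary into it
GRADIO_DIR=$(python -c "import gradio, os; print(os.path.dirname(gradio.__file__))")
cp frpc_linux_amd64_v0.2 "${GRADIO_DIR}/"
chmod +x "${GRADIO_DIR}/frpc_linux_amd64_v0.2"
```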
For a detailed introduction to the RESTful API, please refer to [this document](https://developer.hpccube.com/codes/aicomponent/lmdeploy/-/blob/dtk23.04-v0.0.13/docs/zh_cn/restful_api.md).