update README.md

Commit 061dc9a2 authored Sep 22, 2024 by shantf

Showing with 11 additions and 9 deletions

Dockerfile +2 -2

README.md +9 -7

--- a/Dockerfile
+++ b/Dockerfile
-FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py38-latest
+FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
 COPY requirements.txt requirements.txt
-RUN source /opt/dtk-23.04/env.sh
+RUN source /opt/dtk/env.sh
 RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone 
 ENV LANG C.UTF-8
 RUN pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
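
Review note on the `RUN source /opt/dtk/env.sh` line: each Dockerfile `RUN` instruction executes in its own shell process, so variables exported by sourcing a script in one `RUN` are not visible to later instructions. In addition, `RUN` uses `/bin/sh -c` by default, and on Ubuntu images `/bin/sh` is typically dash, which has no `source` builtin (the POSIX spelling is `.`), unless the base image overrides this. The sketch below simulates two `RUN` steps with separate `sh -c` calls; `DEMO_DTK_HOME` and the temporary env file are hypothetical stand-ins, not the contents of the real `/opt/dtk/env.sh`.

```shell
#!/bin/sh
# Make sure the demo variable is not inherited from the caller.
unset DEMO_DTK_HOME

# Hypothetical env script standing in for /opt/dtk/env.sh.
tmp_env=$(mktemp)
echo 'export DEMO_DTK_HOME=/opt/dtk' > "$tmp_env"

# Step 1: source the script in one process, which then exits
# (comparable to a standalone `RUN . env.sh`).
sh -c ". '$tmp_env'"

# Step 2: a later, separate process does not see the variable.
separate=$(sh -c 'echo "${DEMO_DTK_HOME:-unset}"')

# Step 3: sourcing and using the variable in the same process works
# (comparable to `RUN . env.sh && <command>`).
combined=$(sh -c ". '$tmp_env' && echo \"\${DEMO_DTK_HOME:-unset}\"")

echo "separate=$separate"
echo "combined=$combined"
rm -f "$tmp_env"
```

If the DTK environment is needed at build time, one option is to source and use it within a single instruction, for example `RUN . /opt/dtk/env.sh && <command>`. Whether the base image already exports these variables is not shown in this diff.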
--- a/README.md
+++ b/README.md
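@@ -32,12 +32,12 @@ ChatGLM2-6B is the second-generation version of the open-source Chinese-English bilingual chat model ChatGLM-6B. Ch
 ### Docker (Option 1)
 Running with Docker is recommended. Pull the provided Docker image: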
 ```
-docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py38-latest
+docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
 ```

 Start and enter the container, then install the dependencies that the image does not include:
 ```
-docker run -dit --network=host --name=chatglm --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py38-latest
+docker run -dit --network=host --name=chatglm --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 -v /opt/hyhal:/opt/hyhal:ro --ulimit memlock=-1:-1 image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
 docker exec -it chatglm /bin/bash
 pip install transformers==4.28.0 -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
 pip install accelerate sentencepiece mdtex2html gradio rouge_chinese nltk jieba datasets==2.20.0 protobuf peft==0.5.0 pydantic==1.10.9 -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
@@ -45,20 +45,20 @@ pip install accelerate sentencepiece mdtex2html gradio rouge_chinese nltk jieba
 ### Dockerfile (Option 2)
 ```
 docker build -t chatglm2:latest .
-docker run -dit --network=host --name=chatglm2 --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 chatglm2:latest
+docker run -dit --network=host --name=chatglm2 --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /opt/hyhal:/opt/hyhal:ro --ulimit stack=-1:-1 --ulimit memlock=-1:-1 chatglm2:latest
 docker exec -it chatglm2 /bin/bash
 ```

 ### Conda (Option 3)
 1. Create a conda virtual environment:
 ```
-conda create -n chatglm python=3.8
+conda create -n chatglm python=3.10
 ```

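 2. The toolkits and deep learning libraries that this project needs for DCU cards can all be downloaded and installed from the [光合](https://developer.hpccube.com/tool/) developer community.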
- [DTK 23.04](https://cancon.hpccube.com:65024/1/main/DTK-23.04.1)
- [Pytorch 1.13.1](https://cancon.hpccube.com:65024/4/main/pytorch/dtk23.04)
- [Deepspeed 0.9.2](https://cancon.hpccube.com:65024/4/main/deepspeed/dtk23.04)
+- DTK 24.04.1
+- Pytorch 2.1.0
+- Deepspeed 0.12.3

    Tips: the versions of the DTK driver, Python, DeepSpeed, and the other tools above must match one another exactly.

@@ -97,6 +97,8 @@ SCNet fast download link:
    bash ptuning_train.sh
 ```
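 Note: configure the model path, dataset path, batch size, learning rate, and other parameters in the script to suit your needs.
+To change which cards are used, pass --include=localhost to deepspeed in the ptuning_train.sh script instead of setting CUDA_VISIBLE_DEVICES.
+For example, to use cards 4, 5, 6, and 7, delete CUDA_VISIBLE_DEVICES=0,1,2,3 and add --include=localhost:4,5,6,7 after deepspeed.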
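
A minimal sketch of the edit described in the note above. The real contents of `ptuning_train.sh` are not shown in this diff, so the script name `main.py` and the shape of the launch line are assumptions; only `CUDA_VISIBLE_DEVICES=0,1,2,3` and `--include=localhost:4,5,6,7` come from the README text. The command is printed rather than executed.

```shell
#!/bin/sh
# Cards to train on (example values from the README note).
gpu_ids="4,5,6,7"

# Before (assumed shape of the original launch line):
#   CUDA_VISIBLE_DEVICES=0,1,2,3 deepspeed main.py
# After: drop CUDA_VISIBLE_DEVICES and select the cards via --include.
launcher="deepspeed --include=localhost:${gpu_ids}"

# main.py is a hypothetical stand-in for the training entry point.
echo "${launcher} main.py"
```

Keeping the card list in one variable makes it easy to switch cards without touching the rest of the launch line.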

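 #### Inference evaluation
 P-tuning v2 training saves only the PrefixEncoder parameters, so inference must load both the original ChatGLM-6B model and the PrefixEncoder weights. Run the following command directly: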