run K100_AI

Commit ed085104 authored Oct 21, 2024 by wangsen

Showing with 48 additions and 4 deletions

.README.md.swp .README.md.swp +0 -0

README.md README.md +38 -4

dockerfile dockerfile +6 -0

requirements.txt requirements.txt +4 -0

--- a/.README.md.swp
+++ b/.README.md.swp
--- a/README.md
+++ b/README.md
@@ -23,20 +23,54 @@ VisualGLM-6B is trained with the SwissArmyTransformer (sat for short) library, which supports T
 ## Environment Setup
 ### Docker (Method 1)
 ```
-docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:visualglm-6b_pytorch-latest 
-docker run -it -v /path/your_data/:/path/your_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
+docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
+docker run -it -v /opt/hyhal:/opt/hyhal:ro --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
 ```
 ### Dockerfile (Method 2)
 This section shows how to use the dockerfile, which is in the docker folder.
 ```
 cd docker
-docker build --no-cache -t xxx:latest .
-docker run -it -v /path/your_data/:/path/your_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
+docker build --no-cache -t visualglm:latest .
+docker run -dit --shm-size 80g --network=host --name=visualglm --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /opt/hyhal/:/opt/hyhal/:ro visualglm:latest /bin/bash
+docker exec -it visualglm  /bin/bash
 ```

+
+
+### Conda (Method 3)
+
+1. Create a conda virtual environment:
+
+```
+conda create -n visualglm  python=3.10
+conda activate visualglm 
+```
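+2. The toolkits and deep learning libraries this project needs for DCU accelerators can all be downloaded from the 光合 developer community and installed.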
+- [DTK 24.04.1](https://cancon.hpccube.com:65024/directlink/1/DTK-24.04.1/Ubuntu20.04.1/DTK-24.04.1-Ubuntu20.04.1-x86_64.tar.gz)
+- [PyTorch 2.1](https://cancon.hpccube.com:65024/directlink/4/pytorch/DAS1.2/torch-2.1.0+das.opt1.dtk24042-cp310-cp310-manylinux_2_28_x86_64.whl)
+- [deepspeed](https://download.sourcefind.cn:65024/directlink/4/deepspeed/DAS1.1/deepspeed-0.12.3+gita724046.abi1.dtk2404.torch2.1.0-cp310-cp310-manylinux_2_31_x86_64.whl)
+- [torchvision](https://download.sourcefind.cn:65024/directlink/4/vision/DAS1.1/torchvision-0.16.0+das1.1.git7d45932.abi1.dtk2404.torch2.1-cp310-cp310-manylinux_2_31_x86_64.whl)
+
+Tip: the versions of the DTK driver, torch, and the other tools listed above must correspond to each other exactly.
+
+
+3. Install the remaining dependencies from requirements.txt:
+```
+pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
+```
+
+
+
 ## Dataset
 None
+
+
+## Model Download
+Hugging Face model download link:
    
+- [visualglm6b](https://huggingface.co/THUDM/visualglm-6b)
+
+
 ## Inference

 ### Inference with Code

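Review note on the Conda section: the README's tip says the DTK, torch, and related package versions must line up exactly. As a reading aid, here is a minimal sketch that splits the wheel filenames linked in the README into their PEP 427 fields so the Python tag and the DTK marker in each local version can be compared side by side. The helper names `parse_wheel` and `dtk_marker` are hypothetical, and the `dtk\d+` pattern is an assumption about how these particular wheels encode the DTK release, not a documented convention.

```python
import re
from typing import NamedTuple, Optional


class WheelInfo(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel(filename: str) -> WheelInfo:
    """Split a wheel filename into its PEP 427 fields (an optional build tag is dropped)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[:-4].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename}")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelInfo(name, version, python_tag, abi_tag, platform_tag)


def dtk_marker(version: str) -> Optional[str]:
    """Return the 'dtkNNNN' fragment of a local version string, if any (assumed convention)."""
    match = re.search(r"dtk\d+", version)
    return match.group(0) if match else None


# Filenames copied from the download links in the README.
wheels = [
    "torch-2.1.0+das.opt1.dtk24042-cp310-cp310-manylinux_2_28_x86_64.whl",
    "deepspeed-0.12.3+gita724046.abi1.dtk2404.torch2.1.0-cp310-cp310-manylinux_2_31_x86_64.whl",
    "torchvision-0.16.0+das1.1.git7d45932.abi1.dtk2404.torch2.1-cp310-cp310-manylinux_2_31_x86_64.whl",
]

for wheel in wheels:
    info = parse_wheel(wheel)
    print(info.name, info.python_tag, dtk_marker(info.version))
```

All three wheels carry the `cp310` tag, which agrees with `python=3.10` in the conda command. The torch wheel's marker reads `dtk24042` while the other two read `dtk2404`; that may only be a naming difference between DAS releases, but it is worth checking against the installed DTK.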
--- a/dockerfile
+++ b/dockerfile
+FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
+COPY requirements.txt requirements.txt
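+RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone
+ENV LANG=C.UTF-8
+# Source the DTK environment in the same RUN as pip install; variables set in one RUN do not carry over to the next.
+RUN /bin/bash -c "source /opt/dtk-24.04.1/env.sh && pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com"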
\ No newline at end of file
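Review note on running the image: the `docker run` commands in the README pass `/dev/kfd` and `/dev/dri` into the container and mount `/opt/hyhal` read-only. A minimal sketch of a check that could be run inside the container to confirm those paths actually arrived; the function name `check_paths` is hypothetical, and the path list is simply taken from the README's flags rather than from any DTK documentation.

```python
import os
from typing import Dict, Iterable

# Paths taken from the docker run flags in the README; adjust if your setup differs.
EXPECTED_PATHS = ("/dev/kfd", "/dev/dri", "/opt/hyhal")


def check_paths(paths: Iterable[str] = EXPECTED_PATHS) -> Dict[str, bool]:
    """Map each path to whether it exists in the current environment."""
    return {path: os.path.exists(path) for path in paths}


if __name__ == "__main__":
    for path, present in check_paths().items():
        print(f"{path}: {'ok' if present else 'MISSING'}")
```

A missing entry usually points at a dropped `--device` or `-v` flag; it does not by itself prove that the driver inside the container works.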
--- a/requirements.txt
+++ b/requirements.txt
+transformers==4.33.0
+SwissArmyTransformer==0.4.4
+numpy==1.24.3
+pillow==10.4.0
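
Review note on requirements.txt: hand-edited pin lists can end up pinning the same package to two different versions, which pip rejects at install time. A minimal sketch of a pre-install check, assuming the file only contains simple `name==version` lines; the function name `conflicting_pins` and the sample data are hypothetical and purely illustrative.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches simple "name==version" lines only; extras, markers, and URLs are not handled.
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*(\S+)")


def conflicting_pins(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Return packages that are pinned to more than one distinct version."""
    seen: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        line = line.split("#", 1)[0]  # drop trailing comments
        match = PIN.match(line)
        if not match:
            continue
        # Normalise names as in PEP 503: case-insensitive, runs of - _ . are equivalent.
        name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
        version = match.group(2)
        if version not in seen[name]:
            seen[name].append(version)
    return {name: versions for name, versions in seen.items() if len(versions) > 1}


# Illustrative input, not the project's actual file.
sample = ["numpy==1.24.3", "Pillow==10.4.0", "pillow==10.4.0", "numpy==1.26.0"]
print(conflicting_pins(sample))  # {'numpy': ['1.24.3', '1.26.0']}
```

To check a real file, pass an open file handle, for example `conflicting_pins(open("requirements.txt", encoding="utf-8"))`.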