Commit ed085104 authored by wangsen's avatar wangsen
Browse files

run K100_AI

parent bc231b17
File added
......@@ -23,20 +23,54 @@ VisualGLM-6B 由 SwissArmyTransformer(简称sat) 库训练,这是一个支持T
## 环境配置
### Docker(方法一)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:visualglm-6b_pytorch-latest
docker run -it -v /path/your_data/:/path/your_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
docker run -it -v /opt/hyhal:/opt/hyhal:ro --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
```
### Dockerfile(方法二)
此处提供dockerfile的使用方法, dockerfile在docker文件夹中
```
cd docker
docker build --no-cache -t xxx:latest .
docker run -it -v /path/your_data/:/path/your_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
docker build --no-cache -t visualglm:latest .
docker run -dit --shm-size 80g --network=host --name=visualglm --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /opt/hyhal/:/opt/hyhal/:ro visualglm:latest /bin/bash
docker exec -it visualglm /bin/bash
```
Conda(方式三)
1.创建conda虚拟环境:
```
conda create -n visualglm python=3.10
conda activate visualglm
```
2.关于本项目DCU显卡所需的工具包、深度学习库等均可从光合开发者社区下载安装。
- [DTK 24.04.1](https://cancon.hpccube.com:65024/directlink/1/DTK-24.04.1/Ubuntu20.04.1/DTK-24.04.1-Ubuntu20.04.1-x86_64.tar.gz)
- [Pytorch 2.1](https://cancon.hpccube.com:65024/directlink/4/pytorch/DAS1.2/torch-2.1.0+das.opt1.dtk24042-cp310-cp310-manylinux_2_28_x86_64.whl)
- [deepspeed](https://download.sourcefind.cn:65024/directlink/4/deepspeed/DAS1.1/deepspeed-0.12.3+gita724046.abi1.dtk2404.torch2.1.0-cp310-cp310-manylinux_2_31_x86_64.whl)
- [torchvision](https://download.sourcefind.cn:65024/directlink/4/vision/DAS1.1/torchvision-0.16.0+das1.1.git7d45932.abi1.dtk2404.torch2.1-cp310-cp310-manylinux_2_31_x86_64.whl)
Tips:以上dtk驱动、torch等工具版本需要严格一一对应。
3. 其它依赖库参照requirements.txt安装:
```
pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
```
## 数据集
## 模型下载
Hugging Face模型下载地址:
- [visualglm6b]( https://huggingface.co/THUDM/visualglm-6b)
## 推理
### 代码推理
......
FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
COPY requirements.txt requirements.txt
RUN source /opt/dtk-24.04.1/env.sh
RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone
ENV LANG C.UTF-8
RUN pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
\ No newline at end of file
transformers==4.33.0
SwissArmyTransformer==0.4.4
numpy==1.24.3
pillow==10.4.0transformers==4.40.0
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment