Commit ee7bfade authored by chenzk

v1.1

parent e4575be9
Pipeline #510 canceled with stage
@@ -47,8 +47,8 @@ data
...
val
|
-n04286575
-n04596742
+n01440764
+n01824575
...
test
|
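The two entries swapped under `val` are WordNet synset IDs, the usual class-folder naming in ImageNet-style datasets. As a minimal sketch, assuming the `data/val/<synset>` layout shown in this hunk, the class folders could be checked for that naming before training. The helper name `find_bad_class_dirs` and the "`n` plus eight digits" pattern are assumptions made for illustration and are not part of the repository.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming convention: "n" followed by eight digits (WordNet synset ID).
SYNSET_RE = re.compile(r"n\d{8}")


def find_bad_class_dirs(split_dir):
    """Return sorted names of subdirectories that do not look like synset IDs."""
    return sorted(
        p.name
        for p in Path(split_dir).iterdir()
        if p.is_dir() and not SYNSET_RE.fullmatch(p.name)
    )


if __name__ == "__main__":
    # Build a throwaway val/ folder to demonstrate the check.
    with tempfile.TemporaryDirectory() as root:
        val = Path(root) / "val"
        for name in ("n01440764", "n01824575", "misc"):
            (val / name).mkdir(parents=True)
        print(find_bad_class_dirs(val))
```

An empty result means every class folder follows the assumed pattern; any name returned is worth inspecting by hand.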
@@ -87,13 +87,11 @@ export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/public/home/xxx/anaconda3/envs/megatr
```
**Single node, multiple cards** (request an online node separately first):
```
-cd examples
-sh dspvit_1node.sh
+sh examples/dspvit_1node.sh
```
**Single node, single card** (request an online node separately first):
```
-cd examples
-dspvit_1dcu.sh
+sh examples/dspvit_1dcu.sh
```
### 2. Training with mpirun
Comment out rank and world_size in [`arguments.py`](./megatron/arguments.py):
......
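For the mpirun step in the README hunk above: Open MPI exports each process's rank and the total process count as `OMPI_COMM_WORLD_RANK` and `OMPI_COMM_WORLD_SIZE`. The following is a minimal sketch, assuming an Open MPI launcher, of reading those values with a fallback to the `RANK` and `WORLD_SIZE` variables that `torchrun` sets. The helper `resolve_rank_and_world_size` is hypothetical, and the diff does not show how this repository's `arguments.py` actually obtains the values.

```python
import os


def resolve_rank_and_world_size(env=None):
    """Pick rank and world size from Open MPI variables, else RANK/WORLD_SIZE."""
    env = os.environ if env is None else env
    # Fall back to single-process defaults when no launcher variables are set.
    rank = env.get("OMPI_COMM_WORLD_RANK", env.get("RANK", "0"))
    world_size = env.get("OMPI_COMM_WORLD_SIZE", env.get("WORLD_SIZE", "1"))
    return int(rank), int(world_size)


if __name__ == "__main__":
    print(resolve_rank_and_world_size())
```

Passing a plain dictionary as `env` makes the lookup easy to exercise without launching any processes.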
FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-23.04-py38-latest
ENV DEBIAN_FRONTEND=noninteractive
# RUN yum update && yum install -y git cmake wget build-essential
RUN source /opt/dtk-23.04/env.sh
# Install pip dependencies
COPY requirements.txt requirements.txt
RUN pip3 install -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com -r requirements.txt
datasets
nltk
numpy==1.23.5
parameterized
pybind11
regex
six
tensorboard
transformers
# versions from HF transformers
black==21.4b0
isort>=5.5.4
ninja
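# Model code
-modelCode=342
+modelCode=360
# Model name
modelName=megatron-deepspeed-vit_pytorch
# Model description
modelDescription=Transformer-based image classification algorithm
# Application scenarios
-appScenario=inference,training,image classification
+appScenario=inference,training,image classification,manufacturing,environment,healthcare,meteorology
# Framework type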
frameType=PyTorch