ModelZoo / ResNet50_pytorch
Commit 59851047, authored Aug 12, 2024 by dcuai
Update README.md
Parent: ca1c52dd
Showing 1 changed file with 13 additions and 9 deletions
README.md (+13 -9)
...
...
@@ -24,7 +24,7 @@ ResNet50 uses multiple residual blocks with residual connections to address vanishing gradients or gradient...
```
Pull the image:
-docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10.1-py37-latest
+docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
Create and start the container:
docker run --shm-size 16g --network=host --name=resnet50_pytorch --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $PWD/resnet50-pytorch:/home/resnet50_pytorch -it <Your Image ID> bash
Install dependencies:
...
...
@@ -45,11 +45,11 @@ docker run --rm --shm-size 16g --network=host --name=resnet50_pytorch --privileg
https://developer.hpccube.com/tool/
```
-DTK driver:dtk22.10.1
-python:python3.7
-torch:1.10.0
-torchvision:0.10.0
-apex:0.1
+DTK driver:dtk24.04.1
+python:python3.10
+torch:2.1.0
+torchvision:0.16.0
+apex:1.1
```
`Tips: the versions of the DCU-related packages listed above (DTK, python, torch, etc.) must match one another exactly`
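
Because the tip above requires the package versions to match exactly, a quick check inside the container can catch a mismatched image early. The following is a minimal sketch, not part of the repository: the pins are copied from the updated environment list (torch 2.1.0, torchvision 0.16.0, apex 1.1), the helper names `find_mismatches` and `installed_versions` are hypothetical, and it assumes the packages are registered under the distribution names `torch`, `torchvision`, and `apex`. It does not check the DTK driver or the Python interpreter version.

```python
from importlib import metadata

# Version pins copied from the updated README environment list.
# The distribution names are an assumption; adjust them if the image differs.
EXPECTED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "apex": "1.1",
}


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (expected, found)} for every pin that is not met."""
    mismatches = {}
    for name, want in expected.items():
        found = installed.get(name)
        # A build may carry a local version suffix after "+";
        # compare only the public part in front of it.
        public = found.split("+", 1)[0] if found else None
        matches = public is not None and (
            public == want or public.startswith(want + ".")
        )
        if not matches:
            mismatches[name] = (want, found)
    return mismatches


def installed_versions(names):
    """Look up installed distribution versions; missing packages map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    problems = find_mismatches(installed_versions(EXPECTED))
    for name, (want, found) in problems.items():
        print(f"{name}: expected {want}, found {found}")
    if not problems:
        print("all pinned versions match")
```

Running the script inside the container prints one line per mismatched or missing package. The prefix comparison treats `1.1.0` as satisfying the pin `1.1` but rejects `1.10`, which is a design choice of this sketch rather than a rule stated in the README.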
...
...
@@ -100,13 +100,17 @@ python3 train.py --batch-size=64 --arch=resnet50 -j 6 --epochs=90 --amp --opt-le
### Single-node, four-card training (single precision)
```
-mpirun --allow-run-as-root --bind-to none -np 4 scrips/single_process.sh localhost resnet50 64
+cd scrips
+chmod +x single_process.sh
+mpirun --allow-run-as-root --bind-to none -np 4 single_process.sh localhost resnet50 64
```
### Single-node, four-card training (mixed precision)
```
-mpirun --allow-run-as-root --bind-to none -np 4 scrips/single_process_amp.sh localhost resnet50 64
+cd scrips
+chmod +x single_process_amp.sh
+mpirun --allow-run-as-root --bind-to none -np 4 single_process_amp.sh localhost resnet50 64
```
## result
...
...
@@ -140,6 +144,6 @@ mpirun --allow-run-as-root --bind-to none -np 4 scrips/single_process_amp.sh loc
https://developer.hpccube.com/codes/modelzoo/resnet50-pytorch
-# Reference
+# Reference materials
https://github.com/pytorch/examples/tree/master/imagenet