ModelZoo / ResNet50_tensorflow / Commits / d7f606e0
Commit d7f606e0, authored Sep 04, 2023 by qianyj
Update README
Parent: e7b3f4b1
Changes: 2
Showing 2 changed files with 6 additions and 6 deletions
README.md +2 -2
scripts-run/single_process_xla.sh +4 -4
README.md @ d7f606e0
...
...
@@ -129,7 +129,7 @@ The sed command only needs to be run once; it adds the code that enables multi-GPU runs
 sh /opt/dtk/.hip/replace_origin.sh
 export PYTHONPATH=/home/resnet50_tensorflow:$PYTHONPATH
 In resnet_ctl_imagenet_main.py, add the environment variable os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/opt/dtk/amdgcn/bitcode"
-TF_XLA_FLAGS="--tf_xla_auto_jit=2" python3 official/vision/image_classification/resnet/resnet_ctl_imagenet_main.py --data_dir=/path/to/{ImageNet-tensorflow_data_dir} --model_dir=/path/to/{model_save_dir} --batch_size=128 --num_gpus=1 --train_epochs=90 --use_synthetic_data=false --dtype=fp16
+TF_XLA_FLAGS="--tf_xla_auto_jit=1" python3 official/vision/image_classification/resnet/resnet_ctl_imagenet_main.py --data_dir=/path/to/{ImageNet-tensorflow_data_dir} --model_dir=/path/to/{model_save_dir} --batch_size=128 --num_gpus=1 --train_epochs=90 --use_synthetic_data=false --dtype=fp16
 #### Single-node four-GPU training command
...
...
@@ -143,7 +143,7 @@ The sed command only needs to be run once; it adds the code that enables multi-GPU runs
 sh /opt/dtk/.hip/replace_origin.sh
 export PYTHONPATH=/home/resnet50_tensorflow:$PYTHONPATH
 In resnet_ctl_imagenet_main.py, add the environment variable os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/opt/dtk/amdgcn/bitcode"
-TF_XLA_FLAGS="--tf_xla_auto_jit=2" python3 official/vision/image_classification/resnet/resnet_ctl_imagenet_main.py --data_dir=/path/to/{ImageNet-tensorflow_data_dir} --model_dir=/path/to/{model_save_dir} --batch_size=512 --num_gpus=4 --train_epochs=90 --use_synthetic_data=false --dtype=fp16
+TF_XLA_FLAGS="--tf_xla_auto_jit=1" python3 official/vision/image_classification/resnet/resnet_ctl_imagenet_main.py --data_dir=/path/to/{ImageNet-tensorflow_data_dir} --model_dir=/path/to/{model_save_dir} --batch_size=512 --num_gpus=4 --train_epochs=90 --use_synthetic_data=false --dtype=fp16
 #### Multi-node multi-GPU training command (example: one node with four GPUs simulating four GPUs with four processes)
...
...
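The README hunks above depend on two environment variables: `XLA_FLAGS`, which the README says to set inside `resnet_ctl_imagenet_main.py`, and `TF_XLA_FLAGS`, which this commit changes from `--tf_xla_auto_jit=2` to `--tf_xla_auto_jit=1` on the command line. A minimal sketch of setting both from Python is shown below. The helper name `configure_xla_env` and its parameters are hypothetical and not part of the repository. The sketch assumes the variables must be in place before TensorFlow is imported.

```python
import os


def configure_xla_env(auto_jit_level=1,
                      bitcode_dir="/opt/dtk/amdgcn/bitcode",
                      environ=None):
    """Set the XLA-related variables used by the README commands.

    Hypothetical helper: `environ` defaults to os.environ, but any
    dict-like mapping can be passed in, which makes it easy to test.
    """
    env = os.environ if environ is None else environ
    # Same value the README asks to add to resnet_ctl_imagenet_main.py.
    env["XLA_FLAGS"] = f"--xla_gpu_cuda_data_dir={bitcode_dir}"
    # Auto-clustering level; this commit switches the README from 2 to 1.
    env["TF_XLA_FLAGS"] = f"--tf_xla_auto_jit={auto_jit_level}"
    return env


if __name__ == "__main__":
    env = configure_xla_env()
    print(env["XLA_FLAGS"])
    print(env["TF_XLA_FLAGS"])
```

If used, the call would need to come before `import tensorflow` in the training script. Setting `TF_XLA_FLAGS` on the command line, as the README does, remains the approach the commit itself documents.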
scripts-run/single_process_xla.sh @ d7f606e0
...
...
@@ -7,19 +7,19 @@ APP="python3 ./official/vision/image_classification/resnet/resnet_ctl_imagenet_m
 case ${lrank} in
 [0])
   export HIP_VISIBLE_DEVICES=0
-  TF_XLA_FLAGS="--tf_xla_auto_jit=2" numactl --cpunodebind=0 --membind=0 ${APP}
+  TF_XLA_FLAGS="--tf_xla_auto_jit=1" numactl --cpunodebind=0 --membind=0 ${APP}
   ;;
 [1])
   export HIP_VISIBLE_DEVICES=1
-  TF_XLA_FLAGS="--tf_xla_auto_jit=2" numactl --cpunodebind=1 --membind=1 ${APP}
+  TF_XLA_FLAGS="--tf_xla_auto_jit=1" numactl --cpunodebind=1 --membind=1 ${APP}
   ;;
 [2])
   export HIP_VISIBLE_DEVICES=2
-  TF_XLA_FLAGS="--tf_xla_auto_jit=2" numactl --cpunodebind=2 --membind=2 ${APP}
+  TF_XLA_FLAGS="--tf_xla_auto_jit=1" numactl --cpunodebind=2 --membind=2 ${APP}
   ;;
 [3])
   export HIP_VISIBLE_DEVICES=3
-  TF_XLA_FLAGS="--tf_xla_auto_jit=2" numactl --cpunodebind=3 --membind=3 ${APP}
+  TF_XLA_FLAGS="--tf_xla_auto_jit=1" numactl --cpunodebind=3 --membind=3 ${APP}
   ;;
 esac
...
...
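The four `case` branches in `single_process_xla.sh` differ only in the local rank: each one pins the process to GPU `lrank` through `HIP_VISIBLE_DEVICES` and to NUMA node `lrank` through `numactl`. That mapping can be sketched as a single function. The name `build_launch` is hypothetical, and the sketch assumes the GPU index and the NUMA node index coincide for ranks 0 to 3, as they do in the script.

```python
import shlex


def build_launch(lrank, app, auto_jit_level=1):
    """Return (env, argv) for one local rank, mirroring the case branches.

    Hypothetical helper: `app` is the command string the script keeps in
    its APP variable; it is split into argv tokens with shlex.
    """
    if lrank not in range(4):
        raise ValueError(f"unsupported local rank: {lrank}")
    env = {
        # One visible GPU per process, indexed by local rank.
        "HIP_VISIBLE_DEVICES": str(lrank),
        "TF_XLA_FLAGS": f"--tf_xla_auto_jit={auto_jit_level}",
    }
    # Bind CPU and memory to the NUMA node with the same index.
    argv = ["numactl", f"--cpunodebind={lrank}", f"--membind={lrank}"]
    argv += shlex.split(app)
    return env, argv


if __name__ == "__main__":
    env, argv = build_launch(2, "python3 train.py --num_gpus=1")
    print(env)
    print(" ".join(argv))
```

The returned pair could be passed to `subprocess.run(argv, env={**os.environ, **env})`. On machines where GPUs and NUMA nodes are not paired one to one, the rank-to-node mapping would need an explicit table instead.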