dcuai / dlexamples / Commits / c320b6ef
Commit c320b6ef authored Apr 15, 2022 by zhenyi
tf2 detection
parent 0fc002df
Changes: 195
Showing 20 changed files with 795 additions and 0 deletions
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/docker/build_tf2.sh  +32 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/docker/launch_tf1.sh  +26 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/docker/launch_tf2.sh  +26 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/evaluation.sh  +34 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/evaluation_AMP.sh  +34 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_1GPU.sh  +40 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_1GPU_XLA.sh  +40 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_4GPU.sh  +50 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_8GPU.sh  +48 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_AMP_1GPU.sh  +40 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_AMP_4GPU.sh  +50 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_AMP_8GPU.sh  +48 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/.gitkeep  +0 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/extract_RN50_weights.py  +120 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/inspect_checkpoint.py  +159 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/mask-rcnn/.gitkeep  +0 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/mask-rcnn/1555659850/.gitkeep  +0 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/pb_to_ckpt.py  +48 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/resnet/.gitkeep  +0 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/resnet/extracted_from_maskrcnn/.gitkeep  +0 -0
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/docker/build_tf2.sh
0 → 100644
#!/bin/bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

CONTAINER_TF2x_BASE="nvcr.io/nvidia/tensorflow"
CONTAINER_TF2x_TAG="20.06-tf2-py3"

# ======================== Refresh base image ======================== #
docker pull "${CONTAINER_TF2x_BASE}:${CONTAINER_TF2x_TAG}"

# ========================== Build container ========================= #
echo -e "\n\nBuilding NVIDIA TF 2.x Container\n\n"

sleep 1

docker build -t joc_tensorflow_maskrcnn:tf2.1-py3 \
    --build-arg BASE_CONTAINER="${CONTAINER_TF2x_BASE}" \
    --build-arg IMG_TAG="${CONTAINER_TF2x_TAG}" \
    --build-arg FROM_IMAGE_NAME="nvcr.io/nvidia/tensorflow:20.06-tf2-py3" .
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/docker/launch_tf1.sh
0 → 100644
#!/bin/bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

if [ -z "$1" ]; then
    echo "usage launch_tf1.sh [absolute data dir]"
    exit
fi

nvidia-docker run -it --rm \
    --shm-size=2g --ulimit memlock=-1 --ulimit stack=67108864 \
    -v $(pwd)/weights/:/model/ \
    -v "${1}":/data/ \
    joc_tensorflow_maskrcnn:tf1.x-py3
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/docker/launch_tf2.sh
0 → 100644
#!/bin/bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

if [ -z "$1" ]; then
    echo "usage launch_tf2.sh [absolute data dir]"
    exit
fi

nvidia-docker run -it --rm \
    --shm-size=2g --ulimit memlock=-1 --ulimit stack=67108864 \
    -v $(pwd)/weights/:/model/ \
    -v "${1}":/data/ \
    joc_tensorflow_maskrcnn:tf2.1-py3
\ No newline at end of file
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/evaluation.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /result_tmp/

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

export CUDA_VISIBLE_DEVICES=0

python ${BASEDIR}/../mask_rcnn_main.py \
    --mode="eval" \
    --eval_batch_size=8 \
    --eval_samples=5000 \
    --learning_rate_steps="480000,640000" \
    --model_dir="/result_tmp/" \
    --validation_file_pattern="/data/val*.tfrecord" \
    --val_json_file="/data/annotations/instances_val2017.json" \
    --use_batched_nms \
    --noamp \
    --noxla \
    --nouse_custom_box_proposals_op
\ No newline at end of file
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/evaluation_AMP.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /result_tmp/

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

export CUDA_VISIBLE_DEVICES=0

python ${BASEDIR}/../mask_rcnn_main.py \
    --mode="eval" \
    --eval_batch_size=8 \
    --eval_samples=5000 \
    --learning_rate_steps="480000,640000" \
    --model_dir="/result_tmp/" \
    --validation_file_pattern="/data/val*.tfrecord" \
    --val_json_file="/data/annotations/instances_val2017.json" \
    --use_batched_nms \
    --amp \
    --noxla \
    --nouse_custom_box_proposals_op
\ No newline at end of file
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_1GPU.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /results

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

export CUDA_VISIBLE_DEVICES=0

python ${BASEDIR}/../mask_rcnn_main.py \
    --mode="train_and_eval" \
    --checkpoint="/model/resnet/resnet-nhwc-2018-02-07/model.ckpt-112603" \
    --eval_samples=5000 \
    --init_learning_rate=0.005 \
    --learning_rate_steps="240000,320000" \
    --model_dir="/results/" \
    --num_steps_per_eval=29568 \
    --total_steps=360000 \
    --train_batch_size=4 \
    --eval_batch_size=8 \
    --training_file_pattern="/data/train*.tfrecord" \
    --validation_file_pattern="/data/val*.tfrecord" \
    --val_json_file="/data/annotations/instances_val2017.json" \
    --noamp \
    --use_batched_nms \
    --xla \
    --nouse_custom_box_proposals_op
\ No newline at end of file
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_1GPU_XLA.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /results

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

export CUDA_VISIBLE_DEVICES=0

python ${BASEDIR}/../mask_rcnn_main.py \
    --mode="train_and_eval" \
    --checkpoint="/model/resnet/resnet-nhwc-2018-10-14/model.ckpt-112602" \
    --eval_samples=5000 \
    --init_learning_rate=0.005 \
    --learning_rate_steps="240000,320000" \
    --model_dir="/results/" \
    --num_steps_per_eval=29568 \
    --total_steps=360000 \
    --train_batch_size=4 \
    --eval_batch_size=8 \
    --training_file_pattern="/data/train*.tfrecord" \
    --validation_file_pattern="/data/val*.tfrecord" \
    --val_json_file="/data/annotations/instances_val2017.json" \
    --noamp \
    --use_batched_nms \
    --xla \
    --nouse_custom_box_proposals_op
\ No newline at end of file
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_4GPU.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /results

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

export CUDA_VISIBLE_DEVICES=0,1,2,3

mpirun \
    -np 4 \
    -H localhost:4 \
    -bind-to none \
    -map-by slot \
    -x NCCL_DEBUG=VERSION \
    -x LD_LIBRARY_PATH \
    -x PATH \
    -mca pml ob1 -mca btl ^openib \
    --allow-run-as-root \
    python ${BASEDIR}/../mask_rcnn_main.py \
        --mode="train_and_eval" \
        --checkpoint="/model/resnet/resnet-nhwc-2018-02-07/model.ckpt-112603" \
        --eval_samples=5000 \
        --init_learning_rate=0.02 \
        --learning_rate_steps="60000,80000" \
        --model_dir="/results/" \
        --num_steps_per_eval=7392 \
        --total_steps=90000 \
        --train_batch_size=4 \
        --eval_batch_size=8 \
        --training_file_pattern="/data/train*.tfrecord" \
        --validation_file_pattern="/data/val*.tfrecord" \
        --val_json_file="/data/annotations/instances_val2017.json" \
        --noamp \
        --use_batched_nms \
        --xla \
        --nouse_custom_box_proposals_op
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_8GPU.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /results

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

mpirun \
    -np 8 \
    -H localhost:8 \
    -bind-to none \
    -map-by slot \
    -x NCCL_DEBUG=VERSION \
    -x LD_LIBRARY_PATH \
    -x PATH \
    -mca pml ob1 -mca btl ^openib \
    --allow-run-as-root \
    python ${BASEDIR}/../mask_rcnn_main.py \
        --mode="train_and_eval" \
        --checkpoint="/model/resnet/resnet-nhwc-2018-02-07/model.ckpt-112603" \
        --eval_samples=5000 \
        --init_learning_rate=0.04 \
        --learning_rate_steps="30000,40000" \
        --model_dir="/results/" \
        --num_steps_per_eval=3696 \
        --total_steps=45000 \
        --train_batch_size=4 \
        --eval_batch_size=8 \
        --training_file_pattern="/data/train*.tfrecord" \
        --validation_file_pattern="/data/val*.tfrecord" \
        --val_json_file="/data/annotations/instances_val2017.json" \
        --noamp \
        --use_batched_nms \
        --xla \
        --nouse_custom_box_proposals_op
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_AMP_1GPU.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /results

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

export CUDA_VISIBLE_DEVICES=0

python ${BASEDIR}/../mask_rcnn_main.py \
    --mode="train_and_eval" \
    --checkpoint="/model/resnet/resnet-nhwc-2018-02-07/model.ckpt-112603" \
    --eval_samples=5000 \
    --init_learning_rate=0.005 \
    --learning_rate_steps="240000,320000" \
    --model_dir="/results/" \
    --num_steps_per_eval=29568 \
    --total_steps=360000 \
    --train_batch_size=4 \
    --eval_batch_size=8 \
    --training_file_pattern="/data/train*.tfrecord" \
    --validation_file_pattern="/data/val*.tfrecord" \
    --val_json_file="/data/annotations/instances_val2017.json" \
    --amp \
    --use_batched_nms \
    --xla \
    --nouse_custom_box_proposals_op
\ No newline at end of file
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_AMP_4GPU.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /results

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

export CUDA_VISIBLE_DEVICES=0,1,2,3

mpirun \
    -np 4 \
    -H localhost:4 \
    -bind-to none \
    -map-by slot \
    -x NCCL_DEBUG=VERSION \
    -x LD_LIBRARY_PATH \
    -x PATH \
    -mca pml ob1 -mca btl ^openib \
    --allow-run-as-root \
    python ${BASEDIR}/../mask_rcnn_main.py \
        --mode="train_and_eval" \
        --checkpoint="/model/resnet/resnet-nhwc-2018-02-07/model.ckpt-112603" \
        --eval_samples=5000 \
        --init_learning_rate=0.02 \
        --learning_rate_steps="60000,80000" \
        --model_dir="/results/" \
        --num_steps_per_eval=7392 \
        --total_steps=90000 \
        --train_batch_size=4 \
        --eval_batch_size=8 \
        --training_file_pattern="/data/train*.tfrecord" \
        --validation_file_pattern="/data/val*.tfrecord" \
        --val_json_file="/data/annotations/instances_val2017.json" \
        --amp \
        --use_batched_nms \
        --xla \
        --nouse_custom_box_proposals_op
TensorFlow2x/ComputeVision/Detection/MaskRCNN/scripts/train_AMP_8GPU.sh
0 → 100644
#!/usr/bin/env bash
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

rm -rf /results

BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

mpirun \
    -np 8 \
    -H localhost:8 \
    -bind-to none \
    -map-by slot \
    -x NCCL_DEBUG=VERSION \
    -x LD_LIBRARY_PATH \
    -x PATH \
    -mca pml ob1 -mca btl ^openib \
    --allow-run-as-root \
    python ${BASEDIR}/../mask_rcnn_main.py \
        --mode="train_and_eval" \
        --checkpoint="/model/resnet/resnet-nhwc-2018-02-07/model.ckpt-112603" \
        --eval_samples=5000 \
        --init_learning_rate=0.04 \
        --learning_rate_steps="30000,40000" \
        --model_dir="/results/" \
        --num_steps_per_eval=3696 \
        --total_steps=45000 \
        --train_batch_size=4 \
        --eval_batch_size=8 \
        --training_file_pattern="/data/train*.tfrecord" \
        --validation_file_pattern="/data/val*.tfrecord" \
        --val_json_file="/data/annotations/instances_val2017.json" \
        --amp \
        --use_batched_nms \
        --xla \
        --nouse_custom_box_proposals_op
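
Review note: across the 1-, 4- and 8-GPU training scripts, the initial learning rate grows with the GPU count while the step-based settings shrink by the same factor. The sketch below checks that relationship using only the values that appear in the scripts. Treating `--train_batch_size=4` as a per-GPU value is an assumption made for the `samples_seen` calculation; the helper names are illustrative and not part of the repository.

```python
import math

# Values copied from train_1GPU.sh, train_4GPU.sh and train_8GPU.sh
# (the AMP variants use the same numbers).
CONFIGS = {
    1: {"init_learning_rate": 0.005, "learning_rate_steps": (240000, 320000),
        "total_steps": 360000, "num_steps_per_eval": 29568},
    4: {"init_learning_rate": 0.02, "learning_rate_steps": (60000, 80000),
        "total_steps": 90000, "num_steps_per_eval": 7392},
    8: {"init_learning_rate": 0.04, "learning_rate_steps": (30000, 40000),
        "total_steps": 45000, "num_steps_per_eval": 3696},
}

# --train_batch_size=4 in every script. ASSUMPTION: this is a per-GPU batch size.
PER_GPU_BATCH = 4


def scale_from_single_gpu(num_gpus):
    """Derive an N-GPU config from the 1-GPU one: LR times N, step counts divided by N."""
    base = CONFIGS[1]
    return {
        "init_learning_rate": base["init_learning_rate"] * num_gpus,
        "learning_rate_steps": tuple(s // num_gpus for s in base["learning_rate_steps"]),
        "total_steps": base["total_steps"] // num_gpus,
        "num_steps_per_eval": base["num_steps_per_eval"] // num_gpus,
    }


def matches_script(num_gpus):
    """True if the derived config equals the values hard-coded in the script."""
    derived = scale_from_single_gpu(num_gpus)
    actual = CONFIGS[num_gpus]
    for key, value in actual.items():
        if isinstance(value, float):
            if not math.isclose(derived[key], value):
                return False
        elif derived[key] != value:
            return False
    return True


def samples_seen(num_gpus):
    """Training samples processed over the whole run, under the per-GPU assumption."""
    return CONFIGS[num_gpus]["total_steps"] * PER_GPU_BATCH * num_gpus


if __name__ == "__main__":
    for n in sorted(CONFIGS):
        print(n, matches_script(n), samples_seen(n))
```

If the per-GPU assumption holds, every configuration processes the same number of training samples, which is what keeps the three schedules comparable.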
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/.gitkeep
0 → 100644
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/extract_RN50_weights.py
0 → 100644
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import sys
import getopt
import logging

import tensorflow as tf

"""
python weights/extract_RN50_weights.py \
    --checkpoint_dir=weights/mask-rcnn/1555659850/ckpt/model.ckpt \
    --save_to=weights/resnet/extracted_from_maskrcnn \
    --dry_run

python weights/extract_RN50_weights.py \
    --checkpoint_dir=weights/mask-rcnn/1555659850/ckpt/model.ckpt \
    --save_to=weights/resnet/extracted_from_maskrcnn
"""

# Usage text lists only the options that main() actually parses.
usage_str = 'python weights/extract_RN50_weights.py ' \
            '--checkpoint_dir=weights/mask-rcnn/1555659850/ckpt/model.ckpt ' \
            '--save_to=weights/resnet/extracted_from_maskrcnn [--dry_run] [--verbose]'


def rename(checkpoint_dir, save_to, dry_run, verbose):
    _ = tf.train.get_checkpoint_state(checkpoint_dir)

    with tf.compat.v1.Session() as sess:
        total_vars_loaded = 0

        for var_name, _ in tf.train.list_variables(checkpoint_dir):
            if "resnet50" in var_name:
                # Load the variable
                var = tf.train.load_variable(checkpoint_dir, var_name)
                total_vars_loaded += 1
            else:
                continue

            if not dry_run:
                _ = tf.Variable(var, name=var_name[9:])  # remove "resnet50/"
                # _ = tf.Variable(var, name=var_name)

            if verbose:
                print('Loading Variable: %s.' % var_name)

        print("Total Vars Loaded: %d" % total_vars_loaded)

        if not dry_run:
            if not os.path.isdir(save_to):
                os.makedirs(save_to)

            save_path = os.path.join(save_to, "resnet50.ckpt")
            print("Model save location: %s" % save_path)

            # Save the variables
            saver = tf.compat.v1.train.Saver()
            sess.run(tf.compat.v1.global_variables_initializer())
            saver.save(sess, save_path)


def main(argv):
    checkpoint_dir = None
    save_to = None
    dry_run = False
    verbose = False

    try:
        # 'help' takes no argument, so it is listed without a trailing '='.
        opts, args = getopt.getopt(
            argv, 'h', ['help', 'checkpoint_dir=', 'save_to=', 'verbose', 'dry_run']
        )
    except getopt.GetoptError:
        print(usage_str)
        sys.exit(2)

    for opt, arg in opts:
        if opt in ('-h', '--help'):
            print(usage_str)
            sys.exit()
        elif opt == '--checkpoint_dir':
            checkpoint_dir = arg
        elif opt == '--save_to':
            save_to = arg
        elif opt == '--verbose':
            verbose = True
        elif opt == '--dry_run':
            dry_run = True

    if not checkpoint_dir:
        print('Please specify a checkpoint_dir. Usage:')
        print(usage_str)
        sys.exit(2)

    # Without --dry_run the weights are written out, so a target directory is required.
    if not dry_run and not save_to:
        print('Please specify save_to (or pass --dry_run). Usage:')
        print(usage_str)
        sys.exit(2)

    rename(checkpoint_dir, save_to, dry_run, verbose)


if __name__ == '__main__':
    logging.disable(logging.WARNING)
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

    main(sys.argv[1:])
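
Review note: the script selects variables with a substring test (`"resnet50" in var_name`) and then drops the first nine characters, which is only correct when the name starts with `resnet50/`. The sketch below shows a stricter variant of the same renaming step in plain Python, without TensorFlow. It swaps the substring test for `str.startswith`; that is a deliberate change for illustration, not the committed behaviour. The function name and the variable names in the usage example are hypothetical.

```python
PREFIX = "resnet50/"  # scope prefix that the script strips via var_name[9:]


def backbone_name(var_name, prefix=PREFIX):
    """Return var_name without the backbone prefix, or None if it does not apply.

    Unlike the substring test in the script, this only accepts names that
    begin with the prefix, so the slice can never cut into an unrelated scope.
    """
    if not var_name.startswith(prefix):
        return None
    return var_name[len(prefix):]


def select_backbone(var_names):
    """Map original checkpoint names to their prefix-free names."""
    renamed = {}
    for name in var_names:
        stripped = backbone_name(name)
        if stripped is not None:
            renamed[name] = stripped
    return renamed


if __name__ == "__main__":
    # Hypothetical variable names, for illustration only.
    names = ["resnet50/conv2d/kernel", "fpn/l2/kernel", "head/resnet50/bias"]
    print(select_backbone(names))
```

With the hypothetical names above, only `resnet50/conv2d/kernel` is selected; `head/resnet50/bias` contains the substring but is skipped because slicing it at position nine would produce a wrong name.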
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/inspect_checkpoint.py
0 → 100644
#! /usr/bin/python
# -*- coding: utf-8 -*-
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""A simple script for inspecting checkpoint files."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import sys

import numpy as np

from tensorflow.python import pywrap_tensorflow
from tensorflow.python.platform import app
from tensorflow.python.platform import flags

FLAGS = None

"""
Usage: python inspect_checkpoint.py --file_name='weights/vgg16.ckpt'
Usage: python inspect_checkpoint.py --file_name='weights/reprocessed/mobilenet.ckpt'
"""


def print_tensors_in_checkpoint_file(file_name, tensor_name, all_tensors, all_tensor_names=False):
    """Prints tensors in a checkpoint file.

    If no `tensor_name` is provided, prints the tensor names and shapes
    in the checkpoint file.

    If `tensor_name` is provided, prints the content of the tensor.

    Args:
      file_name: Name of the checkpoint file.
      tensor_name: Name of the tensor in the checkpoint file to print.
      all_tensors: Boolean indicating whether to print all tensors.
      all_tensor_names: Boolean indicating whether to print all tensor names.
    """
    try:
        reader = pywrap_tensorflow.NewCheckpointReader(file_name)

        if all_tensors or all_tensor_names:
            var_to_shape_map = reader.get_variable_to_shape_map()
            for key in sorted(var_to_shape_map):
                print("tensor_name: ", key)
                if all_tensors:
                    print(reader.get_tensor(key))
        elif not tensor_name:
            print(reader.debug_string().decode("utf-8"))
        else:
            print("tensor_name: ", tensor_name)
            print(reader.get_tensor(tensor_name))

    except Exception as e:  # pylint: disable=broad-except
        print(str(e))
        if "corrupted compressed block contents" in str(e):
            print("It's likely that your checkpoint file has been compressed "
                  "with SNAPPY.")

        if ("Data loss" in str(e) and
                any([ext in file_name for ext in [".index", ".meta", ".data"]])):
            # Drop the last dot-separated component to suggest the checkpoint prefix.
            proposed_file = ".".join(file_name.split(".")[0:-1])
            v2_file_error_template = """
It's likely that this is a V2 checkpoint and you need to provide the filename
*prefix*.  Try removing the '.' and extension.  Try:
inspect checkpoint --file_name = {}"""
            print(v2_file_error_template.format(proposed_file))


def parse_numpy_printoption(kv_str):
    """Sets a single numpy printoption from a string of the form 'x=y'.

    See documentation on numpy.set_printoptions() for details about what values
    x and y can take. x can be any option listed there other than 'formatter'.

    Args:
      kv_str: A string of the form 'x=y', such as 'threshold=100000'

    Raises:
      argparse.ArgumentTypeError: If the string couldn't be used to set any
          numpy printoption.
    """
    k_v_str = kv_str.split("=", 1)

    if len(k_v_str) != 2 or not k_v_str[0]:
        raise argparse.ArgumentTypeError("'%s' is not in the form k=v." % kv_str)

    k, v_str = k_v_str
    printoptions = np.get_printoptions()

    if k not in printoptions:
        raise argparse.ArgumentTypeError("'%s' is not a valid printoption." % k)

    v_type = type(printoptions[k])

    if v_type is type(None):
        raise argparse.ArgumentTypeError("Setting '%s' from the command line is not supported." % k)

    try:
        v = (v_type(v_str) if v_type is not bool else flags.BooleanParser().parse(v_str))
    except ValueError as e:
        # Python 3 exceptions have no `.message` attribute; use str(e) instead.
        raise argparse.ArgumentTypeError(str(e))

    np.set_printoptions(**{k: v})


def main(unused_argv):
    if not FLAGS.file_name:
        print(
            "Usage: inspect_checkpoint --file_name=checkpoint_file_name "
            "[--tensor_name=tensor_to_print] "
            "[--all_tensors] "
            "[--all_tensor_names] "
            "[--printoptions]"
        )
        sys.exit(1)
    else:
        print_tensors_in_checkpoint_file(
            FLAGS.file_name, FLAGS.tensor_name, FLAGS.all_tensors, FLAGS.all_tensor_names
        )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.register("type", "bool", lambda v: v.lower() == "true")
    parser.add_argument(
        "--file_name",
        type=str,
        default="",
        help="Checkpoint filename. "
        "Note, if using Checkpoint V2 format, file_name is the "
        "shared prefix between all files in the checkpoint."
    )
    parser.add_argument(
        "--tensor_name",
        type=str,
        default="",
        help="Name of the tensor to inspect"
    )
    parser.add_argument(
        "--all_tensors",
        nargs="?",
        const=True,
        type="bool",
        default=False,
        help="If True, print the names and values of all the tensors."
    )
    parser.add_argument(
        "--all_tensor_names",
        nargs="?",
        const=True,
        type="bool",
        default=False,
        help="If True, print the names of all the tensors."
    )
    parser.add_argument(
        "--printoptions",
        nargs="*",
        type=parse_numpy_printoption,
        help="Argument for numpy.set_printoptions(), in the form 'k=v'."
    )

    FLAGS, unparsed = parser.parse_known_args()
    app.run(main=main, argv=[sys.argv[0]] + unparsed)
\ No newline at end of file
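
Review note: when a V2 checkpoint is addressed by one of its member files (`.index`, `.data-...`) rather than by its shared prefix, the script suggests a prefix by dropping the last dot-separated component of the path. The sketch below isolates that one expression so its behaviour can be checked without TensorFlow. The function name and the example paths are hypothetical.

```python
def propose_prefix(file_name):
    """Drop the last dot-separated component, as the error handler does."""
    return ".".join(file_name.split(".")[0:-1])


if __name__ == "__main__":
    # Hypothetical member files of a V2 checkpoint.
    for path in ("/results/model.ckpt-1000.index",
                 "/results/model.ckpt-1000.data-00000-of-00001"):
        print(path, "->", propose_prefix(path))
```

Because only the final component is removed, both example paths reduce to the same prefix, `/results/model.ckpt-1000`. A name with no dot at all reduces to an empty string, which is why the script applies this only after spotting one of the known extensions in the file name.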
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/mask-rcnn/.gitkeep
0 → 100644
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/mask-rcnn/1555659850/.gitkeep
0 → 100644
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/pb_to_ckpt.py
0 → 100644
#! /usr/bin/python
# -*- coding: utf-8 -*-
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import os
import argparse
import logging

import tensorflow as tf

# Pass the filename as an argument
parser = argparse.ArgumentParser()
parser.add_argument(
    "--frozen_model_filename",
    default="/path-to-pb-file/Binary_Protobuf.pb",
    type=str,
    help="Pb model file to import"
)
parser.add_argument(
    "--output_filename",
    default="/path-to-ckpt-file/model.ckpt",
    type=str,
    help="Checkpoint file to write"
)
args = parser.parse_args()

if __name__ == "__main__":
    logging.disable(logging.WARNING)
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

    with tf.compat.v1.Session(graph=tf.Graph()) as sess:
        # Load the serving graph and its variables into this session.
        tf.compat.v1.saved_model.loader.load(sess, [tf.saved_model.SERVING], args.frozen_model_filename)

        # Write the loaded variables back out in checkpoint format.
        saver = tf.compat.v1.train.Saver()
        save_path = saver.save(sess, args.output_filename)
        print("Model saved to ckpt format")
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/resnet/.gitkeep
0 → 100644
TensorFlow2x/ComputeVision/Detection/MaskRCNN/weights/resnet/extracted_from_maskrcnn/.gitkeep
0 → 100644