OpenDAS / apex · Commits

Commit 15498555, authored Nov 19, 2021 by Hubert Lu
parent f3868524

Add unit tests for Apex extensions and distributed Apex

Showing 2 changed files with 41 additions and 7 deletions (+41 −7)
apex/contrib/test/run_rocm_extensions.py   +27 −0
tests/distributed/run_rocm_distributed.sh  +14 −7
apex/contrib/test/run_rocm_extensions.py (new file, 0 → 100644)
import unittest
import sys

test_dirs = ["groupbn", "layer_norm", "multihead_attn", "."]  # "." for test_label_smoothing.py
ROCM_BLACKLIST = ["groupbn", "layer_norm"]

runner = unittest.TextTestRunner(verbosity=2)
errcode = 0
for test_dir in test_dirs:
    if test_dir in ROCM_BLACKLIST:
        continue
    suite = unittest.TestLoader().discover(test_dir)
    print("\nExecuting tests from " + test_dir)
    result = runner.run(suite)
    if not result.wasSuccessful():
        errcode = 1
sys.exit(errcode)
tests/distributed/run_rocm_distributed.sh
@@ -6,8 +6,8 @@ export WORLD_SIZE=2
 # Test with opt_level="O2"
 echo "running opt_level O2"
-python3.6 -m torch.distributed.launch --nproc_per_node=2 amp_master_params/amp_master_params.py --opt_level "O2"
-python3.6 amp_master_params/compare.py
+python -m torch.distributed.launch --nproc_per_node=2 amp_master_params/amp_master_params.py --opt_level "O2"
+python amp_master_params/compare.py
 # delete the model files
 echo -e "O2 test completed. Deleting model files\n"
@@ -19,9 +19,9 @@ rm rank1master.pth
 # Test with opt_level="O5"
 #echo "running opt_level O5"
-#python3.6 -m torch.distributed.launch --nproc_per_node=2 amp_master_params/amp_master_params.py --opt_level "O5"
-#python3.6 amp_master_params/compare.py
+#
+#python -m torch.distributed.launch --nproc_per_node=2 amp_master_params/amp_master_params.py --opt_level "O5"
+#python amp_master_params/compare.py
 ## delete the model files
 #echo "O5 test completed. Deleting model files"
 #rm rank0model.pth
@@ -31,7 +31,14 @@ rm rank1master.pth
 ## Run the Sync BN Tests.
 echo "Running syncbn tests"
-python3.6 -m torch.distributed.launch --nproc_per_node=2 synced_batchnorm/two_gpu_test_different_batch_size.py --apex
+python -m torch.distributed.launch --nproc_per_node=2 synced_batchnorm/two_gpu_unit_test.py
+python -m torch.distributed.launch --nproc_per_node=2 synced_batchnorm/two_gpu_unit_test.py --fp16
+python -m torch.distributed.launch --nproc_per_node=2 synced_batchnorm/two_gpu_test_different_batch_size.py --apex
 echo "Running syncbn python only tests"
-python3.6 synced_batchnorm/python_single_gpu_unit_test.py
+python synced_batchnorm/python_single_gpu_unit_test.py
+echo "Running syncbn batchnorm1d tests"
+python synced_batchnorm/test_batchnorm1d.py
 ## Run the DDP Tests
 echo "running DDP tests"
 HIP_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 DDP/ddp_race_condition_test.py
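Each distributed test here is spawned with torch.distributed.launch and --nproc_per_node=2, and the script's context exports WORLD_SIZE=2 and pins two devices via HIP_VISIBLE_DEVICES=0,1. A minimal sketch of running the updated suite, assuming a machine with at least two ROCm GPUs and that the script is invoked from tests/distributed/ so its relative paths resolve (both assumptions, not stated in the diff):

    # assumed invocation; the two-GPU requirement follows from the
    # WORLD_SIZE=2 / HIP_VISIBLE_DEVICES=0,1 settings visible above
    cd tests/distributed
    bash run_rocm_distributed.sh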