Commit 10c828ed authored by Jared Casper's avatar Jared Casper

Merge branch 'check_mpu_init' into 'main'

Fix for NeMo: do not initialize mpu if it is already initialized

See merge request ADLR/megatron-lm!141
parents c63906a6 6e433055
@@ -123,6 +123,9 @@ def _initialize_distributed():
     # Set the model-parallel / data-parallel communicators.
     if device_count > 0:
-        mpu.initialize_model_parallel(args.model_parallel_size)
+        if mpu.model_parallel_is_initialized():
+            print('model parallel is already initialized')
+        else:
+            mpu.initialize_model_parallel(args.model_parallel_size)
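For context, a minimal sketch of the "initialize once" guard this change introduces. The module-level state and the body of initialize_model_parallel below are simplified placeholders, not Megatron-LM's actual internals; only the two functions named in the diff and the caller-side guard mirror the real change. The point is that an embedding framework such as NeMo may have set up mpu before calling into Megatron, so the caller must check first rather than re-initialize.

    # Illustrative sketch, not the real mpu module: _MODEL_PARALLEL_GROUP
    # stands in for the torch.distributed process-group state that mpu keeps.
    _MODEL_PARALLEL_GROUP = None

    def model_parallel_is_initialized():
        """Return True if model-parallel state has already been set up."""
        return _MODEL_PARALLEL_GROUP is not None

    def initialize_model_parallel(model_parallel_size):
        """Create the model-parallel group (placeholder body)."""
        global _MODEL_PARALLEL_GROUP
        _MODEL_PARALLEL_GROUP = object()  # real code builds a process group

    # Caller-side guard, as added by this commit: skip initialization if an
    # external framework (e.g. NeMo) has already done it.
    if model_parallel_is_initialized():
        print('model parallel is already initialized')
    else:
        initialize_model_parallel(model_parallel_size=2)

Running the guard twice takes the first branch the second time, which is exactly the NeMo scenario the commit message describes.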