START TIME: Fri Oct 27 11:39:55 CST 2023 b17r3n15 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 
WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================2 WORLD_SIZE*************96 LRANK===============================2 LRANK===============================1 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================0 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================2 WORLD_SIZE*************96 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================2 WORLD_SIZE*************96 LRANK===============================2 WORLD_SIZE*************96 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================1 WORLD_SIZE*************96 LRANK===============================3 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 LRANK===============================2 WORLD_SIZE*************96 
LRANK===============================2 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 LRANK===============================2 LRANK===============================2 LRANK===============================2 LRANK===============================2 WORLD_SIZE*************96 LRANK===============================2 LRANK===============================3 WORLD_SIZE*************96 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 LRANK===============================2 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 LRANK===============================3 LRANK===============================3 WORLD_SIZE*************96 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 LRANK===============================3 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 LRANK===============================3 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 
WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 WORLD_SIZE*************96 LRANK===============================3 WORLD_SIZE*************96 LRANK===============================1 WORLD_SIZE*************96 [2023-10-27 11:40:51,867] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,890] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,891] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,890] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,867] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,875] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,885] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,892] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,885] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,890] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,892] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:51,869] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,882] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,878] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,891] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,891] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,868] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,876] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,890] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,893] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,886] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,890] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,892] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,869] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,882] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,878] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,891] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,893] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,868] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,877] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,891] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,894] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,890] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,892] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,869] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,882] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,878] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,894] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:51,892] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,878] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,891] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,894] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,891] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,888] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:51,893] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,869] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,887] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,883] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,889] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:51,878] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... [2023-10-27 11:40:59,475] [INFO] [comm.py:606:init_distributed] Not using the DeepSpeed or dist launchers, attempting to detect MPI environment... 
[2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=0, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-10-27 11:40:59,691] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=20, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=32, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,674] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=24, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=92, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,696] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=48, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=56, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=4, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=80, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=44, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=60, local_rank=0, world_size=96, 
master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=1, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=21, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=33, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,675] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=25, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=93, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,696] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=49, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=57, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=5, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=81, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=45, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=61, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] 
[comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=72, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=16, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,696] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=88, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,697] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=68, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,687] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=40, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,686] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=65, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,691] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=29, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,686] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=85, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=37, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=53, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=77, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,674] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=9, local_rank=1, 
world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=73, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=17, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,696] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=89, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,697] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=69, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,687] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=41, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,686] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=64, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,691] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=28, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,686] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=84, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=36, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,698] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=52, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=76, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,674] 
[INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=8, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=13, local_rank=1, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,692] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=12, local_rank=0, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=2, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=94, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=78, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,697] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=50, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,676] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=26, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,699] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=58, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=74, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=38, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,699] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=70, 
local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=82, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=3, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,699] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=71, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=62, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=46, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,687] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=66, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=22, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,675] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=10, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=6, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,688] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=42, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,687] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=86, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 
11:40:59,697] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=90, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=54, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=30, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=18, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,699] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=34, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=14, local_rank=2, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=63, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=47, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,688] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=67, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=23, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,675] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=11, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of 
world_rank=7, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,688] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=43, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,688] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=87, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,697] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=91, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=55, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=31, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=19, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=35, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=15, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,697] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=51, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=75, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=39, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 
[2023-10-27 11:40:59,676] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=27, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=79, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,693] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=83, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,700] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=59, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 [2023-10-27 11:40:59,694] [INFO] [comm.py:656:mpi_discovery] Discovered MPI settings of world_rank=95, local_rank=3, world_size=96, master_addr=10.2.17.56, master_port=29500 b17r3n15:29770:29770 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.56<0> b17r3n15:29770:29770 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n15:29770:29770 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation RCCL version 2.13.4+hip5.4 HEAD:82f11f7 b17r3n15:29770:31277 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.56<0> b17r3n15:29770:31277 [0] NCCL INFO Using network IB b17r3n15:29772:29772 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.56<0> b17r3n15:29772:29772 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n15:29772:29772 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n15:29771:29771 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.56<0> b17r3n15:29771:29771 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n15:29771:29771 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n15:29773:29773 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.56<0> b17r3n15:29773:29773 [3] NCCL INFO Plugin name set by env to 
librccl-net-none.so b17r3n15:29773:29773 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n05:16147:16147 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.66<0> b17r4n05:16147:16147 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n13:19752:19752 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.74<0> b17r4n13:19752:19752 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n04:15593:15593 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.65<0> b17r4n04:15593:15593 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n04:15591:15591 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.65<0> b17r4n04:15591:15591 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n08:18005:18005 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.69<0> b17r4n08:18005:18005 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n13:19749:19749 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.74<0> b17r4n13:19749:19749 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n16:21053:21053 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.57<0> b17r3n16:21053:21053 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n18:5554:5554 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.79<0> b17r4n18:5554:5554 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n07:16695:16695 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.68<0> b17r4n07:16695:16695 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n18:5553:5553 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.79<0> b17r4n14:15272:15272 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.75<0> b17r4n14:15272:15272 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n18:5553:5553 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n17:16113:16113 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.78<0> b17r4n17:16113:16113 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n05:16147:16147 [0] NCCL INFO NET/Plugin 
: No plugin found (librccl-net-none.so), using internal implementation b17r4n02:5834:5834 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.63<0> b17r4n02:5834:5834 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n17:16115:16115 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.78<0> b17r4n13:19752:19752 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n17:16115:16115 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n17:6032:6032 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.58<0> b17r3n17:6032:6032 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n17:6029:6029 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.58<0> b17r3n17:6029:6029 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n13:19749:19749 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n14:15273:15273 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.75<0> b17r4n14:15273:15273 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n13:19750:19750 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.74<0> b17r4n13:19750:19750 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n07:16692:16692 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.68<0> b17r4n07:16693:16693 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.68<0> b17r4n07:16693:16693 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n07:16692:16692 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n08:18006:18006 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.69<0> b17r4n05:16145:16145 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.66<0> b17r4n04:15593:15593 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n08:18006:18006 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n05:16148:16148 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.66<0> b17r4n18:5555:5555 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.79<0> 
b17r4n05:16145:16145 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n18:5555:5555 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n12:2010:2010 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.73<0> b17r4n12:2010:2010 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n05:16148:16148 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n13:19751:19751 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.74<0> b17r4n13:19751:19751 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n04:15591:15591 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n06:14885:14885 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.67<0> b17r4n06:14885:14885 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n13:19750:19750 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n08:18004:18004 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.69<0> b17r4n05:16145:16145 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n05:16148:16148 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n09:16265:16265 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.70<0> b17r4n09:16265:16265 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n08:18004:18004 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n16:14826:14826 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.77<0> b17r4n16:14826:14826 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n02:5833:5833 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.63<0> b17r4n02:5833:5833 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n13:19751:19751 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n17:6030:6030 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.58<0> b17r4n08:18005:18005 [2] NCCL INFO NET/Plugin : 
No plugin found (librccl-net-none.so), using internal implementation b17r3n17:6030:6030 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n17:6031:6031 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.58<0> b17r3n17:6031:6031 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n00:1874:1874 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.61<0> b17r4n00:1874:1874 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n08:18006:18006 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n08:18003:18003 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.69<0> b17r4n06:14884:14884 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.67<0> b17r4n08:18003:18003 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n06:14884:14884 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n12:2009:2009 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.73<0> b17r4n12:2012:2012 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.73<0> b17r4n12:2012:2012 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n12:2009:2009 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n18:5556:5556 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.79<0> b17r4n08:18004:18004 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n18:5556:5556 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n05:16146:16146 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.66<0> b17r4n05:16146:16146 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n16:21053:21053 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n08:18003:18003 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n18:5554:5554 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n18:5553:5553 [1] NCCL INFO NET/Plugin : No plugin found 
(librccl-net-none.so), using internal implementation b17r4n06:14883:14883 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.67<0> b17r4n06:14883:14883 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n05:16146:16146 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n14:15272:15272 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n17:16115:16115 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n18:5555:5555 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n02:5835:5835 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.63<0> b17r4n02:5835:5835 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n07:16695:16695 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n09:16266:16266 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.70<0> b17r4n14:15273:15273 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n09:16266:16266 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n17:16113:16113 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n18:5556:5556 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n16:14827:14827 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.77<0> b17r4n07:16692:16692 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n16:14827:14827 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n09:16264:16264 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.70<0> b17r4n09:16264:16264 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n06:14882:14882 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.67<0> b17r4n06:14882:14882 [2] NCCL INFO Plugin name set by env to 
librccl-net-none.so b17r4n02:5836:5836 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.63<0> b17r4n02:5836:5836 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n00:1875:1875 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.61<0> b17r4n07:16693:16693 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n00:1875:1875 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n16:21054:21054 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.57<0> b17r3n16:21052:21052 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.57<0> b17r3n16:21054:21054 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n04:15592:15592 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.65<0> b17r4n04:15592:15592 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n16:21052:21052 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n07:16694:16694 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.68<0> b17r4n16:14825:14825 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.77<0> b17r4n07:16694:16694 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n16:14825:14825 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n15:5795:5795 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.76<0> b17r4n15:5795:5795 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n14:15274:15274 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.75<0> b17r4n14:15274:15274 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n02:5834:5834 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n16:21052:21052 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n16:21054:21054 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n04:15592:15592 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n04:15590:15590 [3] NCCL INFO 
Bootstrap : Using ib0:11.2.17.65<0> b17r4n12:2011:2011 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.73<0> b17r4n04:15590:15590 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n07:16694:16694 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n12:2011:2011 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n17:6032:6032 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n14:15275:15275 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.75<0> b17r4n14:15275:15275 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n14:15274:15274 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n02:5833:5833 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n17:6029:6029 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n04:15590:15590 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n14:15275:15275 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n17:6030:6030 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n02:5835:5835 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n15:5794:5794 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.76<0> b17r4n15:5794:5794 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n17:6031:6031 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n02:5836:5836 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n12:2010:2010 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n12:2012:2012 [2] NCCL INFO NET/Plugin : 
No plugin found (librccl-net-none.so), using internal implementation b17r4n06:14884:14884 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n06:14885:14885 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n09:16266:16266 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n09:16264:16264 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n00:1873:1873 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.61<0> b17r4n00:1873:1873 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n12:2009:2009 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n06:14883:14883 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n17:16112:16112 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.78<0> b17r4n09:16265:16265 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n16:14826:14826 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n17:16114:16114 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.78<0> b17r4n17:16112:16112 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n17:16114:16114 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n00:1874:1874 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n00:1875:1875 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n00:1873:1873 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n06:14882:14882 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n12:2011:2011 [3] NCCL INFO NET/Plugin : No plugin found 
(librccl-net-none.so), using internal implementation b17r4n16:14827:14827 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n17:16112:16112 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n17:16114:16114 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n16:14824:14824 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.77<0> b17r4n16:14824:14824 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n16:14825:14825 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n00:1872:1872 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.61<0> b17r4n00:1872:1872 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n16:14824:14824 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n09:16263:16263 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.70<0> b17r4n09:16263:16263 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n00:1872:1872 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n03:19959:19959 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.64<0> b17r4n03:19959:19959 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n09:16263:16263 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n15:5795:5795 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n15:5794:5794 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n15:5793:5793 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.76<0> b17r4n03:19960:19960 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.64<0> b17r4n03:19960:19960 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n15:5793:5793 [2] NCCL INFO Plugin name set by env to 
librccl-net-none.so b17r4n15:5796:5796 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.76<0> b17r4n15:5796:5796 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n15:5793:5793 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n15:5796:5796 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n03:19957:19957 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.64<0> b17r4n03:19958:19958 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.64<0> b17r4n03:19957:19957 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n03:19958:19958 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n03:19959:19959 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n03:19960:19960 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n03:19958:19958 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n03:19957:19957 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n16:21051:21051 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.57<0> b17r3n16:21051:21051 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r3n16:21051:21051 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n10:22360:22360 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.71<0> b17r4n10:22360:22360 [0] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n10:22361:22361 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.71<0> b17r4n10:22361:22361 [1] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n10:22360:22360 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n10:22361:22361 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n10:22358:22358 
[3] NCCL INFO Bootstrap : Using ib0:11.2.17.71<0> b17r4n10:22358:22358 [3] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n10:22358:22358 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r4n10:22359:22359 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.71<0> b17r4n10:22359:22359 [2] NCCL INFO Plugin name set by env to librccl-net-none.so b17r4n10:22359:22359 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation b17r3n15:29771:31477 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.56<0> b17r3n15:29771:31477 [1] NCCL INFO Using network IB b17r3n15:29772:31478 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.56<0> b17r3n15:29772:31478 [2] NCCL INFO Using network IB b17r3n15:29773:31483 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.56<0> b17r3n15:29773:31483 [3] NCCL INFO Using network IB b17r4n05:16148:17553 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.66<0> b17r4n05:16148:17553 [2] NCCL INFO Using network IB b17r4n05:16145:17555 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.66<0> b17r4n05:16145:17555 [1] NCCL INFO Using network IB b17r4n05:16146:17556 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.66<0> b17r4n05:16146:17556 [3] NCCL INFO Using network IB b17r4n05:16147:17546 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.66<0> b17r4n05:16147:17546 [0] NCCL INFO Using network IB b17r4n13:19752:21189 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.74<0> b17r4n13:19752:21189 [0] NCCL INFO Using network IB b17r4n13:19750:21192 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.74<0> b17r4n13:19750:21192 [2] NCCL INFO Using network IB b17r4n13:19749:21187 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.74<0> b17r4n13:19749:21187 [1] NCCL INFO Using network IB b17r4n13:19751:21190 [3] NCCL 
INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.74<0> b17r4n13:19751:21190 [3] NCCL INFO Using network IB b17r4n14:15272:16709 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.75<0> b17r4n14:15272:16709 [1] NCCL INFO Using network IB b17r4n14:15275:16716 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.75<0> b17r4n14:15275:16716 [3] NCCL INFO Using network IB b17r4n14:15273:16711 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.75<0> b17r4n14:15273:16711 [0] NCCL INFO Using network IB b17r4n14:15274:16717 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.75<0> b17r4n14:15274:16717 [2] NCCL INFO Using network IB b17r4n08:18006:19383 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.69<0> b17r4n08:18006:19383 [3] NCCL INFO Using network IB b17r4n08:18003:19384 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.69<0> b17r4n08:18003:19384 [0] NCCL INFO Using network IB b17r4n08:18004:19385 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.69<0> b17r4n08:18004:19385 [1] NCCL INFO Using network IB b17r4n08:18005:19379 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.69<0> b17r4n08:18005:19379 [2] NCCL INFO Using network IB b17r4n04:15592:16961 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.65<0> b17r4n04:15592:16961 [1] NCCL INFO Using network IB b17r4n04:15590:16962 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.65<0> b17r4n04:15590:16962 [3] NCCL INFO Using network IB b17r4n04:15591:16957 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.65<0> b17r4n04:15591:16957 [0] NCCL INFO Using network IB b17r4n04:15593:16956 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.65<0> b17r4n04:15593:16956 [2] NCCL INFO Using network IB b17r3n16:21052:22531 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.57<0> b17r3n16:21052:22531 [3] NCCL INFO Using network IB 
b17r3n16:21054:22535 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.57<0> b17r3n16:21054:22535 [2] NCCL INFO Using network IB b17r3n16:21053:22526 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.57<0> b17r3n16:21053:22526 [0] NCCL INFO Using network IB b17r4n07:16692:17986 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.68<0> b17r4n07:16692:17986 [3] NCCL INFO Using network IB b17r4n07:16694:17987 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.68<0> b17r4n07:16694:17987 [1] NCCL INFO Using network IB b17r4n07:16695:17983 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.68<0> b17r4n07:16695:17983 [0] NCCL INFO Using network IB b17r4n07:16693:17982 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.68<0> b17r4n07:16693:17982 [2] NCCL INFO Using network IB b17r4n17:16113:17593 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.78<0> b17r4n17:16113:17593 [0] NCCL INFO Using network IB b17r4n17:16115:17594 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.78<0> b17r4n17:16115:17594 [1] NCCL INFO Using network IB b17r4n17:16114:17602 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.78<0> b17r4n17:16114:17602 [3] NCCL INFO Using network IB b17r4n17:16112:17599 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.78<0> b17r4n17:16112:17599 [2] NCCL INFO Using network IB b17r4n18:5554:6818 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.79<0> b17r4n18:5554:6818 [0] NCCL INFO Using network IB b17r4n18:5553:6820 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.79<0> b17r4n18:5553:6820 [1] NCCL INFO Using network IB b17r4n18:5556:6821 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.79<0> b17r4n18:5556:6821 [3] NCCL INFO Using network IB b17r4n18:5555:6817 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.79<0> b17r4n18:5555:6817 [2] NCCL INFO 
Using network IB b17r3n17:6030:7475 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.58<0> b17r3n17:6030:7475 [3] NCCL INFO Using network IB b17r3n17:6032:7472 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.58<0> b17r3n17:6032:7472 [0] NCCL INFO Using network IB b17r3n17:6031:7471 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.58<0> b17r3n17:6031:7471 [1] NCCL INFO Using network IB b17r3n17:6029:7476 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.58<0> b17r3n17:6029:7476 [2] NCCL INFO Using network IB b17r4n02:5833:7113 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.63<0> b17r4n02:5833:7113 [1] NCCL INFO Using network IB b17r4n02:5835:7111 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.63<0> b17r4n02:5835:7111 [2] NCCL INFO Using network IB b17r4n02:5834:7112 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.63<0> b17r4n02:5834:7112 [0] NCCL INFO Using network IB b17r4n12:2009:3188 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.73<0> b17r4n12:2009:3188 [1] NCCL INFO Using network IB b17r4n02:5836:7114 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.63<0> b17r4n02:5836:7114 [3] NCCL INFO Using network IB b17r4n12:2012:3189 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.73<0> b17r4n12:2012:3189 [2] NCCL INFO Using network IB b17r4n12:2010:3191 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.73<0> b17r4n12:2010:3191 [0] NCCL INFO Using network IB b17r4n12:2011:3187 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.73<0> b17r4n12:2011:3187 [3] NCCL INFO Using network IB b17r4n06:14885:16285 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.67<0> b17r4n06:14885:16285 [0] NCCL INFO Using network IB b17r4n06:14884:16288 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.67<0> b17r4n06:14884:16288 [1] NCCL INFO Using network IB 
b17r4n06:14882:16289 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.67<0>
b17r4n06:14882:16289 [2] NCCL INFO Using network IB
b17r4n06:14883:16284 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.67<0>
b17r4n06:14883:16284 [3] NCCL INFO Using network IB
b17r4n16:14825:16209 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.77<0>
b17r4n16:14825:16209 [0] NCCL INFO Using network IB
b17r4n16:14824:16211 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.77<0>
b17r4n16:14824:16211 [3] NCCL INFO Using network IB
b17r4n16:14826:16210 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.77<0>
b17r4n16:14826:16210 [1] NCCL INFO Using network IB
b17r4n16:14827:16212 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.77<0>
b17r4n16:14827:16212 [2] NCCL INFO Using network IB
b17r4n00:1875:3314 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.61<0>
b17r4n00:1875:3314 [2] NCCL INFO Using network IB
b17r4n00:1872:3313 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.61<0>
b17r4n00:1872:3313 [0] NCCL INFO Using network IB
b17r4n00:1874:3315 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.61<0>
b17r4n00:1874:3315 [1] NCCL INFO Using network IB
b17r4n00:1873:3317 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.61<0>
b17r4n00:1873:3317 [3] NCCL INFO Using network IB
b17r4n09:16265:17674 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.70<0>
b17r4n09:16265:17674 [1] NCCL INFO Using network IB
b17r4n09:16264:17677 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.70<0>
b17r4n09:16264:17677 [0] NCCL INFO Using network IB
b17r4n09:16263:17675 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.70<0>
b17r4n09:16263:17675 [3] NCCL INFO Using network IB
b17r4n09:16266:17676 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.70<0>
b17r4n09:16266:17676 [2] NCCL INFO Using network IB
b17r4n15:5794:6948 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.76<0>
b17r4n15:5794:6948 [0] NCCL INFO Using network IB
b17r4n15:5795:6947 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.76<0>
b17r4n15:5795:6947 [1] NCCL INFO Using network IB
b17r4n15:5793:6951 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.76<0>
b17r4n15:5793:6951 [2] NCCL INFO Using network IB
b17r4n15:5796:6949 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.76<0>
b17r4n15:5796:6949 [3] NCCL INFO Using network IB
b17r3n16:21051:22546 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.57<0>
b17r3n16:21051:22546 [1] NCCL INFO Using network IB
b17r4n03:19959:21381 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.64<0>
b17r4n03:19959:21381 [0] NCCL INFO Using network IB
b17r4n03:19960:21384 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.64<0>
b17r4n03:19960:21384 [1] NCCL INFO Using network IB
b17r4n03:19957:21385 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.64<0>
b17r4n03:19957:21385 [2] NCCL INFO Using network IB
b17r4n03:19958:21383 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.64<0>
b17r4n03:19958:21383 [3] NCCL INFO Using network IB
b17r4n10:22360:23820 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.71<0>
b17r4n10:22360:23820 [0] NCCL INFO Using network IB
b17r4n10:22359:23823 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.71<0>
b17r4n10:22359:23823 [2] NCCL INFO Using network IB
b17r4n10:22361:23819 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.71<0>
b17r4n10:22361:23819 [1] NCCL INFO Using network IB
b17r4n10:22358:23822 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.71<0>
b17r4n10:22358:23822 [3] NCCL INFO Using network IB
b17r3n18:12120:12120 [0] NCCL INFO Bootstrap : Using ib0:11.2.17.59<0>
b17r3n18:12120:12120 [0] NCCL INFO Plugin name set by env to librccl-net-none.so
b17r3n18:12118:12118 [3] NCCL INFO Bootstrap : Using ib0:11.2.17.59<0>
b17r3n18:12118:12118 [3] NCCL INFO Plugin name set by env to librccl-net-none.so
b17r3n18:12120:12120 [0] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation
b17r3n18:12118:12118 [3] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation
b17r3n18:12119:12119 [1] NCCL INFO Bootstrap : Using ib0:11.2.17.59<0>
b17r3n18:12119:12119 [1] NCCL INFO Plugin name set by env to librccl-net-none.so
b17r3n18:12119:12119 [1] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation
b17r3n18:12117:12117 [2] NCCL INFO Bootstrap : Using ib0:11.2.17.59<0>
b17r3n18:12117:12117 [2] NCCL INFO Plugin name set by env to librccl-net-none.so
b17r3n18:12117:12117 [2] NCCL INFO NET/Plugin : No plugin found (librccl-net-none.so), using internal implementation
b17r3n18:12118:13578 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.59<0>
b17r3n18:12118:13578 [3] NCCL INFO Using network IB
b17r3n18:12119:13584 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.59<0>
b17r3n18:12119:13584 [1] NCCL INFO Using network IB
b17r3n18:12117:13585 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.59<0>
b17r3n18:12117:13585 [2] NCCL INFO Using network IB
b17r3n18:12120:13577 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [RO]; OOB ib0:11.2.17.59<0>
b17r3n18:12120:13577 [0] NCCL INFO Using network IB