============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=606889206 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== __________________ ERROR collecting lightning/test_simple.py ___________________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/lightning/test_simple.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) lightning/test_simple.py:7: in from pytorch_lightning import LightningModule, Trainer E ModuleNotFoundError: No module named 'pytorch_lightning' =========================== short test summary info ============================ ERROR lightning/test_simple.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.23s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=633467180 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== ____________ ERROR collecting model/BingBertSquad/test_e2e_squad.py ____________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/model/BingBertSquad/test_e2e_squad.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) model/BingBertSquad/test_e2e_squad.py:14: in import evaluate as eval E ModuleNotFoundError: No module named 'evaluate' =========================== short test summary info ============================ ERROR model/BingBertSquad/test_e2e_squad.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.23s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1594543543 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items ============================ no tests ran in 0.90s ============================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1727951244 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 0 items / 1 error ==================================== ERRORS ==================================== _________________ ERROR collecting onebit/test_nccl_backend.py _________________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/onebit/test_nccl_backend.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) onebit/test_nccl_backend.py:13: in from deepspeed.runtime.comm.nccl import NcclBackend /usr/local/lib/python3.7/site-packages/deepspeed/runtime/comm/nccl.py:8: in import cupy E ModuleNotFoundError: No module named 'cupy' =========================== short test summary info ============================ ERROR onebit/test_nccl_backend.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.24s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2547395270 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== _________________ ERROR collecting onebit/test_mpi_backend.py __________________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/onebit/test_mpi_backend.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) onebit/test_mpi_backend.py:12: in from deepspeed.runtime.comm.mpi import MpiBackend /usr/local/lib/python3.7/site-packages/deepspeed/runtime/comm/mpi.py:7: in import cupy E ModuleNotFoundError: No module named 'cupy' ------------------------------- Captured stderr -------------------------------- -------------------------------------------------------------------------- No OpenFabrics connection schemes reported that they were able to be used on a specific port. As such, the openib BTL (OpenFabrics support) will be disabled for this port. Local host: 26388537c721 Local device: mlx5_0 Local port: 1 CPCs attempted: rdmacm, udcm -------------------------------------------------------------------------- =========================== short test summary info ============================ ERROR onebit/test_mpi_backend.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.44s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2302194531 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== ___________________ ERROR collecting onebit/test_mpi_perf.py ___________________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/onebit/test_mpi_perf.py'. Hint: make sure your test modules/packages have valid Python names. 
Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) onebit/test_mpi_perf.py:10: in from deepspeed.runtime.comm.mpi import MpiBackend /usr/local/lib/python3.7/site-packages/deepspeed/runtime/comm/mpi.py:7: in import cupy E ModuleNotFoundError: No module named 'cupy' ------------------------------- Captured stderr -------------------------------- -------------------------------------------------------------------------- No OpenFabrics connection schemes reported that they were able to be used on a specific port. As such, the openib BTL (OpenFabrics support) will be disabled for this port. Local host: 26388537c721 Local device: mlx5_0 Local port: 1 CPCs attempted: rdmacm, udcm -------------------------------------------------------------------------- =========================== short test summary info ============================ ERROR onebit/test_mpi_perf.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.44s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=238941418 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== __________________ ERROR collecting onebit/test_nccl_perf.py ___________________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/onebit/test_nccl_perf.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) onebit/test_nccl_perf.py:13: in from deepspeed.runtime.comm.nccl import NcclBackend /usr/local/lib/python3.7/site-packages/deepspeed/runtime/comm/nccl.py:8: in import cupy E ModuleNotFoundError: No module named 'cupy' =========================== short test summary info ============================ ERROR onebit/test_nccl_perf.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.22s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=319756802 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== __________ ERROR collecting small_model_debugging/test_mics_config.py __________ small_model_debugging/test_mics_config.py:69: in rank = int(os.environ['RANK']) /usr/local/lib/python3.7/os.py:681: in __getitem__ raise KeyError(key) from None E KeyError: 'RANK' =========================== short test summary info ============================ ERROR small_model_debugging/test_mics_config.py - KeyError: 'RANK' !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! 
=============================== 1 error in 1.30s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2887553514 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== _____________ ERROR collecting small_model_debugging/test_model.py _____________ small_model_debugging/test_model.py:66: in rank = int(os.environ['RANK']) /usr/local/lib/python3.7/os.py:681: in __getitem__ raise KeyError(key) from None E KeyError: 'RANK' =========================== short test summary info ============================ ERROR small_model_debugging/test_model.py - KeyError: 'RANK' !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.31s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=4135786312 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 10 items unit/autotuning/test_autotuning.py::test_resource_manager_arg_mappings[arg_mappings2] PASSED [ 10%] unit/autotuning/test_autotuning.py::test_resource_manager_arg_mappings[arg_mappings3] PASSED [ 20%] unit/autotuning/test_autotuning.py::test_autotuner_resources[active_resources1] PASSED [ 30%] unit/autotuning/test_autotuning.py::test_command_line PASSED [ 40%] unit/autotuning/test_autotuning.py::test_autotuner_resources[active_resources3] PASSED [ 50%] unit/autotuning/test_autotuning.py::test_resource_manager_arg_mappings[arg_mappings4] PASSED [ 60%] unit/autotuning/test_autotuning.py::test_resource_manager_arg_mappings[arg_mappings1] PASSED [ 70%] unit/autotuning/test_autotuning.py::test_autotuner_resources[active_resources2] PASSED [ 80%] unit/autotuning/test_autotuning.py::test_autotuner_resources[active_resources0] PASSED [ 90%] unit/autotuning/test_autotuning.py::test_resource_manager_arg_mappings[None] PASSED [100%] =============================== warnings summary =============================== unit/autotuning/test_autotuning.py::test_resource_manager_arg_mappings[arg_mappings2] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/autotuning/test_autotuning.py::test_resource_manager_arg_mappings[arg_mappings2] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (30 durations < 1s hidden. Use -vv to show these durations.) 
======================== 10 passed, 2 warnings in 0.97s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=169367059 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 22 items unit/comm/test_dist.py::TestGroupedDistTest::test_two[1138] PASSED [ 4%] unit/comm/test_dist.py::TestGroupedDistTest::test_one[1138] PASSED [ 9%] unit/comm/test_dist.py::TestDistributedFixture::test[2-16] PASSED [ 13%] unit/comm/test_dist.py::TestDistributedFixture::test[4-32] PASSED [ 18%] unit/comm/test_dist.py::TestDistributedFixture::test[4-16] PASSED [ 22%] unit/comm/test_dist.py::TestDistributedFixture::test[2-32] PASSED [ 27%] unit/comm/test_dist.py::TestInit::test PASSED [ 31%] unit/comm/test_dist.py::TestDistInitWithModel::test_no_init[True] PASSED [ 36%] unit/comm/test_dist.py::TestDistInitWithModel::test_no_init[False] PASSED [ 40%] unit/comm/test_dist.py::TestDistInitWithModel::test_already_init[True] PASSED [ 45%] unit/comm/test_dist.py::TestDistInitWithModel::test_already_init[False] PASSED [ 50%] unit/comm/test_dist.py::TestDistInit::test_already_init[True] PASSED [ 54%] unit/comm/test_dist.py::TestDistInit::test_already_init[False] PASSED [ 59%] unit/comm/test_dist.py::TestDistInit::test_already_init[None] PASSED [ 63%] unit/comm/test_dist.py::TestDistInit::test_no_init[True] PASSED [ 68%] unit/comm/test_dist.py::TestDistInit::test_no_init[False] PASSED [ 72%] unit/comm/test_dist.py::TestDistInit::test_no_init[None] PASSED [ 77%] unit/comm/test_dist.py::TestDistAllReduce::test PASSED [ 81%] unit/comm/test_dist.py::TestDistInitNoEnv::test PASSED [ 86%] unit/comm/test_dist.py::TestDistArgs::test[hello-icosahedron-1138-purple] PASSED [ 90%] unit/comm/test_dist.py::TestWorldSizeOverrideDistTest::test_world_size_2 PASSED [ 95%] unit/comm/test_dist.py::TestWorldSizeOverrideDistTest::test_world_size_1 PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 unit/comm/test_dist.py::TestDistributedFixture::test[2-16] unit/comm/test_dist.py::TestDistributedFixture::test[2-16] unit/comm/test_dist.py::TestDistributedFixture::test[4-32] unit/comm/test_dist.py::TestDistributedFixture::test[4-32] unit/comm/test_dist.py::TestDistributedFixture::test[4-16] unit/comm/test_dist.py::TestDistributedFixture::test[4-16] unit/comm/test_dist.py::TestDistributedFixture::test[2-32] unit/comm/test_dist.py::TestDistributedFixture::test[2-32] /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/comm/test_dist.py:70 /home/aishsh/ds-v0.9.2/tests/unit/comm/test_dist.py:70: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? 
You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(1) unit/comm/test_dist.py::TestGroupedDistTest::test_two[1138] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/comm/test_dist.py::TestGroupedDistTest::test_two[1138] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 13.64s call unit/comm/test_dist.py::TestDistAllReduce::test 6.02s call unit/comm/test_dist.py::TestDistInitWithModel::test_already_init[False] 5.22s call unit/comm/test_dist.py::TestWorldSizeOverrideDistTest::test_world_size_2 5.22s call unit/comm/test_dist.py::TestDistInit::test_already_init[True] 5.21s call unit/comm/test_dist.py::TestGroupedDistTest::test_one[1138] 5.12s call unit/comm/test_dist.py::TestDistInit::test_already_init[None] 5.11s call unit/comm/test_dist.py::TestDistInitWithModel::test_no_init[True] 5.11s setup unit/comm/test_dist.py::TestDistributedFixture::test[4-32] 5.11s call unit/comm/test_dist.py::TestDistInitWithModel::test_already_init[True] 4.77s call unit/comm/test_dist.py::TestGroupedDistTest::test_two[1138] 4.41s call unit/comm/test_dist.py::TestInit::test 4.32s call unit/comm/test_dist.py::TestDistInit::test_already_init[False] 4.31s setup unit/comm/test_dist.py::TestDistributedFixture::test[2-32] 4.31s call unit/comm/test_dist.py::TestDistInit::test_no_init[True] 4.22s call unit/comm/test_dist.py::TestDistArgs::test[hello-icosahedron-1138-purple] 4.22s call unit/comm/test_dist.py::TestDistInit::test_no_init[None] 4.21s setup unit/comm/test_dist.py::TestDistributedFixture::test[2-16] 4.21s setup unit/comm/test_dist.py::TestDistributedFixture::test[4-16] 4.01s call unit/comm/test_dist.py::TestDistInitNoEnv::test 4.01s call unit/comm/test_dist.py::TestDistInitWithModel::test_no_init[False] 4.01s call unit/comm/test_dist.py::TestDistributedFixture::test[2-32] 4.01s call unit/comm/test_dist.py::TestDistributedFixture::test[4-16] 4.01s call unit/comm/test_dist.py::TestDistributedFixture::test[2-16] 4.01s call unit/comm/test_dist.py::TestDistributedFixture::test[4-32] 3.91s call unit/comm/test_dist.py::TestDistInit::test_no_init[False] 3.91s call unit/comm/test_dist.py::TestWorldSizeOverrideDistTest::test_world_size_1 (40 durations < 1s hidden. Use -vv to show these durations.) ================= 22 passed, 12 warnings in 127.68s (0:02:07) ================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3236324661 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 3 items unit/compression/test_compression.py::TestCompression::test_linear_layer_compress PASSED [ 33%] unit/compression/test_compression.py::TestCompression::test_conv1d_convertion PASSED [ 66%] unit/compression/test_compression.py::TestCompression::test_mpu_compress SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/compression/test_compression.py::TestCompression::test_linear_layer_compress /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/compression/test_compression.py::TestCompression::test_linear_layer_compress /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 4.86s call unit/compression/test_compression.py::TestCompression::test_linear_layer_compress 4.31s call unit/compression/test_compression.py::TestCompression::test_conv1d_convertion (6 durations < 1s hidden. Use -vv to show these durations.) ================== 2 passed, 1 skipped, 3 warnings in 10.10s =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=4149175560 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 12 items unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-1-4] PASSED [ 8%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-2-2] PASSED [ 16%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-False-2-2] PASSED [ 25%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-False-2-2] PASSED [ 33%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-False-1-4] PASSED [ 41%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-True-1-4] PASSED [ 50%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-1-2] PASSED [ 58%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-False-1-2] PASSED [ 66%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-False-1-2] PASSED [ 75%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-False-1-4] PASSED [ 83%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-True-2-2] PASSED [ 91%] unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-True-1-2] PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-1-4] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-1-4] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.82s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-False-2-2] 5.82s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-False-1-2] 5.72s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-True-1-4] 5.72s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-False-1-4] 5.72s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-True-1-2] 5.62s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-2-2] 5.52s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-False-1-4] 5.52s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-True-2-2] 5.38s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-1-4] 4.82s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[True-False-1-2] 4.72s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-False-2-2] 4.72s call unit/moe/test_moe_tp.py::TestMOETensorParallel::test[False-True-1-2] (24 durations < 1s hidden. Use -vv to show these durations.) ================== 12 passed, 3 warnings in 66.11s (0:01:06) =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=379413680 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 6 items unit/moe/test_moe.py::TestPRMoE::test[2-False] PASSED [ 16%] unit/moe/test_moe.py::TestPRMoE::test[2-True] PASSED [ 33%] unit/moe/test_moe.py::TestMoE::test[True-4] PASSED [ 50%] unit/moe/test_moe.py::TestMoE::test[True-2] PASSED [ 66%] unit/moe/test_moe.py::TestMoE::test[False-4] PASSED [ 83%] unit/moe/test_moe.py::TestMoE::test[False-2] PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/moe/test_moe.py::TestPRMoE::test[2-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/moe/test_moe.py::TestPRMoE::test[2-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 10.53s call unit/moe/test_moe.py::TestPRMoE::test[2-True] 10.41s call unit/moe/test_moe.py::TestPRMoE::test[2-False] 10.33s call unit/moe/test_moe.py::TestMoE::test[True-4] 9.93s call unit/moe/test_moe.py::TestMoE::test[False-4] 9.63s call unit/moe/test_moe.py::TestMoE::test[False-2] 9.13s call unit/moe/test_moe.py::TestMoE::test[True-2] (12 durations < 1s hidden. Use -vv to show these durations.) =================== 6 passed, 3 warnings in 60.91s (0:01:00) =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2867340693 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 6 items unit/monitor/test_monitor.py::TestWandB::test_empty_wandb FAILED [ 16%] unit/monitor/test_monitor.py::TestWandB::test_wandb FAILED [ 33%] unit/monitor/test_monitor.py::TestCSVMonitor::test_csv_monitor PASSED [ 50%] unit/monitor/test_monitor.py::TestCSVMonitor::test_empty_csv_monitor PASSED [ 66%] unit/monitor/test_monitor.py::TestTensorBoard::test_empty_tensorboard PASSED [ 83%] unit/monitor/test_monitor.py::TestTensorBoard::test_tensorboard PASSED [100%] =================================== FAILURES =================================== __________________________ TestWandB.test_empty_wandb __________________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:03:22,266] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl If you want to use wandb logging, please `pip install wandb` and follow the instructions at https://docs.wandb.ai/quickstart If you want to use wandb logging, please `pip install wandb` and follow the instructions at https://docs.wandb.ai/quickstart ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:03:23.230263 463215 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:03:23.230233 462960 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:03:23.237951 463217 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:03:23.237938 462959 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:03:23.238451 462959 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:03:23.242074 462960 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-2: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/monitor/test_monitor.py", line 66, in test_empty_wandb wandb_monitor = WandbMonitor(ds_config.monitor_config.wandb) File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/wandb.py", line 16, in __init__ check_wandb_availability() File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/utils.py", line 19, in check_wandb_availability import wandb # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/__init__.py", line 26, in from wandb import sdk as wandb_sdk File "/usr/local/lib/python3.7/site-packages/wandb/sdk/__init__.py", line 3, in from . import wandb_helper as helper # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/sdk/wandb_helper.py", line 6, in from .lib import config_util File "/usr/local/lib/python3.7/site-packages/wandb/sdk/lib/config_util.py", line 10, in from wandb.util import load_yaml File "/usr/local/lib/python3.7/site-packages/wandb/util.py", line 47, in import requests File "/usr/local/lib/python3.7/site-packages/requests/__init__.py", line 43, in import urllib3 File "/usr/local/lib/python3.7/site-packages/urllib3/__init__.py", line 39, in "urllib3 v2.0 only supports OpenSSL 1.1.1+, currently " ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2k-fips 26 Jan 2017. 
See: https://github.com/urllib3/urllib3/issues/2168 Process Process-1: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/monitor/test_monitor.py", line 66, in test_empty_wandb wandb_monitor = WandbMonitor(ds_config.monitor_config.wandb) File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/wandb.py", line 16, in __init__ check_wandb_availability() File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/utils.py", line 19, in check_wandb_availability import wandb # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/__init__.py", line 26, in from wandb import sdk as wandb_sdk File "/usr/local/lib/python3.7/site-packages/wandb/sdk/__init__.py", line 3, in from . import wandb_helper as helper # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/sdk/wandb_helper.py", line 6, in from .lib import config_util File "/usr/local/lib/python3.7/site-packages/wandb/sdk/lib/config_util.py", line 10, in from wandb.util import load_yaml File "/usr/local/lib/python3.7/site-packages/wandb/util.py", line 47, in import requests File "/usr/local/lib/python3.7/site-packages/requests/__init__.py", line 43, in import urllib3 File "/usr/local/lib/python3.7/site-packages/urllib3/__init__.py", line 39, in "urllib3 v2.0 only supports OpenSSL 1.1.1+, currently " ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2k-fips 26 Jan 2017. See: https://github.com/urllib3/urllib3/issues/2168 _____________________________ TestWandB.test_wandb _____________________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:03:27,083] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl If you want to use wandb logging, please `pip install wandb` and follow the instructions at https://docs.wandb.ai/quickstart If you want to use wandb logging, please `pip install wandb` and follow the instructions at https://docs.wandb.ai/quickstart ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:03:28.083428 463242 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:03:28.083456 463497 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:03:28.086393 463499 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:03:28.086391 463241 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:03:28.086941 463241 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:03:28.094296 463242 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-3: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/monitor/test_monitor.py", line 57, in test_wandb wandb_monitor = WandbMonitor(ds_config.monitor_config.wandb) File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/wandb.py", line 16, in __init__ check_wandb_availability() File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/utils.py", line 19, in check_wandb_availability import wandb # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/__init__.py", line 26, in from wandb import sdk as wandb_sdk File "/usr/local/lib/python3.7/site-packages/wandb/sdk/__init__.py", line 3, in from . import wandb_helper as helper # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/sdk/wandb_helper.py", line 6, in from .lib import config_util File "/usr/local/lib/python3.7/site-packages/wandb/sdk/lib/config_util.py", line 10, in from wandb.util import load_yaml File "/usr/local/lib/python3.7/site-packages/wandb/util.py", line 47, in import requests File "/usr/local/lib/python3.7/site-packages/requests/__init__.py", line 43, in import urllib3 File "/usr/local/lib/python3.7/site-packages/urllib3/__init__.py", line 39, in "urllib3 v2.0 only supports OpenSSL 1.1.1+, currently " ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2k-fips 26 Jan 2017. 
See: https://github.com/urllib3/urllib3/issues/2168 Process Process-4: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/monitor/test_monitor.py", line 57, in test_wandb wandb_monitor = WandbMonitor(ds_config.monitor_config.wandb) File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/wandb.py", line 16, in __init__ check_wandb_availability() File "/usr/local/lib/python3.7/site-packages/deepspeed/monitor/utils.py", line 19, in check_wandb_availability import wandb # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/__init__.py", line 26, in from wandb import sdk as wandb_sdk File "/usr/local/lib/python3.7/site-packages/wandb/sdk/__init__.py", line 3, in from . import wandb_helper as helper # noqa: F401 File "/usr/local/lib/python3.7/site-packages/wandb/sdk/wandb_helper.py", line 6, in from .lib import config_util File "/usr/local/lib/python3.7/site-packages/wandb/sdk/lib/config_util.py", line 10, in from wandb.util import load_yaml File "/usr/local/lib/python3.7/site-packages/wandb/util.py", line 47, in import requests File "/usr/local/lib/python3.7/site-packages/requests/__init__.py", line 43, in import urllib3 File "/usr/local/lib/python3.7/site-packages/urllib3/__init__.py", line 39, in "urllib3 v2.0 only supports OpenSSL 1.1.1+, currently " ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2k-fips 26 Jan 2017. See: https://github.com/urllib3/urllib3/issues/2168 =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/monitor/test_monitor.py::TestWandB::test_empty_wandb /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/monitor/test_monitor.py::TestWandB::test_empty_wandb /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.62s call unit/monitor/test_monitor.py::TestTensorBoard::test_tensorboard 5.29s call unit/monitor/test_monitor.py::TestWandB::test_empty_wandb 5.22s call unit/monitor/test_monitor.py::TestTensorBoard::test_empty_tensorboard 4.82s call unit/monitor/test_monitor.py::TestWandB::test_wandb 4.22s call unit/monitor/test_monitor.py::TestCSVMonitor::test_csv_monitor 4.21s call unit/monitor/test_monitor.py::TestCSVMonitor::test_empty_csv_monitor (12 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED unit/monitor/test_monitor.py::TestWandB::test_empty_wandb FAILED unit/monitor/test_monitor.py::TestWandB::test_wandb =================== 2 failed, 4 passed, 3 warnings in 30.42s =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2456007268 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 2 items unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[True] PASSED [ 50%] unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[False] PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[True] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[True] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 12.17s call unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[True] 11.63s call unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[False] (4 durations < 1s hidden. Use -vv to show these durations.) 
======================== 2 passed, 3 warnings in 24.73s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2679895035 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 2 items unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test PASSED [ 50%] unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test_flops_profiler_in_inference PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 9.01s call unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test 4.32s call unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test_flops_profiler_in_inference (4 durations < 1s hidden. Use -vv to show these durations.) ======================== 2 passed, 3 warnings in 14.36s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=345477723 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 2 items unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_missing_latest PASSED [ 50%] unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_existing_latest PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_missing_latest /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_missing_latest /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 8.22s call unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_existing_latest 5.20s call unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_missing_latest (4 durations < 1s hidden. Use -vv to show these durations.) ======================== 2 passed, 3 warnings in 14.44s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=849818478 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 12 items unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[0-False] PASSED [ 8%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[3-False] PASSED [ 16%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[2-False] PASSED [ 25%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[2-True] FAILED [ 33%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[3-True] FAILED [ 41%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[2-False] PASSED [ 50%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[1-False] PASSED [ 58%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[0-False] PASSED [ 66%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[1-False] PASSED [ 75%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[3-False] PASSED [ 83%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[3-True] FAILED [ 91%] unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[2-True] FAILED [100%] =================================== FAILURES =================================== ______ TestLRSchedulerCheckpoint.test_checkpoint_no_lr_scheduler[2-True] _______ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:05:26,540] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:05:27,817] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 
03:05:27,818] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:05:27,820] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:05:27,870] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:05:27.547930 467700 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:05:27.547915 467445 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:05:27.555192 467702 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:05:27.555171 467444 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:05:27.557018 467444 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:05:27.559008 467445 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:05:27.822145 467444 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:05:27.822170 467726 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:05:27.824822 467727 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
I0527 03:05:27.824822 467445 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-7: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 113, in test_checkpoint_no_lr_scheduler load_lr_scheduler_states=False) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-8: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File 
"/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 113, in test_checkpoint_no_lr_scheduler load_lr_scheduler_states=False) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ________ TestLRSchedulerCheckpoint.test_checkpoint_lr_scheduler[3-True] ________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:05:32,292] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:05:33,504] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:05:33,504] [INFO] [partition_parameters.py:454:__exit__] finished initializing model with 0.00B parameters [2023-05-27 03:05:33,505] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:05:33,506] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:05:33,511] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:05:33.257259 467760 ProcessGroupNCCL.cpp:500] [Rank 1] 
ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:05:33.257406 468015 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:05:33.265378 468017 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:05:33.265358 467759 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:05:33.266268 467759 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:05:33.268513 467760 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:05:33.508345 468041 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:05:33.508340 467760 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:05:33.510841 467759 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:05:33.510848 468042 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
Process Process-10: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 69, in test_checkpoint_lr_scheduler load_lr_scheduler_states=True) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-9: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 69, in test_checkpoint_lr_scheduler load_lr_scheduler_states=True) File 
"/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______ TestLRSchedulerCheckpoint.test_checkpoint_no_lr_scheduler[3-True] _______ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:06:27,390] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:06:27,696] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:06:27,699] [INFO] [partition_parameters.py:454:__exit__] finished initializing model with 0.00B parameters [2023-05-27 03:06:27,700] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:06:27,702] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:06:27,708] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:06:27.427412 469878 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
I0527 03:06:27.427402 469623 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:06:27.432334 469622 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:06:27.432348 469880 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:06:27.432655 469622 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:06:27.438056 469623 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:06:27.699025 469623 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:06:27.699033 469904 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:06:27.707826 469622 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:06:27.707891 469905 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
Process Process-21: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 113, in test_checkpoint_no_lr_scheduler load_lr_scheduler_states=False) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-22: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 113, in test_checkpoint_no_lr_scheduler load_lr_scheduler_states=False) File 
"/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ________ TestLRSchedulerCheckpoint.test_checkpoint_lr_scheduler[2-True] ________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:06:32,230] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:06:33,319] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:06:33,320] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:06:33,321] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:06:33,361] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:06:33.054982 470186 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
I0527 03:06:33.054977 469931 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:06:33.061168 470188 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:06:33.061168 469930 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:06:33.061695 469930 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:06:33.065780 469931 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:06:33.323458 470212 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:06:33.323458 469931 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:06:33.325616 470213 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:06:33.325613 469930 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-24: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 69, in test_checkpoint_lr_scheduler load_lr_scheduler_states=True) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File 
"/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-23: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/test_lr_scheduler.py", line 69, in test_checkpoint_lr_scheduler load_lr_scheduler_states=True) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 159, in checkpoint_correctness_verification ds_model = create_deepspeed_model(config_dict=config_dict, model=models[0], base_optimizer=base_optimizers[0]) File "/home/aishsh/ds-v0.9.2/tests/unit/checkpoint/common.py", line 141, in create_deepspeed_model optimizer=base_optimizer) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: 
/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[0-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[0-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 12.13s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[3-False] 10.92s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[3-False] 10.22s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[1-False] 10.12s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[2-False] 9.32s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[2-False] 9.22s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[0-False] 9.08s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[0-False] 9.03s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[1-False] 5.81s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[2-True] 5.61s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[3-True] 5.52s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[2-True] 4.78s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[3-True] (24 durations < 1s hidden. Use -vv to show these durations.) 
=========================== short test summary info ============================
FAILED unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[2-True]
FAILED unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[3-True]
FAILED unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[3-True]
FAILED unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_lr_scheduler[2-True]
============= 4 failed, 8 passed, 3 warnings in 102.86s (0:01:42) ==============
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=1888914074
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 5 items
unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[4-False] PASSED [ 20%]
unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[2-True] PASSED [ 40%]
unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe[4] PASSED [ 60%]
unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[4-True] PASSED [ 80%]
unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[2-False] PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
  /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
    from collections import deque

unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[4-False]
  /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
    "Running test without verifying torch version, please provide an expected torch version with --torch_ver")

unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[4-False]
  /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
    "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
11.09s call unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[4-False]
10.63s call unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[4-True]
10.44s call unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[2-False]
10.33s call unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe_and_zero[2-True]
10.13s call unit/checkpoint/test_moe_checkpoint.py::TestMoECheckpoint::test_checkpoint_moe[4]
(10 durations < 1s hidden. Use -vv to show these durations.)
======================== 5 passed, 3 warnings in 53.57s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3994728316 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 3 items unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fused_optimizer PASSED [ 33%] unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_unfused_optimizer PASSED [ 66%] unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fp32_optimizer PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fused_optimizer /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fused_optimizer /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 10.52s call unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fused_optimizer 10.42s call unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_unfused_optimizer 9.32s call unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fp32_optimizer (6 durations < 1s hidden. Use -vv to show these durations.) ======================== 3 passed, 3 warnings in 31.22s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1138316618 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 4 items unit/checkpoint/test_reshape_checkpoint.py::test_reshape_222_to_211 PASSED [ 25%] unit/checkpoint/test_reshape_checkpoint.py::test_reshape_222_to_121 PASSED [ 50%] unit/checkpoint/test_reshape_checkpoint.py::test_reshape_222_to_111 PASSED [ 75%] unit/checkpoint/test_reshape_checkpoint.py::test_reshape_222_to_122 PASSED [100%] =============================== warnings summary =============================== unit/checkpoint/test_reshape_checkpoint.py::test_reshape_222_to_211 /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_reshape_checkpoint.py::test_reshape_222_to_211 /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (12 durations < 1s hidden. Use -vv to show these durations.) ======================== 4 passed, 2 warnings in 0.97s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3543619067 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 4 items unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[IGNORE] PASSED [ 25%] unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[WARN] PASSED [ 50%] unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[FAIL] PASSED [ 75%] unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unknown_tag_validation PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[IGNORE] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[IGNORE] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 6.42s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[IGNORE] 6.02s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[WARN] 5.22s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unknown_tag_validation 5.12s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[FAIL] (8 durations < 1s hidden. Use -vv to show these durations.) ======================== 4 passed, 3 warnings in 23.72s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1608075071 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 3 items unit/checkpoint/test_pipeline.py::TestPipelineCheckpoint::test_checkpoint_pipe_engine[0] PASSED [ 33%] unit/checkpoint/test_pipeline.py::TestPipelineCheckpoint::test_checkpoint_pipe_engine[1] PASSED [ 66%] unit/checkpoint/test_pipeline.py::TestPipelineCheckpoint::test_checkpoint_pipe_module[base_topo0-test_topo0] SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/checkpoint/test_pipeline.py::TestPipelineCheckpoint::test_checkpoint_pipe_engine[0] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_pipeline.py::TestPipelineCheckpoint::test_checkpoint_pipe_engine[0] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 13.90s call unit/checkpoint/test_pipeline.py::TestPipelineCheckpoint::test_checkpoint_pipe_engine[0] 13.75s call unit/checkpoint/test_pipeline.py::TestPipelineCheckpoint::test_checkpoint_pipe_engine[1] (6 durations < 1s hidden. Use -vv to show these durations.) ================== 2 passed, 1 skipped, 3 warnings in 28.58s =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2965898371 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 9 items unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-True-True] PASSED [ 11%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-False-False] PASSED [ 22%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-True-True] PASSED [ 33%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-False-False] PASSED [ 44%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-True-False] PASSED [ 55%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-True-True] PASSED [ 66%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-False-False] PASSED [ 77%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-True-False] PASSED [ 88%] unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-True-False] PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-True-True] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-True-True] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.77s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-True-True] 5.41s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-False-False] 5.21s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-True-False] 4.41s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-True-True] 4.41s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-False-False] 4.41s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-True-False] 4.31s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-True-False] 4.31s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-False-False] 4.31s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-True-True] (18 durations < 1s hidden. Use -vv to show these durations.) ======================== 9 passed, 3 warnings in 43.53s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2213585216 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 53 items unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[3] PASSED [ 1%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_not_load_optimizer_state[2] PASSED [ 3%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[2] PASSED [ 5%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_module_only[3] PASSED [ 7%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_not_load_optimizer_state[1] PASSED [ 9%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_module_only[1] PASSED [ 11%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_module_only[2] PASSED [ 13%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_not_load_optimizer_state[3] PASSED [ 15%] unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[1] PASSED [ 16%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[1] PASSED [ 18%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[3] PASSED [ 20%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[3] PASSED [ 22%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[0] PASSED [ 24%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[0] PASSED [ 26%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[2] PASSED [ 28%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[0] PASSED [ 30%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[2] PASSED [ 32%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[1] PASSED [ 33%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[1] PASSED [ 35%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[2] PASSED [ 37%] unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[3] PASSED [ 39%] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-True-True] PASSED [ 41%] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_fixed_dp[False-False-False] PASSED [ 43%] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_fixed_dp[True-True-True] PASSED [ 45%] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-False-False] PASSED [ 47%] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[True-True-False] PASSED [ 49%] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_fixed_dp[True-True-False] PASSED [ 50%] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_fixed_dp[True-False-False] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 
unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-True-True] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-True-True] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-False-False] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-False-False] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[True-True-False] unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[True-True-False] /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[3] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[3] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 12.38s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[3] 11.93s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_not_load_optimizer_state[3] 11.74s setup unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-False-False] 10.83s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_fixed_dp[False-False-False] 10.72s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_fixed_dp[True-True-True] 10.52s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_module_only[3] 10.52s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[1] 10.33s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_fixed_dp[True-True-False] 10.32s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[3] 10.14s setup unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[True-True-False] 10.13s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[1] 10.02s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[2] 9.83s setup unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-True-True] 9.83s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_module_only[2] 9.83s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[3] 9.82s 
call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_not_load_optimizer_state[1]
9.73s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[0]
9.62s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_not_load_optimizer_state[2]
9.52s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_module_only[1]
9.43s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_load_optimizer_state[2]
9.42s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_load_immediate_save[0]
9.32s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[1]
8.72s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[2]
7.23s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-False-False]
7.12s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-True-True]
6.92s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[3]
6.52s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[True-True-False]
6.32s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[1]
6.22s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[0]
6.12s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[2]
(52 durations < 1s hidden. Use -vv to show these durations.)
================== 27 passed, 9 warnings in 882.83s (0:14:42) ==================
!!!!!!!!!!!!!!!!! _pytest.outcomes.Exit: Test hanged, exiting !!!!!!!!!!!!!!!!!!
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=3942688220
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ...
collected 22 items unit/elasticity/test_elastic.py::TestElasticConfigChanged::test PASSED [ 4%] unit/elasticity/test_elastic.py::TestNonElasticBatchParamsWithOverride::test PASSED [ 9%] unit/elasticity/test_elastic.py::TestNonElasticBatchParams::test PASSED [ 13%] unit/elasticity/test_elastic.py::test_missing_micro_batch PASSED [ 18%] unit/elasticity/test_elastic.py::test_future_elastic_version PASSED [ 22%] unit/elasticity/test_elastic.py::test_model_parallel_v1_invalid PASSED [ 27%] unit/elasticity/test_elastic.py::test_valid_world_size PASSED [ 31%] unit/elasticity/test_elastic.py::test_model_parallel_v2_invalid PASSED [ 36%] unit/elasticity/test_elastic.py::test_proper_mbsz PASSED [ 40%] unit/elasticity/test_elastic.py::test_invalid_config_values[micro_batch_sizes-5] PASSED [ 45%] unit/elasticity/test_elastic.py::test_basic_10k PASSED [ 50%] unit/elasticity/test_elastic.py::test_missing_max_batch PASSED [ 54%] unit/elasticity/test_elastic.py::test_model_parallel_v2_valid PASSED [ 59%] unit/elasticity/test_elastic.py::test_invalid_config_values[min_gpus--1] PASSED [ 63%] unit/elasticity/test_elastic.py::test_invalid_world_size PASSED [ 68%] unit/elasticity/test_elastic.py::test_disabled PASSED [ 72%] unit/elasticity/test_elastic.py::test_invalid_config_values[max_gpus--1] PASSED [ 77%] unit/elasticity/test_elastic.py::test_empty_config PASSED [ 81%] unit/elasticity/test_elastic.py::test_invalid_config_values[micro_batch_sizes-value5] PASSED [ 86%] unit/elasticity/test_elastic.py::test_invalid_config_values[micro_batch_sizes-value4] PASSED [ 90%] unit/elasticity/test_elastic.py::test_old_version PASSED [ 95%] unit/elasticity/test_elastic.py::test_invalid_config_values[micro_batch_sizes-value0] PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/elasticity/test_elastic.py::TestElasticConfigChanged::test /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/elasticity/test_elastic.py::TestElasticConfigChanged::test /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 6.12s call unit/elasticity/test_elastic.py::TestNonElasticBatchParamsWithOverride::test 4.79s call unit/elasticity/test_elastic.py::TestElasticConfigChanged::test 4.31s call unit/elasticity/test_elastic.py::TestNonElasticBatchParams::test (63 durations < 1s hidden. Use -vv to show these durations.) 
======================= 22 passed, 3 warnings in 16.26s ========================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=1902720899
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 4 items / 4 deselected / 0 selected
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
  /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
    from collections import deque

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 4 deselected, 1 warning in 0.92s =======================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=2540101885
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 0 items / 1 error
==================================== ERRORS ====================================
_________ ERROR collecting unit/inference/test_checkpoint_sharding.py __________
ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/unit/inference/test_checkpoint_sharding.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
unit/inference/test_checkpoint_sharding.py:12: in
    from transformers import AutoConfig, AutoModelForCausalLM
/usr/local/lib/python3.7/site-packages/transformers/__init__.py:26: in
    from . import dependency_versions_check
/usr/local/lib/python3.7/site-packages/transformers/dependency_versions_check.py:17: in
    from .utils.versions import require_version, require_version_core
/usr/local/lib/python3.7/site-packages/transformers/utils/__init__.py:30: in
    from .generic import (
/usr/local/lib/python3.7/site-packages/transformers/utils/generic.py:29: in
    from .import_utils import is_flax_available, is_tf_available, is_torch_available, is_torch_fx_proxy
/usr/local/lib/python3.7/site-packages/transformers/utils/import_utils.py:32: in
    from . import logging
/usr/local/lib/python3.7/site-packages/transformers/utils/logging.py:35: in
    import huggingface_hub.utils as hf_hub_utils
/usr/local/lib/python3.7/site-packages/huggingface_hub/utils/__init__.py:32: in
    from ._errors import (
/usr/local/lib/python3.7/site-packages/huggingface_hub/utils/_errors.py:3: in
    from requests import HTTPError, Response
/usr/local/lib/python3.7/site-packages/requests/__init__.py:43: in
    import urllib3
/usr/local/lib/python3.7/site-packages/urllib3/__init__.py:39: in
    "urllib3 v2.0 only supports OpenSSL 1.1.1+, currently "
E ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2k-fips 26 Jan 2017.
See: https://github.com/urllib3/urllib3/issues/2168 =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html =========================== short test summary info ============================ ERROR unit/inference/test_checkpoint_sharding.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! ========================= 1 warning, 1 error in 1.44s ========================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1244343831 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== ______________ ERROR collecting unit/inference/test_inference.py _______________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/unit/inference/test_inference.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) unit/inference/test_inference.py:16: in from transformers import pipeline /usr/local/lib/python3.7/site-packages/transformers/__init__.py:26: in from . import dependency_versions_check /usr/local/lib/python3.7/site-packages/transformers/dependency_versions_check.py:17: in from .utils.versions import require_version, require_version_core /usr/local/lib/python3.7/site-packages/transformers/utils/__init__.py:30: in from .generic import ( /usr/local/lib/python3.7/site-packages/transformers/utils/generic.py:29: in from .import_utils import is_flax_available, is_tf_available, is_torch_available, is_torch_fx_proxy /usr/local/lib/python3.7/site-packages/transformers/utils/import_utils.py:32: in from . import logging /usr/local/lib/python3.7/site-packages/transformers/utils/logging.py:35: in import huggingface_hub.utils as hf_hub_utils /usr/local/lib/python3.7/site-packages/huggingface_hub/utils/__init__.py:32: in from ._errors import ( /usr/local/lib/python3.7/site-packages/huggingface_hub/utils/_errors.py:3: in from requests import HTTPError, Response /usr/local/lib/python3.7/site-packages/requests/__init__.py:43: in import urllib3 /usr/local/lib/python3.7/site-packages/urllib3/__init__.py:39: in "urllib3 v2.0 only supports OpenSSL 1.1.1+, currently " E ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2k-fips 26 Jan 2017. See: https://github.com/urllib3/urllib3/issues/2168 =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html =========================== short test summary info ============================ ERROR unit/inference/test_inference.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! ========================= 1 warning, 1 error in 1.45s ========================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=208031903 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 error ==================================== ERRORS ==================================== ___________ ERROR collecting unit/inference/test_model_profiling.py ____________ ImportError while importing test module '/home/aishsh/ds-v0.9.2/tests/unit/inference/test_model_profiling.py'. Hint: make sure your test modules/packages have valid Python names. Traceback: /usr/local/lib/python3.7/importlib/__init__.py:127: in import_module return _bootstrap._gcd_import(name[level:], package, level) unit/inference/test_model_profiling.py:11: in from transformers import pipeline /usr/local/lib/python3.7/site-packages/transformers/__init__.py:26: in from . import dependency_versions_check /usr/local/lib/python3.7/site-packages/transformers/dependency_versions_check.py:17: in from .utils.versions import require_version, require_version_core /usr/local/lib/python3.7/site-packages/transformers/utils/__init__.py:30: in from .generic import ( /usr/local/lib/python3.7/site-packages/transformers/utils/generic.py:29: in from .import_utils import is_flax_available, is_tf_available, is_torch_available, is_torch_fx_proxy /usr/local/lib/python3.7/site-packages/transformers/utils/import_utils.py:32: in from . import logging /usr/local/lib/python3.7/site-packages/transformers/utils/logging.py:35: in import huggingface_hub.utils as hf_hub_utils /usr/local/lib/python3.7/site-packages/huggingface_hub/utils/__init__.py:32: in from ._errors import ( /usr/local/lib/python3.7/site-packages/huggingface_hub/utils/_errors.py:3: in from requests import HTTPError, Response /usr/local/lib/python3.7/site-packages/requests/__init__.py:43: in import urllib3 /usr/local/lib/python3.7/site-packages/urllib3/__init__.py:39: in "urllib3 v2.0 only supports OpenSSL 1.1.1+, currently " E ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with OpenSSL 1.0.2k-fips 26 Jan 2017. See: https://github.com/urllib3/urllib3/issues/2168 =========================== short test summary info ============================ ERROR unit/inference/test_model_profiling.py !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!! =============================== 1 error in 1.49s =============================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=259071128 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
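The three transformers-based collection errors above (test_checkpoint_sharding.py, test_inference.py, test_model_profiling.py) share a single root cause: urllib3 2.0 refuses to import because this interpreter's ssl module was built against OpenSSL 1.0.2k, and the ImportError propagates up through requests, huggingface_hub and transformers. The usual remedies are to pin urllib3 below 2.0 in this environment (see the urllib3 issue linked in the tracebacks) or, as a stop-gap, to guard the transformers import so the affected modules are skipped rather than aborting collection. A minimal sketch of such a guard, not part of the actual test files:

    # Hypothetical guard at the top of a transformers-dependent test module.
    # pytest.importorskip() turns the ImportError raised by the urllib3/OpenSSL
    # mismatch into a skip, so the rest of the suite can still be collected.
    import pytest

    pytest.importorskip("transformers")
    from transformers import pipeline  # noqa: E402  (only reached when the import works)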
collected 7 items

unit/launcher/test_ds_arguments.py::test_core_binding_arguments PASSED [ 14%]
unit/launcher/test_ds_arguments.py::test_no_ds_parser PASSED [ 28%]
unit/launcher/test_ds_arguments.py::test_no_ds_arguments PASSED [ 42%]
unit/launcher/test_ds_arguments.py::test_no_ds_config_argument PASSED [ 57%]
unit/launcher/test_ds_arguments.py::test_core_deepscale_arguments PASSED [ 71%]
unit/launcher/test_ds_arguments.py::test_no_ds_arguments_no_ds_parser PASSED [ 85%]
unit/launcher/test_ds_arguments.py::test_no_ds_enable_argument PASSED [100%]

=============================== warnings summary ===============================
unit/launcher/test_ds_arguments.py::test_core_binding_arguments
  /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
    "Running test without verifying torch version, please provide an expected torch version with --torch_ver")

unit/launcher/test_ds_arguments.py::test_core_binding_arguments
  /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
    "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
(21 durations < 1s hidden. Use -vv to show these durations.)
======================== 7 passed, 2 warnings in 0.93s =========================

============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=2195940584
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 5 items

unit/launcher/test_multinode_runner.py::test_slurm_runner PASSED [ 20%]
unit/launcher/test_multinode_runner.py::test_mvapich_runner PASSED [ 40%]
unit/launcher/test_multinode_runner.py::test_mpich_runner PASSED [ 60%]
unit/launcher/test_multinode_runner.py::test_openmpi_runner PASSED [ 80%]
unit/launcher/test_multinode_runner.py::test_pdsh_runner PASSED [100%]

=============================== warnings summary ===============================
unit/launcher/test_multinode_runner.py::test_slurm_runner
  /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
    "Running test without verifying torch version, please provide an expected torch version with --torch_ver")

unit/launcher/test_multinode_runner.py::test_slurm_runner
  /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
    "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
(15 durations < 1s hidden. Use -vv to show these durations.)
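A note ahead of the model-parallelism sessions further below: they skip all of their tests and also emit PytestUnknownMarkWarning because the custom @pytest.mark.world_size(...) marker is never registered with pytest. Registering the mark, either under markers in pytest.ini or from conftest.py as sketched here, silences that warning; only the mark name is taken from the log, and the description string is an assumption.

    # Hypothetical conftest.py addition to register the custom world_size mark.
    def pytest_configure(config):
        config.addinivalue_line(
            "markers",
            "world_size(n): number of ranks a distributed test expects",
        )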
======================== 5 passed, 2 warnings in 0.92s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1623618086 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 7 items unit/launcher/test_run.py::test_parser_mutual_exclusive PASSED [ 14%] unit/launcher/test_run.py::test_num_plus_parser PASSED [ 28%] unit/launcher/test_run.py::test_hostfiles_bad PASSED [ 42%] unit/launcher/test_run.py::test_hostfile_good PASSED [ 57%] unit/launcher/test_run.py::test_parser_errors PASSED [ 71%] unit/launcher/test_run.py::test_parser_multinode PASSED [ 85%] unit/launcher/test_run.py::test_parser_local PASSED [100%] =============================== warnings summary =============================== unit/launcher/test_run.py::test_parser_mutual_exclusive /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/launcher/test_run.py::test_parser_mutual_exclusive /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (21 durations < 1s hidden. Use -vv to show these durations.) ======================== 7 passed, 2 warnings in 0.94s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=761628897 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 3 items unit/model_parallelism/test_configurable_parallel_mp.py::TestConfigurableMP::test_gpt2_basic SKIPPED [ 33%] unit/model_parallelism/test_configurable_parallel_mp.py::TestConfigurableMP::test_gpt2_mp2_no_resize SKIPPED [ 66%] unit/model_parallelism/test_configurable_parallel_mp.py::TestConfigurableResizeMP::test SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/model_parallelism/test_configurable_parallel_mp.py:62 /home/aishsh/ds-v0.9.2/tests/unit/model_parallelism/test_configurable_parallel_mp.py:62: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(1) unit/model_parallelism/test_configurable_parallel_mp.py:90 /home/aishsh/ds-v0.9.2/tests/unit/model_parallelism/test_configurable_parallel_mp.py:90: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? 
You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(2) -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (6 durations < 1s hidden. Use -vv to show these durations.) ======================== 3 skipped, 3 warnings in 0.91s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=625012969 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 7 items unit/model_parallelism/test_configurable_parallel_pp.py::TestConfigurablePP::test_pp_basic SKIPPED [ 14%] unit/model_parallelism/test_configurable_parallel_pp.py::TestConfigurableResizePP::test_world_size_2to4[1-2-1-4] SKIPPED [ 28%] unit/model_parallelism/test_configurable_parallel_pp.py::TestConfigurableResizePP::test_world_size_4to2[2-2-2-1] SKIPPED [ 42%] unit/model_parallelism/test_configurable_parallel_pp.py::TestConfigurableResizePP::test_world_size_2to1[1-2-1-1] SKIPPED [ 57%] unit/model_parallelism/test_configurable_parallel_pp.py::TestConfigurableResizePP::test_world_size_4to1[2-2-1-1] SKIPPED [ 71%] unit/model_parallelism/test_configurable_parallel_pp.py::TestConfigurableResizePP::test_world_size_1to4[1-1-2-2] SKIPPED [ 85%] unit/model_parallelism/test_configurable_parallel_pp.py::TestConfigurableResizePP::test_world_size_2to4[2-1-2-2] SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/model_parallelism/test_configurable_parallel_pp.py:236 /home/aishsh/ds-v0.9.2/tests/unit/model_parallelism/test_configurable_parallel_pp.py:236: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(1) unit/model_parallelism/test_configurable_parallel_pp.py:243 /home/aishsh/ds-v0.9.2/tests/unit/model_parallelism/test_configurable_parallel_pp.py:243: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(1) unit/model_parallelism/test_configurable_parallel_pp.py:250 /home/aishsh/ds-v0.9.2/tests/unit/model_parallelism/test_configurable_parallel_pp.py:250: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(2) unit/model_parallelism/test_configurable_parallel_pp.py:257 /home/aishsh/ds-v0.9.2/tests/unit/model_parallelism/test_configurable_parallel_pp.py:257: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? 
You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(4) unit/model_parallelism/test_configurable_parallel_pp.py:264 /home/aishsh/ds-v0.9.2/tests/unit/model_parallelism/test_configurable_parallel_pp.py:264: PytestUnknownMarkWarning: Unknown pytest.mark.world_size - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html @pytest.mark.world_size(4) -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (14 durations < 1s hidden. Use -vv to show these durations.) ======================== 7 skipped, 6 warnings in 0.93s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=532039706 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 16 items unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-False-resulting_optimizer0] PASSED [ 6%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-True-resulting_optimizer14] FAILED [ 12%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-True-True-resulting_optimizer5] PASSED [ 18%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-True-False-resulting_optimizer9] PASSED [ 25%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-True-False-resulting_optimizer3] PASSED [ 31%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-False-False-resulting_optimizer8] PASSED [ 37%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-False-resulting_optimizer2] FAILED [ 43%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-True-False-resulting_optimizer1] PASSED [ 50%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-True-True-resulting_optimizer13] PASSED [ 56%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-False-True-resulting_optimizer12] PASSED [ 62%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-True-resulting_optimizer4] PASSED [ 68%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-True-True-resulting_optimizer15] PASSED [ 75%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-True-False-resulting_optimizer11] PASSED [ 81%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-True-resulting_optimizer6] FAILED [ 87%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-False-resulting_optimizer10] FAILED [ 93%] unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-True-True-resulting_optimizer7] PASSED [100%] =================================== FAILURES =================================== _______ TestAdamConfigs.test[Adam-True-False-True-resulting_optimizer14] _______ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:25:39,433] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:25:39,620] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:25:39,621] [WARNING] [config_utils.py:70:_process_deprecated_field] 
Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:25:39,657] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:25:39.435680 493269 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:25:39.435679 493140 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:25:39.436115 493140 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:25:39.625730 493282 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:25:39.625730 493140 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-2: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_adamw.py", line 69, in test model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File 
"/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______ TestAdamConfigs.test[AdamW-True-False-False-resulting_optimizer2] _______ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:26:02,259] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:26:02,444] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:26:02,445] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:26:02,482] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:26:02.261610 494017 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:26:02.261608 493888 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:26:02.261962 493888 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:26:02.449623 494030 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:26:02.449623 493888 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-7: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_adamw.py", line 69, in test model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _______ TestAdamConfigs.test[AdamW-True-False-True-resulting_optimizer6] _______ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:26:35,648] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:26:35,804] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:26:35,805] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:26:35,840] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:26:35.649827 494931 
ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:26:35.649842 495060 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:26:35.650161 494931 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:26:35.809242 494931 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:26:35.809260 495073 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-14: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_adamw.py", line 69, in test model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______ TestAdamConfigs.test[Adam-True-False-False-resulting_optimizer10] _______ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- 
[2023-05-27 03:26:40,169] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:26:40,345] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:26:40,346] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:26:40,386] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:26:40.170698 495219 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:26:40.170684 495090 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:26:40.171097 495090 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:26:40.349159 495090 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:26:40.349182 495232 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-15: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_adamw.py", line 69, in test model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 
1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-False-resulting_optimizer0] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-False-resulting_optimizer0] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.71s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-True-resulting_optimizer4] 5.67s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-False-resulting_optimizer0] 5.51s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-False-True-resulting_optimizer12] 5.01s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-False-False-resulting_optimizer8] 4.81s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-True-resulting_optimizer14] 4.61s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-True-True-resulting_optimizer15] 4.51s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-True-resulting_optimizer6] 4.51s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-False-resulting_optimizer2] 4.51s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-False-resulting_optimizer10] 4.41s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-True-False-resulting_optimizer11] 4.41s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-True-True-resulting_optimizer5] 4.31s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-True-False-resulting_optimizer3] 4.31s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-True-True-resulting_optimizer13] 4.31s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-True-False-resulting_optimizer9] 4.21s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-True-True-resulting_optimizer7] 4.21s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-True-False-resulting_optimizer1] (32 durations < 1s hidden. Use -vv to show these durations.) 
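All four test_adamw.py failures above break in the same place: loading the prebuilt cpu_adam_op extension aborts with the undefined symbol __kmpc_for_static_fini, which belongs to the Intel/LLVM OpenMP runtime (libiomp5/libomp). The extension was evidently built against an OpenMP runtime that is not loaded in the test process, so every parametrization that constructs DeepSpeedCPUAdam hits the identical ImportError, and the trailing AttributeError in __del__ is just fallout from the failed constructor; the fused-Adam cases in the test_cpu_adam.py session that follows fail the same way. Making the matching OpenMP runtime visible to the process (for example by preloading libiomp5/libomp) or rebuilding the op with the locally available toolchain are offered here as probable remedies, not verified ones. A standalone probe such as the sketch below reproduces the failure once, outside pytest, which is easier to inspect than the repeated per-test tracebacks; CPUAdamBuilder and .load() appear in the tracebacks above, while is_compatible() is assumed to exist on the builder.

    # Standalone probe (sketch) for the broken CPU Adam extension.
    from deepspeed.ops.op_builder import CPUAdamBuilder

    builder = CPUAdamBuilder()
    print("reported compatible:", builder.is_compatible())  # assumed builder API
    try:
        builder.load()  # the same call that fails inside DeepSpeedCPUAdam.__init__
    except ImportError as err:
        print("cpu_adam op failed to load:", err)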
=========================== short test summary info ============================
FAILED unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-True-resulting_optimizer14]
FAILED unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-False-resulting_optimizer2]
FAILED unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-True-resulting_optimizer6]
FAILED unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-False-resulting_optimizer10]
============= 4 failed, 12 passed, 3 warnings in 76.13s (0:01:16) ==============

============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=1468307262
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 21 items

unit/ops/adam/test_cpu_adam.py::TestCPUAdamGPUError::test_cpu_adam_gpu_error FAILED [ 4%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1024-fp32] PASSED [ 9%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[22-fp16] PASSED [ 14%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp16] FAILED [ 19%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp32] FAILED [ 23%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp16] FAILED [ 28%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp32] FAILED [ 33%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp32] FAILED [ 38%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[128-fp32] PASSED [ 42%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[64-fp16] PASSED [ 47%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[22-fp32] PASSED [ 52%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp32] FAILED [ 57%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-fp32] PASSED [ 61%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp16] FAILED [ 66%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1024-fp16] PASSED [ 71%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp16] FAILED [ 76%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[64-fp32] PASSED [ 80%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp32] FAILED [ 85%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-fp16] PASSED [ 90%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp16] FAILED [ 95%]
unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[128-fp16] PASSED [100%]

=================================== FAILURES ===================================
_________________ TestCPUAdamGPUError.test_cpu_adam_gpu_error __________________
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 03:26:55,522] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
----------------------------- Captured stderr call -----------------------------
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0527 03:26:55.670020 495796 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started!
I0527 03:26:55.670010 495519 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:26:55.678297 495518 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:26:55.678390 495798 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:26:55.678975 495518 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:26:55.681064 495519 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-1: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 122, in test_cpu_adam_gpu_error optimizer = DeepSpeedCPUAdam([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-2: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File 
"/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 122, in test_cpu_adam_gpu_error optimizer = DeepSpeedCPUAdam([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _________________ TestCPUAdam.test_fused_adam_equal[1024-fp16] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:27:11,125] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:27:11.127720 496292 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:27:11.127709 496152 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:27:11.128178 496152 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-5: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _______________ TestCPUAdam.test_fused_adam_equal[1048576-fp32] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:27:16,050] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:27:16.052475 496457 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:27:16.052475 496317 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:27:16.053001 496317 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-6: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _______________ TestCPUAdam.test_fused_adam_equal[1048576-fp16] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:27:20,738] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:27:20.740389 496622 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:27:20.740391 496482 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:27:20.740841 496482 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-7: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _________________ TestCPUAdam.test_fused_adam_equal[1024-fp32] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:27:25,475] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:27:25.477136 496850 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:27:25.477134 496710 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:27:25.477574 496710 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-8: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' __________________ TestCPUAdam.test_fused_adam_equal[22-fp32] __________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:27:30,202] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:27:30.204545 497015 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:27:30.204533 496875 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:27:30.204886 496875 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-9: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' __________________ TestCPUAdam.test_fused_adam_equal[64-fp32] __________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:27:47,805] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:27:47.807761 497639 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:27:47.807761 497499 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:27:47.808174 497499 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-13: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' __________________ TestCPUAdam.test_fused_adam_equal[64-fp16] __________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:27:56,792] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:27:56.794697 497957 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:27:56.794689 497817 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:27:56.795138 497817 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-15: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _________________ TestCPUAdam.test_fused_adam_equal[128-fp16] __________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:05,868] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:05.870954 498275 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:05.870954 498135 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:05.871385 498135 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-17: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _________________ TestCPUAdam.test_fused_adam_equal[128-fp32] __________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:14,788] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:14.791132 498593 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:14.791034 498453 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:14.791811 498453 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-19: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' __________________ TestCPUAdam.test_fused_adam_equal[22-fp16] __________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:23,792] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:23.795222 498911 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:23.795218 498771 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:23.795699 498771 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-21: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adam/test_cpu_adam.py", line 80, in test_fused_adam_equal cpu_optimizer = DeepSpeedCPUAdam([cpu_param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
  from collections import deque

unit/ops/adam/test_cpu_adam.py::TestCPUAdamGPUError::test_cpu_adam_gpu_error
  /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
    "Running test without verifying torch version, please provide an expected torch version with --torch_ver")

unit/ops/adam/test_cpu_adam.py::TestCPUAdamGPUError::test_cpu_adam_gpu_error
  /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
    "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
6.68s call unit/ops/adam/test_cpu_adam.py::TestCPUAdamGPUError::test_cpu_adam_gpu_error
5.43s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1024-fp32]
4.92s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp16]
4.81s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp32]
4.81s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp16]
4.72s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[22-fp16]
4.71s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp32]
4.71s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp16]
4.71s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp16]
4.71s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp32]
4.71s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp16]
4.61s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp32]
4.61s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp32]
4.41s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[128-fp16]
4.31s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1024-fp16]
4.31s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-fp32]
4.31s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[22-fp32]
4.31s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[64-fp16]
4.31s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-fp16]
4.21s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[128-fp32]
4.21s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[64-fp32]
(42 durations < 1s hidden. Use -vv to show these durations.)
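Note: the test_fused_adam_equal failures above all collapse to the same ImportError: the prebuilt extension /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so cannot be dlopen-ed because __kmpc_for_static_fini is unresolved. __kmpc_* entry points are provided by an Intel/LLVM-style OpenMP runtime (libiomp5/libomp), which suggests the op was compiled with OpenMP but no matching runtime is visible to the loader in this image. A small diagnostic sketch, assuming the install path shown in the tracebacks; the runtime library names tried below are guesses, not taken from this log:

import ctypes

# Path copied from the ImportError messages above.
OP = "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so"

try:
    ctypes.CDLL(OP)
except OSError as err:
    # Expected to show the same "undefined symbol: __kmpc_for_static_fini".
    print("dlopen failed:", err)

# __kmpc_* symbols come from an Intel/LLVM OpenMP runtime. If such a runtime is
# installed (library names below are assumptions), loading it with RTLD_GLOBAL
# first may let the op library resolve the symbol on a retry.
for candidate in ("libiomp5.so", "libomp.so.5"):
    try:
        ctypes.CDLL(candidate, mode=ctypes.RTLD_GLOBAL)
        print("preloaded", candidate)
        ctypes.CDLL(OP)  # retry the op library against the now-global symbols
        print("cpu_adam_op now loads")
        break
    except OSError as err:
        print(candidate, "->", err)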
=========================== short test summary info ============================
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdamGPUError::test_cpu_adam_gpu_error
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp16]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp32]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp16]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp32]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp32]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp32]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp16]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp16]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp32]
FAILED unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp16]
============ 11 failed, 10 passed, 3 warnings in 100.88s (0:01:40) =============
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=2927879481
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 64 items / 16 deselected / 48 selected

unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-True] FAILED [ 2%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-True-True] FAILED [ 4%]
unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-False] FAILED [ 6%]
unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-True-True] FAILED [ 8%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-False-False] FAILED [ 10%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-True-True] FAILED [ 12%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-False-False] FAILED [ 14%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-False-True] FAILED [ 16%]
unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-True-False] FAILED [ 18%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-False-True] FAILED [ 20%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-True] FAILED [ 22%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-True-True] FAILED [ 25%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-False-False] FAILED [ 27%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-True-False] FAILED [ 29%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-True-False] FAILED [ 31%]
unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-False-True] FAILED [ 33%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-True-True] FAILED [ 35%]
unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-False-False] FAILED [ 37%]
unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-True-True] FAILED [ 39%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-False-True] FAILED [ 41%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-False] FAILED [ 43%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-True-False] FAILED [ 45%]
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-True-False] FAILED [ 47%]
unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-True-False] FAILED [ 50%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-False-True] FAILED [ 52%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-False] FAILED [ 54%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-False-False] FAILED [ 56%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-True-False] FAILED [ 58%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-True-True] FAILED [ 60%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-True-False] FAILED [ 62%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-True] FAILED [ 64%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-True-True] FAILED [ 66%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-True] FAILED [ 68%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-False-True] FAILED [ 70%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-False-False] FAILED [ 72%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-True-True] FAILED [ 75%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-False-False] FAILED [ 77%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-True-True] FAILED [ 79%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-True-False] FAILED [ 81%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-True-False] FAILED [ 83%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-True-True] FAILED [ 85%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-False-True] FAILED [ 87%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-False-False] FAILED [ 89%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-False-True] FAILED [ 91%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-False] FAILED [ 93%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-False-True] FAILED [ 95%]
unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-False-False] FAILED [ 97%]
unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-True-False] FAILED [100%]

=================================== FAILURES ===================================
_________________ TestRead.test_parallel_read[True-False-True] _________________
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 03:28:36,901] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
----------------------------- Captured stderr call -----------------------------
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0527 03:28:36.903471 499200 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options:
NCCL_ASYNC_ERROR_HANDLING: 0
NCCL_BLOCKING_WAIT: 0
TIMEOUT(ms): 1800000
USE_HIGH_PRIORITY_STREAM: 0
NCCL_DEBUG: UNSET
I0527 03:28:36.903486 499329 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started!
I0527 03:28:36.903791 499200 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
Process Process-1: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_async_read[True-False-True-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:40,395] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:40.397357 499471 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:40.397351 499342 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:40.397672 499342 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-2: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_parallel_read[True-False-False] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:43,927] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:43.929081 499613 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:43.929071 499484 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:43.929389 499484 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-3: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _________________ TestRead.test_parallel_read[False-True-True] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:47,446] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:47.447726 499755 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:47.447724 499626 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:47.448045 499626 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-4: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[True-True-False-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:51,018] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:51.020327 499897 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:51.020316 499768 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:51.020874 499768 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-5: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_async_read[True-True-True-True] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:54,474] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:54.476365 500039 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:54.476356 499910 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:54.476680 499910 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-6: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[True-False-False-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:28:58,007] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:28:58.008978 500181 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:28:58.008970 500052 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:28:58.009289 500052 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-7: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[False-False-False-True] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:01,519] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:01.520992 500323 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:01.520982 500194 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:01.521299 500194 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-8: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_parallel_read[False-True-False] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:05,112] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:05.114812 500465 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:05.114802 500336 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:05.115303 500336 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-9: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[True-False-False-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:08,644] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:08.645887 500607 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:08.645877 500478 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:08.646190 500478 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-10: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[False-True-False-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:12,184] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:12.186090 500620 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:12.186107 500749 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:12.186399 500620 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-11: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_async_read[False-True-True-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:15,727] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:15.729334 500891 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:15.729334 500762 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:15.729833 500762 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-12: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestRead.test_async_read[False-False-False-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:19,208] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:19.211542 501033 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:19.211529 500904 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:19.212108 500904 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-13: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[False-True-True-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:22,765] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:22.767371 501175 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:22.767364 501046 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:22.767691 501046 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-14: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[False-False-True-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:26,297] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:26.300590 501317 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:26.300571 501188 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:26.301123 501188 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-15: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_parallel_read[False-False-True] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:29,838] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:29.840524 501459 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:29.840524 501330 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:29.841032 501330 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-16: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[False-False-True-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:33,358] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:33.359935 501601 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:33.359926 501472 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:33.360242 501472 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-17: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_parallel_read[False-False-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:36,982] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:36.984362 501743 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:36.984362 501614 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:36.984858 501614 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-18: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _________________ TestRead.test_parallel_read[True-True-True] __________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:40,541] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:40.548139 501885 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:40.548128 501756 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:40.548662 501756 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-19: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_async_read[True-True-False-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:44,113] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:44.119480 502027 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:44.119482 501898 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:44.120364 501898 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-20: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[False-True-False-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:47,543] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:47.544853 502169 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:47.544845 502040 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:47.545162 502040 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-21: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestRead.test_async_read[True-True-True-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:51,040] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:51.043380 502311 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:51.043372 502182 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:51.043903 502182 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-22: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestRead.test_async_read[True-False-True-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:54,615] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:54.617552 502453 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:54.617552 502324 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:54.617990 502324 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-23: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 119, in test_async_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _________________ TestRead.test_parallel_read[True-True-False] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:29:58,077] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:29:58.079737 502595 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:29:58.079721 502466 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:29:58.080271 502466 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-24: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 94, in test_parallel_read h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[False-False-False-True] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:01,586] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:01.588467 502737 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:01.588459 502608 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:01.588779 502608 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-25: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_parallel_write[False-True-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:05,117] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:05.119194 502879 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:05.119187 502750 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:05.119503 502750 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-26: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _____________ TestWrite.test_async_write[False-False-False-False] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:08,671] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:08.672839 503021 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:08.672827 502892 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:08.673151 502892 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-27: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[True-False-True-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:12,260] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:12.262622 503163 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:12.262617 503034 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:12.263334 503034 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-28: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_async_write[True-True-True-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:15,819] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:15.822098 503305 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:15.822090 503176 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:15.822592 503176 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-29: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_async_write[True-True-True-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:19,298] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:19.299579 503447 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:19.299571 503318 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:19.299890 503318 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-30: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_async_write[True-True-False-True] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:22,771] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:22.774380 503589 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:22.774369 503460 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:22.774957 503460 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-31: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[False-False-True-True] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:26,262] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:26.264678 503731 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:26.264667 503602 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:26.265398 503602 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-32: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestWrite.test_parallel_write[False-True-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:29,961] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:29.963265 503873 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:29.963265 503744 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:29.963798 503744 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-33: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[True-False-False-True] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:33,544] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:33.546448 504015 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:33.546448 503886 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:33.546929 503886 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-34: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_parallel_write[False-False-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:37,172] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:37.174662 504157 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:37.174652 504028 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:37.175184 504028 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-35: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestWrite.test_parallel_write[True-True-True] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:40,690] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:40.692971 504299 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:40.692971 504170 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:40.693413 504170 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-36: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_parallel_write[True-False-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:44,306] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:44.308230 504441 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:44.308212 504312 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:44.308750 504312 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-37: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_async_write[False-True-True-True] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:47,807] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:47.808560 504583 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:47.808552 504454 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:47.808872 504454 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-38: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[False-True-True-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:51,395] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:51.398605 504725 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:51.398591 504596 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:51.408833 504596 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-39: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[False-False-True-False] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:54,920] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:54.922894 504867 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:54.922874 504738 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:54.923380 504738 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-40: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_async_write[True-False-True-True] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:30:58,362] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:30:58.364070 505009 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:30:58.364061 504880 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:30:58.364378 504880 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-41: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini _______________ TestWrite.test_parallel_write[False-False-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:31:01,882] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:31:01.884245 505022 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:31:01.884258 505151 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:31:01.884559 505022 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-42: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[False-True-False-False] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:31:05,471] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:31:05.473610 505293 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:31:05.473610 505164 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:31:05.474048 505164 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-43: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestWrite.test_parallel_write[True-False-True] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:31:08,905] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:31:08.906850 505435 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:31:08.906841 505306 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:31:08.907162 505306 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-44: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[True-True-False-False] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:31:12,523] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:31:12.525455 505577 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:31:12.525455 505448 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:31:12.525900 505448 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-45: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[False-True-False-True] _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:31:16,148] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:31:16.150579 505719 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:31:16.150569 505590 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:31:16.151086 505590 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-46: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ______________ TestWrite.test_async_write[True-False-False-False] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:31:19,604] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:31:19.610383 505861 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:31:19.610380 505732 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:31:19.611284 505732 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-47: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 186, in test_async_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini ________________ TestWrite.test_parallel_write[True-True-False] ________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:31:23,088] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:31:23.089818 506003 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:31:23.089809 505874 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:31:23.090131 505874 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-48: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/aio/test_aio.py", line 160, in test_parallel_write h = AsyncIOBuilder().load().aio_handle(BLOCK_SIZE, QUEUE_DEPTH, single_submit, overlap_events, IO_PARALLEL) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/aio/async_io_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-True] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-True] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 3.71s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-True] 3.61s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-True-False] 3.61s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-False-True] 3.61s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-True-True] 3.61s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-False-False] 3.61s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-True-False] 3.61s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-True] 3.61s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-False-True] 3.61s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-False-False] 3.61s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-False-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-False-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-True-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-False-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-True-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-True-True] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-False-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-True-True] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-True-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-True-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-False-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-False-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-True-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-True-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-False-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-False-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-False-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-True-True] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-False-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-True-True] 3.51s call 
unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-False-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-True-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-True-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-False-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-True-True] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-True-True] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-False-True] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-True] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-True-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-False-True] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-True-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-True-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-False] 3.51s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-True-False] 3.51s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-True-True] 3.41s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-True] (96 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-True-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-True-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-False-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-True-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-False-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-False-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-True-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-False-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-True-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-False-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-True-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-True-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-False-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-True-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[False-False-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-True-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-False-True] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-True-True-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_async_read[True-False-True-False] FAILED unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-True-False] FAILED 
unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-False-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-False-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-True-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-True-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-True-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-True-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-False-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-False-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-True-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-False-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-True-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-True-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-True-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-True-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-False-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-False-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-False-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-False-True] FAILED unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-False-False] FAILED unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-True-False] ========== 48 failed, 16 deselected, 3 warnings in 171.28s (0:02:51) =========== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1432730917 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 4 items / 4 deselected / 0 selected ============================ 4 deselected in 0.90s ============================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=4077121693 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 240 items / 240 deselected / 0 selected =========================== 240 deselected in 0.95s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=4106365716 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 378 items / 378 deselected / 0 selected =========================== 378 deselected in 0.96s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2209334758 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 36 items / 36 deselected / 0 selected ============================ 36 deselected in 0.91s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1682582430 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 36 items / 36 deselected / 0 selected ============================ 36 deselected in 0.90s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3572293986 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 576 items / 576 deselected / 0 selected =========================== 576 deselected in 1.02s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=4189940279 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 36 items / 36 deselected / 0 selected ============================ 36 deselected in 0.90s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2579475170 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 36 items / 36 deselected / 0 selected ============================ 36 deselected in 0.91s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1672616031 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 168 items / 168 deselected / 0 selected =========================== 168 deselected in 0.94s ============================ ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=554045017 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 8 items / 8 deselected / 0 selected ============================ 8 deselected in 0.91s ============================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3737174781 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 5 items unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2] FAILED [ 20%] unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-25-3-True-True-0.05] FAILED [ 40%] unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-160-128-2-3-True-True-0.1] PASSED [ 60%] unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-2-3-True-True-0.05] FAILED [ 80%] unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-1600-128-2-4-False-True-0.2] FAILED [100%] =================================== FAILURES =================================== ________ TestCUDABackward.test_backward[64-160-128-2-24-False-True-0.2] ________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. [2023-05-27 03:32:03,826] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl layer #0 is created with date type [half]. layer #1 is created with date type [half]. layer #2 is created with date type [half]. layer #3 is created with date type [half]. layer #4 is created with date type [half]. layer #5 is created with date type [half]. layer #6 is created with date type [half]. layer #7 is created with date type [half]. layer #8 is created with date type [half]. layer #9 is created with date type [half]. layer #10 is created with date type [half]. layer #11 is created with date type [half]. layer #12 is created with date type [half]. layer #13 is created with date type [half]. 
DeepSpeed Transformer config is {'layer_id': 0, 'batch_size': 64, 'hidden_size': 160, 'intermediate_size': 160, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 24, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False}
[... the same config, differing only in 'layer_id', is printed for layer_id 1 through 14 ...]
layer #14 is created with date type [half].
layer #15 is created with date type [half].
layer #16 is created with date type [half].
layer #17 is created with date type [half].
layer #18 is created with date type [half].
layer #19 is created with date type [half].
layer #20 is created with date type [half].
layer #21 is created with date type [half].
layer #22 is created with date type [half].
layer #23 is created with date type [half].
[... the same config, differing only in 'layer_id', is printed for layer_id 15 through 23 ...]
hidden_state hidden_state norm_B V_W norm_W V_B out_B N2_B out_W norm_B int_B norm_W int_W out_B N2_B out_W N2_W int_B O_B N2_W O_W O_W V_B int_W V_W O_B
[... the same parameter-name list is repeated many times; the final repetition orders the same names differently: ...]
hidden_state hidden_state norm_B V_B norm_W N2_B out_B norm_B out_W norm_W int_B out_B int_W out_W N2_B int_B N2_W N2_W O_B int_W O_W O_B V_B V_W V_W O_W
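The name lists above and the (layer_id, name) pairs printed next appear to be a per-layer enumeration of the tensors being compared; the W/B suffixes suggest weights and biases of LayerNorms (norm, N2), value and output projections (V, O), and feed-forward intermediate/output matrices (int, out), plus the layer input (hidden_state). The sketch below only illustrates how such (layer_id, name) pairs could be produced; the TinyLayer module, the pair_up helper, and the short-name mapping are assumptions for illustration, not code from the DeepSpeed test.

import torch.nn as nn

# Hypothetical per-layer module standing in for the tensors named in this log
# (the real test compares DeepSpeed's fused transformer kernel against a baseline
# layer, which is not reproduced here).
class TinyLayer(nn.Module):
    def __init__(self, hidden_size=160):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)              # norm_W / norm_B
        self.value = nn.Linear(hidden_size, hidden_size)   # V_W / V_B
        self.output = nn.Linear(hidden_size, hidden_size)  # O_W / O_B

# Assumed short-name mapping, for illustration only.
SHORT = {"norm.weight": "norm_W", "norm.bias": "norm_B",
         "value.weight": "V_W", "value.bias": "V_B",
         "output.weight": "O_W", "output.bias": "O_B"}

def pair_up(layers):
    """Yield (layer_id, short_name) pairs in the spirit of the listing below."""
    for layer_id, layer in enumerate(layers):
        for name, _param in layer.named_parameters():
            yield layer_id, SHORT[name]

if __name__ == "__main__":
    layers = [TinyLayer() for _ in range(2)]
    print(*pair_up(layers))

The ordering in the log differs from plain registration order, so the real test presumably iterates its own collection of gradients; the sketch only shows the pairing idea.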
(0, 'hidden_state') (0, 'hidden_state') (0, 'norm_B') (0, 'V_W') (0, 'norm_W') (0, 'V_B') (0, 'out_B') (0, 'N2_B') (0, 'out_W') (0, 'norm_B') (0, 'int_B') (0, 'norm_W') (0, 'int_W') (0, 'out_B') (0, 'N2_B') (0, 'out_W') (0, 'N2_W') (0, 'int_B') (0, 'O_B') (0, 'N2_W') (0, 'O_W') (0, 'O_W') (0, 'V_B') (0, 'int_W') (0, 'V_W') (0, 'O_B')
[... the same (layer_id, name) list is printed for layer_id 1 through 20 ...]
(21,
'hidden_state') (21, 'hidden_state') (21, 'norm_B') (21, 'V_W') (21, 'norm_W') (21, 'V_B') (21, 'out_B') (21, 'N2_B') (21, 'out_W') (21, 'norm_B') (21, 'int_B') (21, 'norm_W') (21, 'int_W') (21, 'out_B') (21, 'N2_B') (21, 'out_W') (21, 'N2_W') (21, 'int_B') (21, 'O_B') (21, 'N2_W') (21, 'O_W') (21, 'O_W') (21, 'V_B') (21, 'int_W') (21, 'V_W') (21, 'O_B') (22, 'hidden_state') (22, 'hidden_state') (22, 'norm_B') (22, 'V_W') (22, 'norm_W') (22, 'V_B') (22, 'out_B') (22, 'N2_B') (22, 'out_W') (22, 'norm_B') (22, 'int_B') (22, 'norm_W') (22, 'int_W') (22, 'out_B') (22, 'N2_B') (22, 'out_W') (22, 'N2_W') (22, 'int_B') (22, 'O_B') (22, 'N2_W') (22, 'O_W') (22, 'O_W') (22, 'V_B') (22, 'int_W') (22, 'V_W') (22, 'O_B') (23, 'hidden_state') (23, 'hidden_state') (23, 'norm_B') (23, 'V_B') (23, 'norm_W') (23, 'N2_B') (23, 'out_B') (23, 'norm_B') (23, 'out_W') (23, 'norm_W') (23, 'int_B') (23, 'out_B') (23, 'int_W') (23, 'out_W') (23, 'N2_B') (23, 'int_B') (23, 'N2_W') (23, 'N2_W') (23, 'O_B') (23, 'int_W') (23, 'O_W') (23, 'O_B') (23, 'V_B') (23, 'V_W') (23, 'V_W') (23, 'O_W') checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.0070497072159651 x = [ 0.01341 0.02847 -0.002014 ... -0.00935 0.03302 0.06128 ] y = [ 0.01344 0.02838 -0.002136 ... -0.00924 0.03293 0.06122 ] 50.579186095372435 50.57799403878469 -------------------------------------------------------------------------------- checking norm_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 5.376532287597657 x = [-10.555 52.25 -24.7 -5.74 -14.875 -37.47 -46. -53.5 -49.3 40.38 -3.281 -14.234 -3.654 -42.3 -74.56 -17.12 -33.47 -25.94 -65.2 -13.836 47.78 -3.318 30.23 10.02 31.7 21.33 -10.77 27.16 6.625 10.414 42.62 -10.664 -10.02 -21.72 -29.55 -10.79 -55.53 2.758 52.34 -22.75 -43.66 35.62 33.12 -18.25 0.914 58.56 -72.75 -0.4404 -85.1 1.114 15.195 -10.22 -15.73 27.11 -69.3 9.03 -56.16 -43.5 -25.89 67.9 25.62 26.61 2.781 -17.64 -3.709 -0.5347 16.84 -42.47 17.62 -9. -56.47 -9. 69.44 43.1 -3.854 43.22 -24.47 -8.01 22.97 0.2192 38.7 -21.55 8.766 -26.42 -11.805 -2.201 12.164 27. 103.7 -20.3 14.34 -12.875 -39.66 -20. 50.3 11.484 -33.84 45.3 17.62 14.266 28.8 41.44 -56.28 16.7 30.47 32.78 5.473 38.3 -40.97 20.22 -37.56 -42.66 -2.965 12.14 -19.67 -16.02 -10.08 -2.994 -22.28 23.69 -55.28 -1.475 62.3 -48.84 9.31 82.4 87.8 13.86 22.28 -26.42 44.72 -44.75 33.34 -11.72 -10.07 10.82 -24.95 -13.62 25.14 16.62 -29.66 -9.16 -18.05 14.44 -12.79 -15.64 0.39 27.36 45.6 26.33 7.242 -0.5303 -1.845 56.56 28.19 12.59 -39.56 -3.068 -18.36 34.72 ] y = [-10.55 52.22 -24.69 -5.73 -14.88 -37.44 -45.97 -53.5 -49.28 40.34 -3.281 -14.21 -3.645 -42.3 -74.56 -17.11 -33.47 -25.92 -65.1 -13.84 47.78 -3.326 30.23 10.016 31.72 21.33 -10.75 27.16 6.64 10.43 42.6 -10.67 -10.03 -21.73 -29.53 -10.81 -55.53 2.754 52.3 -22.77 -43.66 35.6 33.12 -18.28 0.915 58.56 -72.7 -0.4568 -85.1 1.144 15.18 -10.25 -15.734 27.12 -69.3 9.055 -56.12 -43.47 -25.9 67.9 25.62 26.6 2.78 -17.64 -3.734 -0.558 16.83 -42.44 17.64 -8.98 -56.47 -8.984 69.4 43.06 -3.855 43.2 -24.44 -8. 22.97 0.2029 38.66 -21.53 8.75 -26.4 -11.78 -2.182 12.19 27. 103.6 -20.3 14.336 -12.89 -39.62 -20. 
50.3 11.47 -33.8 45.28 17.64 14.25 28.8 41.44 -56.22 16.7 30.45 32.75 5.457 38.28 -40.97 20.2 -37.53 -42.62 -2.953 12.14 -19.67 -16.02 -10.09 -2.98 -22.27 23.66 -55.22 -1.443 62.28 -48.8 9.28 82.4 87.75 13.88 22.28 -26.4 44.72 -44.72 33.3 -11.72 -10.086 10.83 -24.97 -13.6 25.16 16.61 -29.62 -9.16 -18.06 14.43 -12.77 -15.63 0.3806 27.34 45.56 26.31 7.25 -0.503 -1.862 56.56 28.17 12.6 -39.53 -3.064 -18.34 34.72 ] 428.7473985165588 428.57613294200297 -------------------------------------------------------------------------------- checking norm_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 51.1353125 x = [318.5 303.8 242.8 225.1 259.5 304.8 235.4 267. 344.8 251.6 211.4 242. 234.4 227.2 224.4 229.8 313.2 264.5 289.8 255.4 271.2 288.2 249.5 225.9 241.6 264.5 197.1 285.2 242.1 241.5 223.6 288. 187. 323.2 203.4 261.2 272. 244.1 263. 351. 253.1 287.8 244.5 264.2 268.8 299.5 288.8 307.8 266.5 239.4 243.5 248.9 218.4 263.5 307.2 259. 253.2 312.8 248.5 258. 223.8 256.5 268.8 205.1 265.2 190.4 219.8 257.5 218.9 271.2 242.4 231. 289.5 281.5 267.8 291.5 240.9 236.5 279.8 244.4 253.4 217.5 247. 259.2 240.8 267.2 237.1 277.5 270.2 288.2 267.2 214.8 250.1 242. 318.5 225.6 234.6 278.8 290.8 273.8 245.1 261.2 250.2 328. 204.9 244.8 261.8 275.8 302.5 232.4 283.2 225.6 267.8 256. 214.1 221.8 244.5 220.9 305. 237.2 264. 255.8 227. 180.8 233.2 240.2 278. 294.2 273.5 286.8 236.5 278. 252.6 251. 271.8 264.2 261.5 248.6 255.8 243.4 236. 255.1 242.2 254.5 240.8 261. 208.9 230.5 274. 238.1 230.8 240.2 214.2 279.2 243.6 264.5 229.4 269.5 220.1 257.2] y = [318.5 303.8 242.8 225.2 259.5 304.5 235.4 267. 344.8 251.6 211.4 242. 234.4 227.2 224.2 229.8 313.2 264.5 289.8 255.4 271.2 288.2 249.5 225.8 241.6 264.5 197.1 285.2 242.1 241.4 223.6 288. 187. 323.2 203.4 261.2 272. 244. 263. 351. 253.1 287.8 244.5 264. 268.8 299.5 288.8 307.8 266.5 239.2 243.5 248.8 218.4 263.5 307. 258.8 253.2 312.8 248.5 257.8 223.6 256.5 268.8 205.1 265.2 190.4 219.8 257.5 218.9 271.2 242.2 230.9 289.5 281.5 267.8 291.5 240.9 236.5 279.8 244.4 253.2 217.5 247. 259.2 240.8 267.2 237.1 277.5 270.2 288.2 267.2 214.8 250.1 242. 318.5 225.6 234.6 278.8 290.8 273.8 245.1 261. 250.1 328. 204.9 244.6 261.8 275.8 302.5 232.4 283.2 225.5 267.8 256. 214.1 221.8 244.5 220.9 305. 237.1 264. 255.8 227. 180.8 233.2 240.1 277.8 294. 273.5 286.8 236.5 277.8 252.6 251. 271.8 264.2 261.5 248.6 255.6 243.4 235.9 255.1 242.2 254.5 240.8 261. 208.8 230.5 274. 238. 230.8 240.2 214.2 279.2 243.6 264.5 229.4 269.5 220. 
257.2] 3256.4926109235994 3256.127456907208 -------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4185819435119629 x = [-1.515 -3.979 -0.804 1.895 0.207 1.011 0.126 -3.291 -4.2 -2.717 -1.662 2.094 2.102 -3.467 2.162 -0.6797 4.105 -2.512 1.965 0.553 0.0795 2.553 -2.557 0.4575 1.705 2.752 -3.193 6.41 6.152 2.523 -4.496 0.1752 -4.344 -3.584 -1.465 -2.94 1.322 -3.51 3.164 0.934 -1.699 3.754 3.088 2.846 -3.676 -1.317 -0.8267 4.598 0.1941 -1.615 1.011 -2.594 0.0391 1.678 -1.2705 -1.98 -0.989 -0.1709 -1.343 -3.115 2.152 0.9585 1.019 2.502 0.791 -3.648 -0.5903 -1.216 -2.24 4.367 -0.3892 -0.931 1.724 -2.713 1.099 -0.82 -0.375 1.171 2.037 3.814 0.5977 -1.901 -4.066 1.817 0.6587 -0.0211 2.492 -4.613 2.041 -2.55 0.05435 -1.644 -0.571 -0.721 -1.316 -0.2805 2.826 2.795 -1.941 3.926 2.434 0.5107 2.686 -2.842 0.96 0.2505 0.0616 0.943 5.75 -2.506 1.875 -0.642 0.1178 -3.086 -0.4429 0.463 1.484 4.035 0.505 2.225 -2.738 1.284 0.5107 0.5405 -1.572 -0.516 -1.023 0.221 1.746 -5.754 -6.465 4.656 -1.614 -2.014 -2.828 3.857 -0.3547 0.731 4.363 -1.867 -5.555 -4.867 1.598 1.054 -3.998 2.008 2. 3.045 -1.925 1.441 -0.565 -0.644 -2.375 -2.994 -4.223 4.395 0.9565 5.523 1.761 0.908 ] y = [-1.512 -3.969 -0.7983 1.899 0.2131 1.012 0.1224 -3.293 -4.207 -2.717 -1.661 2.092 2.105 -3.469 2.162 -0.6816 4.1 -2.512 1.964 0.554 0.0802 2.555 -2.559 0.4656 1.707 2.756 -3.195 6.418 6.152 2.523 -4.504 0.1776 -4.348 -3.584 -1.464 -2.943 1.319 -3.506 3.17 0.931 -1.698 3.754 3.092 2.836 -3.668 -1.314 -0.836 4.598 0.1907 -1.62 1.014 -2.592 0.04144 1.686 -1.2705 -1.982 -0.9937 -0.1705 -1.353 -3.115 2.158 0.954 1.019 2.498 0.789 -3.646 -0.587 -1.215 -2.244 4.36 -0.386 -0.932 1.7295 -2.705 1.098 -0.8174 -0.376 1.177 2.041 3.816 0.6025 -1.906 -4.066 1.819 0.661 -0.01698 2.49 -4.617 2.05 -2.55 0.06018 -1.648 -0.5664 -0.7236 -1.316 -0.2761 2.826 2.803 -1.946 3.924 2.436 0.5146 2.684 -2.842 0.9697 0.2542 0.06165 0.946 5.746 -2.504 1.875 -0.642 0.1162 -3.084 -0.4434 0.4595 1.48 4.03 0.508 2.227 -2.742 1.286 0.5176 0.536 -1.572 -0.5054 -1.016 0.2267 1.752 -5.76 -6.465 4.65 -1.616 -2.016 -2.828 3.861 -0.363 0.7295 4.355 -1.871 -5.555 -4.863 1.598 1.055 -4. 2.012 2.002 3.045 -1.92 1.443 -0.5664 -0.649 -2.38 -2.988 -4.22 4.39 0.952 5.527 1.76 0.904 ] 32.64008193895124 32.64451828090898 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05690966323716567 x = [-0.0946 -0.0888 -0.07086 ... 0.08984 0.09125 -0.1554 ] y = [-0.08923 -0.0888 -0.07227 ... 
0.0864 0.0899 -0.1519 ] 57.696158093248734 57.69888839105585 -------------------------------------------------------------------------------- checking int_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.06003701090812683 x = [ 0.002035 -0.2515 0.3367 0.1931 0.1311 -0.787 -0.1302 -0.0656 -0.662 0.698 -0.3108 0.1598 -0.2128 -0.1075 0.3403 -0.05444 -0.376 -0.235 -0.04965 -0.7715 1.018 0.1327 -0.0766 -0.3176 -0.0796 0.2164 -0.554 -0.2825 0.2744 -0.1746 -0.2065 -0.3186 -1.071 -0.2942 0.01956 0.574 0.203 -0.2905 0.9233 -0.275 -0.1725 -0.03397 -0.00835 -0.2952 -0.0791 0.3809 -0.2241 0.7393 -0.5234 -0.9453 -0.5303 -0.54 -0.284 -0.002916 0.00841 -0.6377 -0.1578 -0.1422 -0.2812 0.0676 -0.1307 -0.4133 -0.2041 0.681 0.4019 0.1814 0.12164 -0.009415 -0.1356 -0.7144 -0.324 0.8364 -0.04492 0.4194 -0.485 0.363 -0.5527 0.1973 0.2157 -0.3752 0.2554 -0.283 -0.3193 0.1705 -0.0733 -0.1547 0.238 0.4185 0.3345 -0.496 0.3838 0.001597 0.1016 -0.3115 0.2164 -0.05374 -0.3157 -0.07623 -0.2969 -0.07837 0.2642 -0.1346 0.2576 -0.45 -0.2107 -0.54 -0.1504 0.2269 0.007397 -0.1135 -0.4177 -0.02411 -0.4006 -0.2397 0.5854 -0.507 -0.382 -0.05396 0.353 -0.7217 -0.1135 -0.3828 0.08875 0.1395 0.07513 -0.592 0.42 0.2391 -0.01747 -0.2174 -0.3079 0.4526 0.3867 0.05557 0.5903 0.509 -0.348 -0.7817 0.3826 -0.0633 -0.383 -0.3848 0.00672 0.2172 0.0777 -0.841 -0.1682 0.01538 0.2979 -0.2747 0.2245 -0.02353 -0.0499 0.2224 0.2047 0.218 -0.0991 -0.2247 0.7383 0.536 ] y = [ 0.004356 -0.2507 0.3367 0.1929 0.1318 -0.787 -0.1315 -0.0645 -0.661 0.6997 -0.3098 0.1603 -0.212 -0.10693 0.341 -0.05347 -0.3762 -0.2365 -0.04895 -0.771 1.018 0.1337 -0.0772 -0.316 -0.07916 0.2178 -0.553 -0.2825 0.2754 -0.1754 -0.2058 -0.3203 -1.071 -0.2927 0.02052 0.5747 0.2035 -0.2886 0.9253 -0.2742 -0.1721 -0.03323 -0.0096 -0.2952 -0.0792 0.3806 -0.2255 0.7407 -0.522 -0.9434 -0.5303 -0.539 -0.2842 -0.00327 0.00788 -0.638 -0.1578 -0.1411 -0.2822 0.0686 -0.1304 -0.4136 -0.2034 0.684 0.404 0.1824 0.12054 -0.00902 -0.134 -0.715 -0.3232 0.8374 -0.0447 0.4211 -0.4844 0.3638 -0.553 0.1968 0.2173 -0.376 0.2563 -0.2844 -0.319 0.1693 -0.07117 -0.1538 0.2393 0.4197 0.334 -0.4946 0.3843 0.001861 0.10205 -0.3115 0.2186 -0.0551 -0.3162 -0.0756 -0.2964 -0.0791 0.2644 -0.1355 0.2595 -0.45 -0.2092 -0.5396 -0.1517 0.2272 0.007812 -0.1131 -0.4153 -0.02411 -0.3992 -0.2383 0.5864 -0.507 -0.3826 -0.05533 0.3525 -0.72 -0.11395 -0.3818 0.0896 0.14 0.0749 -0.591 0.4211 0.239 -0.01536 -0.2169 -0.3083 0.4531 0.3877 0.05576 0.591 0.5083 -0.3464 -0.7803 0.3848 -0.06174 -0.3816 -0.3848 0.007046 0.2188 0.0782 -0.8394 -0.1663 0.01513 0.2998 -0.2747 0.2255 -0.02345 -0.04883 0.2227 0.207 0.2185 -0.0983 -0.2247 0.738 0.5376 ] 4.791503061226778 4.792162256026435 -------------------------------------------------------------------------------- checking int_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.060138283515349035 x = [ 0.961 -0.183 -0.9346 ... 0.3984 -0.2776 0.672 ] y = [ 0.9355 -0.1846 -0.9346 ... 
0.402 -0.2842 0.6655] 60.542489345309264 60.53889793565383 -------------------------------------------------------------------------------- checking N2_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.41788278102874754 x = [-1.472 -3.793 -0.7573 1.872 0.2445 0.9873 0.1015 -3.426 -4.12 -2.629 -1.587 2.139 2.143 -3.463 2.09 -0.698 4.15 -2.303 2.02 0.375 0.1853 2.662 -2.56 0.38 1.815 2.844 -2.893 6.555 6.207 2.43 -4.484 0.06046 -4.207 -3.617 -1.428 -2.861 1.282 -3.555 3.014 1.005 -1.652 3.867 3.148 2.904 -3.625 -1.247 -0.7754 4.69 0.1371 -1.448 1.188 -2.572 0.2317 1.725 -1.177 -1.966 -0.879 -0.02557 -1.28 -2.99 2.223 1.0625 1.1455 2.613 0.5146 -3.658 -0.6636 -1.044 -2.234 4.406 -0.4111 -1.052 1.693 -2.656 1.146 -0.756 -0.2883 1.257 2.182 3.828 0.598 -1.897 -4.098 1.885 0.6987 -0.1525 2.387 -4.484 2.125 -2.64 0.10803 -1.467 -0.6094 -0.518 -1.356 -0.2141 2.918 2.748 -2.094 4.043 2.328 0.5474 2.564 -2.97 0.912 0.4185 0.007366 0.7656 5.613 -2.488 1.975 -0.6465 0.1418 -3.088 -0.3315 0.494 1.576 3.957 0.558 2.352 -2.658 1.348 0.3516 0.553 -1.686 -0.4407 -1.104 0.2224 1.761 -5.72 -6.348 4.586 -1.573 -2.018 -2.953 3.79 -0.3691 0.766 4.453 -1.956 -5.605 -4.98 1.669 0.987 -3.943 2.076 2.018 2.996 -2.049 1.443 -0.5674 -0.798 -2.406 -3.008 -4.035 4.44 0.9316 5.453 1.838 0.8525 ] y = [-1.469 -3.783 -0.7515 1.876 0.2502 0.9893 0.09766 -3.426 -4.13 -2.629 -1.586 2.137 2.145 -3.467 2.09 -0.699 4.145 -2.303 2.018 0.3757 0.1868 2.664 -2.564 0.3884 1.817 2.848 -2.896 6.562 6.207 2.43 -4.49 0.0626 -4.21 -3.62 -1.427 -2.867 1.278 -3.553 3.021 1.001 -1.652 3.867 3.152 2.895 -3.617 -1.244 -0.7847 4.69 0.1351 -1.454 1.19 -2.57 0.2349 1.732 -1.177 -1.967 -0.884 -0.02484 -1.289 -2.988 2.229 1.059 1.1455 2.611 0.5117 -3.654 -0.6597 -1.043 -2.24 4.395 -0.4077 -1.054 1.7 -2.648 1.1455 -0.753 -0.2888 1.262 2.188 3.83 0.6025 -1.902 -4.098 1.889 0.7017 -0.1489 2.385 -4.49 2.135 -2.64 0.1145 -1.473 -0.604 -0.5205 -1.356 -0.2091 2.918 2.756 -2.1 4.043 2.33 0.552 2.564 -2.97 0.921 0.4211 0.00809 0.7676 5.61 -2.486 1.975 -0.647 0.1401 -3.086 -0.3315 0.4902 1.573 3.953 0.5605 2.352 -2.662 1.35 0.3574 0.5483 -1.687 -0.4292 -1.096 0.2278 1.767 -5.727 -6.348 4.58 -1.575 -2.021 -2.951 3.793 -0.3782 0.764 4.445 -1.96 -5.6 -4.98 1.668 0.987 -3.943 2.08 2.02 2.994 -2.045 1.445 -0.569 -0.8022 -2.41 -3.002 -4.03 4.44 0.927 5.457 1.837 0.848 ] 32.599820468805056 32.60427167088468 -------------------------------------------------------------------------------- checking N2_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.42777257919311523 x = [-2.715 4.137 3.588 -1.571 -0.4573 1.671 -4.184 -1.691 -3.387 -1.455 -1.743 0.706 -5.402 1.826 -3.762 0.66 -1.837 -1.296 3.8 -4.65 0.5957 0.1312 -0.718 -3.377 -1.1875 6.06 -3.26 4.992 6.04 1.624 -4.19 1.818 -0.0888 -2.396 -3.646 0.6255 0.28 0.03293 0.02208 -0.662 2.664 3.227 -2.922 0.627 -3.309 1.322 -0.2017 -0.692 -0.303 0.2467 -0.284 -0.1163 -4.68 -0.367 -2.197 0.3728 -0.602 -0.529 3.64 3.166 -3.406 -0.7446 -2.082 2.861 3.072 -2.645 1.004 5.285 -2.516 1.409 0.2527 -0.1304 4.008 2.438 5.12 0.644 3.148 -2.11 2.096 1.652 -1.787 -1.891 4.027 1.186 1.502 0.3015 -2.06 -4.652 1.8545 0.7695 1.912 -2.824 -0.2827 2.088 -0.3645 1.1455 -2.133 2.477 1.905 2.113 -1.753 -1.16 0.3928 3.969 -0.5415 2.672 1.898 10.23 -2.953 -2.842 -1.355 -2.709 -0.5776 -4.285 2.385 -0.7915 0.4348 -2.494 -2.281 2.258 4.465 -3.068 -0.6045 -4.496 -3.873 -1.46 1.515 0.768 0.352 4.332 2.717 0.3228 0.05807 4.258 
0.383 -1.258 -3.379 1.518 -1.168 -3.072 0.2118 -0.828 1.776 0.909 3.848 1.35 1.508 -2.2 -4.49 0.6606 0.1896 -2.91 -0.5054 -1.554 -4.64 1.016 -2.52 -4.066 -3.746 4.594 ] y = [-2.69 4.15 3.602 -1.542 -0.4365 1.698 -4.168 -1.673 -3.342 -1.422 -1.723 0.717 -5.38 1.833 -3.75 0.6914 -1.812 -1.277 3.828 -4.62 0.614 0.1648 -0.687 -3.355 -1.172 6.066 -3.219 5.016 6.05 1.621 -4.168 1.844 -0.0697 -2.375 -3.635 0.6577 0.298 0.0513 0.04977 -0.6353 2.676 3.258 -2.902 0.6553 -3.275 1.337 -0.1838 -0.663 -0.2888 0.2651 -0.254 -0.09393 -4.656 -0.3528 -2.178 0.3882 -0.582 -0.505 3.66 3.182 -3.393 -0.7163 -2.06 2.885 3.098 -2.633 1.029 5.34 -2.486 1.432 0.2727 -0.1022 4.02 2.477 5.15 0.668 3.168 -2.096 2.125 1.669 -1.764 -1.856 4.047 1.207 1.501 0.32 -2.033 -4.652 1.882 0.7812 1.937 -2.793 -0.2532 2.111 -0.3323 1.178 -2.115 2.5 1.937 2.14 -1.735 -1.145 0.4224 3.994 -0.5195 2.7 1.908 10.266 -2.932 -2.809 -1.344 -2.682 -0.5425 -4.26 2.41 -0.7764 0.4521 -2.463 -2.273 2.268 4.484 -3.041 -0.5757 -4.47 -3.844 -1.439 1.541 0.7837 0.3918 4.35 2.732 0.3464 0.08856 4.297 0.4104 -1.239 -3.355 1.535 -1.13 -3.035 0.2181 -0.8247 1.791 0.931 3.865 1.37 1.538 -2.197 -4.473 0.683 0.2109 -2.877 -0.4788 -1.525 -4.605 1.035 -2.502 -4.055 -3.734 4.613 ] 33.93978242579727 33.938268418689546 -------------------------------------------------------------------------------- checking O_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4162674045562744 x = [-1.484 -3.799 -0.7715 1.852 0.2356 0.9746 0.0776 -3.441 -4.137 -2.635 -1.601 2.102 2.104 -3.475 2.055 -0.7256 4.1 -2.322 1.993 0.3577 0.1691 2.639 -2.582 0.3684 1.785 2.805 -2.9 6.504 6.176 2.406 -4.49 0.0375 -4.22 -3.627 -1.439 -2.875 1.263 -3.559 2.988 0.978 -1.665 3.842 3.115 2.86 -3.62 -1.265 -0.804 4.652 0.1131 -1.473 1.185 -2.598 0.2241 1.703 -1.189 -1.977 -0.8926 -0.0509 -1.29 -2.998 2.195 1.03 1.12 2.584 0.4893 -3.662 -0.68 -1.056 -2.252 4.367 -0.4304 -1.071 1.666 -2.654 1.112 -0.778 -0.3123 1.236 2.15 3.787 0.5815 -1.913 -4.094 1.853 0.678 -0.1693 2.354 -4.492 2.1 -2.654 0.1003 -1.4795 -0.6167 -0.5474 -1.372 -0.2323 2.879 2.727 -2.129 4.008 2.297 0.528 2.535 -2.979 0.8955 0.4055 -0.007362 0.744 5.57 -2.502 1.931 -0.6763 0.1106 -3.105 -0.3638 0.4712 1.548 3.916 0.5386 2.33 -2.66 1.314 0.338 0.5356 -1.713 -0.4487 -1.106 0.2062 1.738 -5.727 -6.344 4.55 -1.582 -2.027 -2.959 3.754 -0.3892 0.7373 4.418 -1.968 -5.605 -4.977 1.646 0.968 -3.941 2.053 1.992 2.969 -2.053 1.424 -0.583 -0.8276 -2.42 -3.014 -4.027 4.418 0.907 5.414 1.812 0.8164 ] y = [-1.4844e+00 -3.7930e+00 -7.7002e-01 1.8525e+00 2.3767e-01 9.7559e-01 7.7148e-02 -3.4375e+00 -4.1406e+00 -2.6367e+00 -1.6006e+00 2.0996e+00 2.1074e+00 -3.4766e+00 2.0508e+00 -7.2363e-01 4.1016e+00 -2.3203e+00 1.9941e+00 3.5571e-01 1.6602e-01 2.6367e+00 -2.5840e+00 3.7158e-01 1.7861e+00 2.8047e+00 -2.9004e+00 6.5039e+00 6.1680e+00 2.4023e+00 -4.4883e+00 3.8025e-02 -4.2188e+00 -3.6309e+00 -1.4336e+00 -2.8770e+00 1.2617e+00 -3.5586e+00 2.9922e+00 9.7656e-01 -1.6621e+00 3.8418e+00 3.1113e+00 2.8574e+00 -3.6172e+00 -1.2637e+00 -8.0176e-01 4.6484e+00 1.1133e-01 -1.4756e+00 1.1807e+00 -2.5918e+00 2.2522e-01 1.7061e+00 -1.1895e+00 -1.9785e+00 -8.9258e-01 -4.9225e-02 -1.2939e+00 -2.9961e+00 2.1973e+00 1.0273e+00 1.1211e+00 2.5859e+00 4.8975e-01 -3.6562e+00 -6.7627e-01 -1.0537e+00 -2.2520e+00 4.3633e+00 -4.2627e-01 -1.0723e+00 1.6670e+00 -2.6543e+00 1.1113e+00 -7.7393e-01 -3.1763e-01 1.2334e+00 2.1484e+00 3.7891e+00 5.8008e-01 -1.9180e+00 -4.0938e+00 1.8545e+00 6.7920e-01 -1.6541e-01 
2.3477e+00 -4.4961e+00 2.0996e+00 -2.6523e+00 1.0187e-01 -1.4844e+00 -6.1475e-01 -5.4834e-01 -1.3721e+00 -2.3108e-01 2.8867e+00 2.7266e+00 -2.1309e+00 4.0039e+00 2.2969e+00 5.2686e-01 2.5371e+00 -2.9824e+00 8.9844e-01 4.0063e-01 -6.4621e-03 7.4658e-01 5.5703e+00 -2.5000e+00 1.9307e+00 -6.7188e-01 1.1096e-01 -3.1074e+00 -3.6450e-01 4.6704e-01 1.5430e+00 3.9141e+00 5.4102e-01 2.3281e+00 -2.6602e+00 1.3145e+00 3.4033e-01 5.3271e-01 -1.7100e+00 -4.4482e-01 -1.1045e+00 2.0837e-01 1.7412e+00 -5.7266e+00 -6.3398e+00 4.5469e+00 -1.5859e+00 -2.0312e+00 -2.9570e+00 3.7539e+00 -3.9453e-01 7.3584e-01 4.4141e+00 -1.9707e+00 -5.6016e+00 -4.9727e+00 1.6455e+00 9.6680e-01 -3.9414e+00 2.0566e+00 1.9902e+00 2.9648e+00 -2.0527e+00 1.4219e+00 -5.8447e-01 -8.3008e-01 -2.4219e+00 -3.0098e+00 -4.0273e+00 4.4180e+00 9.0381e-01 5.4141e+00 1.8135e+00 8.1494e-01] 32.48091641159688 32.47390062543133 -------------------------------------------------------------------------------- checking O_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.03193680250272155 x = [ 0.07684 -0.1573 0.1302 ... -0.3083 -0.2844 -0.1229 ] y = [ 0.0787 -0.1608 0.131 ... -0.3113 -0.285 -0.124 ] 32.469470205885045 32.46936014658948 -------------------------------------------------------------------------------- checking V_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.09918542444705963 x = [ 1.6152e+00 3.8184e-01 7.0459e-01 7.1826e-01 2.4890e-01 6.6455e-01 -5.7617e-01 1.1902e-01 -7.1228e-02 -2.5366e-01 6.5625e-01 -2.4792e-01 5.2393e-01 -3.2324e-01 -6.3037e-01 -1.0504e-01 -1.9482e-01 1.0674e+00 -2.4194e-01 -8.0371e-01 -5.4297e-01 -1.0420e+00 7.0850e-01 8.2715e-01 -4.1016e-01 -4.6509e-01 -4.0161e-01 1.9897e-01 -3.5791e-01 1.1807e+00 -2.0837e-01 1.3008e+00 -2.9956e-01 -3.5742e-01 6.6357e-01 -2.5269e-01 3.2739e-01 -9.4385e-01 1.5674e+00 4.7095e-01 -1.2061e-01 -1.5454e-01 9.2712e-02 6.9763e-02 -6.4404e-01 -3.9819e-01 7.1411e-02 -6.1816e-01 1.3657e-02 -2.0325e-01 -4.4995e-01 3.8916e-01 -3.9478e-01 -7.5977e-01 5.4395e-01 -1.4404e-01 6.4160e-01 5.0293e-01 -4.3677e-01 1.6626e-01 -1.3757e-01 -2.7637e-01 -1.7500e+00 -4.4385e-01 -8.8281e-01 2.9199e-01 2.8125e-01 -4.1138e-01 -8.7500e-01 3.4668e-01 6.7871e-01 9.0674e-01 5.1367e-01 -2.1558e-01 1.1533e+00 3.5034e-02 -1.1533e+00 -4.3884e-02 2.0801e-01 8.0957e-01 2.0911e-01 1.9336e-01 8.3057e-01 5.0928e-01 3.9819e-01 -1.0713e+00 -1.2390e-01 1.0468e-02 7.4890e-02 -5.0439e-01 -5.4297e-01 -3.1885e-01 -2.7295e-01 -2.4426e-01 -6.6711e-02 5.5811e-01 4.8242e-01 1.4441e-01 -7.2412e-01 7.7686e-01 -5.6982e-01 -9.1064e-01 -8.8167e-04 2.9297e-01 -4.5117e-01 -1.6514e+00 6.3818e-01 -4.6997e-01 1.1260e+00 -9.8975e-01 -3.5840e-01 4.0112e-01 -7.2754e-01 -6.9189e-01 -1.0488e+00 2.9980e-01 5.0586e-01 -7.8564e-01 6.4941e-01 -8.6768e-01 6.6833e-02 2.3059e-01 -5.3662e-01 2.3840e-01 -4.3396e-02 -1.1115e-01 6.5723e-01 6.8542e-02 2.6440e-01 -1.4619e+00 -1.1055e+00 4.9164e-02 -5.8105e-01 -1.9989e-02 -2.8296e-01 -3.3838e-01 -4.1089e-01 4.9121e-01 -1.9507e-01 3.1372e-01 -5.5469e-01 -1.5215e+00 -5.7666e-01 -5.7678e-02 -3.7671e-01 -1.0820e+00 2.7368e-01 2.3828e-01 -2.2351e-01 -1.6431e-01 6.6455e-01 -2.8961e-02 5.4590e-01 1.4624e-01 -3.9111e-01 3.1909e-01 -1.4404e-01 -1.0479e+00 -3.1665e-01 -8.0859e-01] y = [ 1.612 0.3816 0.7036 0.7173 0.2515 0.664 -0.5776 0.12024 -0.0702 -0.2537 0.6562 -0.2491 0.5225 -0.32 -0.631 -0.1032 -0.1947 1.065 -0.24 -0.8027 -0.5415 -1.041 0.7114 0.8267 -0.41 -0.4639 -0.401 0.1974 -0.3574 
1.184 -0.2069 1.297 -0.298 -0.3582 0.664 -0.2527 0.3252 -0.9443 1.566 0.4712 -0.1198 -0.1554 0.092 0.07007 -0.645 -0.399 0.0715 -0.6187 0.01377 -0.2058 -0.4497 0.3882 -0.3965 -0.76 0.543 -0.1434 0.6426 0.5044 -0.4377 0.1666 -0.1392 -0.279 -1.75 -0.443 -0.8813 0.292 0.2827 -0.4102 -0.8735 0.3462 0.678 0.9053 0.513 -0.2141 1.154 0.0341 -1.153 -0.04422 0.2076 0.8096 0.2112 0.1926 0.8286 0.508 0.395 -1.068 -0.125 0.010994 0.0765 -0.5034 -0.545 -0.319 -0.2732 -0.2456 -0.0673 0.5557 0.484 0.1451 -0.7256 0.7764 -0.5684 -0.912 -0.002888 0.2942 -0.451 -1.653 0.6387 -0.473 1.125 -0.9883 -0.3604 0.4026 -0.726 -0.6943 -1.049 0.2974 0.5063 -0.7847 0.6484 -0.866 0.06537 0.2308 -0.536 0.2378 -0.04306 -0.11206 0.6543 0.06604 0.2625 -1.458 -1.106 0.04968 -0.582 -0.01915 -0.2827 -0.338 -0.4097 0.49 -0.1964 0.3132 -0.5527 -1.52 -0.5757 -0.05838 -0.3762 -1.081 0.272 0.2408 -0.2191 -0.1632 0.666 -0.02887 0.545 0.146 -0.3892 0.317 -0.1427 -1.048 -0.317 -0.81 ] 7.8395881140089205 7.835368936919 -------------------------------------------------------------------------------- checking V_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.03173801538348198 x = [-2.593e-01 2.377e-01 9.323e-03 ... -4.236e-01 -2.506e-04 8.746e-02] y = [-0.2598 0.2346 0.00894 ... -0.4146 -0.002674 0.0857 ] 32.4214166561096 32.44496165346104 -------------------------------------------------------------------------------- checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.0047469472671218685 x = [-0.0127 0.01668 0.0356 ... 0.002333 -0.0005894 0.02908 ] y = [-0.01268 0.01665 0.0356 ... 0.002333 -0.000606 0.0291 ] 34.065388389420555 34.06563833164641 -------------------------------------------------------------------------------- checking norm_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4167802661657334 x = [-1.187e+00 -3.078e+00 -8.638e-01 1.864e+00 4.172e-01 1.143e+00 -7.339e-01 -3.779e+00 -3.812e+00 -2.850e+00 -1.674e+00 2.514e+00 1.440e+00 -3.545e+00 2.434e+00 -1.256e+00 3.805e+00 -2.498e+00 1.431e+00 4.306e-04 2.847e-01 3.145e+00 -2.225e+00 2.467e-02 2.127e+00 3.459e+00 -2.541e+00 6.391e+00 5.887e+00 2.062e+00 -4.262e+00 -4.707e-01 -3.281e+00 -3.529e+00 -1.235e+00 -3.221e+00 1.310e+00 -3.627e+00 2.963e+00 1.045e+00 -1.728e+00 3.812e+00 3.301e+00 2.762e+00 -4.387e+00 -1.475e+00 -6.699e-01 4.309e+00 3.748e-01 -9.268e-01 1.456e+00 -2.652e+00 5.518e-01 1.364e+00 -1.160e+00 -2.277e+00 -7.734e-01 -3.533e-01 -1.517e+00 -2.273e+00 2.307e+00 7.666e-01 7.422e-01 3.041e+00 2.683e-01 -3.732e+00 -7.451e-01 -7.451e-01 -1.957e+00 4.586e+00 -6.782e-01 -8.452e-01 1.374e+00 -2.400e+00 9.888e-01 -8.203e-01 -5.312e-01 1.260e+00 2.031e+00 3.875e+00 6.831e-01 -2.375e+00 -4.254e+00 1.904e+00 1.136e+00 5.878e-02 2.949e+00 -4.477e+00 2.299e+00 -3.424e+00 2.206e-01 -1.505e+00 -1.076e+00 -7.651e-01 -1.523e+00 -5.962e-01 3.365e+00 2.572e+00 -2.324e+00 3.586e+00 1.718e+00 7.139e-01 1.639e+00 -2.822e+00 8.145e-01 5.869e-01 -1.395e-01 9.116e-01 5.723e+00 -2.338e+00 2.412e+00 -8.364e-01 -2.874e-01 -3.068e+00 -2.041e-01 1.190e-01 1.692e+00 3.969e+00 2.487e-01 2.840e+00 -2.303e+00 1.231e+00 -3.525e-02 4.617e-02 -8.853e-01 -5.225e-01 -9.878e-01 1.989e-01 1.215e+00 -5.766e+00 -6.039e+00 4.820e+00 -1.829e+00 -1.732e+00 -2.754e+00 4.645e+00 -2.974e-01 6.211e-01 4.516e+00 -2.607e+00 -5.645e+00 -4.594e+00 1.458e+00 1.291e+00 -3.812e+00 2.301e+00 2.219e+00 2.514e+00 
-2.205e+00 7.871e-01 -8.003e-01 -9.146e-01 -1.741e+00 -3.121e+00 -3.582e+00 4.582e+00 6.548e-01 5.281e+00 2.434e+00 1.031e+00] y = [-1.1875e+00 -3.0703e+00 -8.5596e-01 1.8594e+00 4.1968e-01 1.1436e+00 -7.3486e-01 -3.7773e+00 -3.8125e+00 -2.8555e+00 -1.6729e+00 2.5117e+00 1.4424e+00 -3.5371e+00 2.4277e+00 -1.2588e+00 3.8066e+00 -2.5000e+00 1.4287e+00 2.9926e-03 2.7490e-01 3.1426e+00 -2.2246e+00 2.7756e-02 2.1328e+00 3.4609e+00 -2.5410e+00 6.3906e+00 5.8789e+00 2.0605e+00 -4.2617e+00 -4.6875e-01 -3.2832e+00 -3.5312e+00 -1.2305e+00 -3.2246e+00 1.3086e+00 -3.6270e+00 2.9648e+00 1.0449e+00 -1.7236e+00 3.8086e+00 3.2949e+00 2.7617e+00 -4.3867e+00 -1.4775e+00 -6.6602e-01 4.3047e+00 3.7769e-01 -9.3115e-01 1.4531e+00 -2.6445e+00 5.5664e-01 1.3682e+00 -1.1650e+00 -2.2734e+00 -7.7637e-01 -3.5645e-01 -1.5146e+00 -2.2656e+00 2.3086e+00 7.6172e-01 7.3975e-01 3.0410e+00 2.6807e-01 -3.7266e+00 -7.4219e-01 -7.3828e-01 -1.9551e+00 4.5820e+00 -6.7529e-01 -8.4424e-01 1.3799e+00 -2.4004e+00 9.9609e-01 -8.1641e-01 -5.3955e-01 1.2529e+00 2.0332e+00 3.8770e+00 6.8506e-01 -2.3828e+00 -4.2539e+00 1.9072e+00 1.1377e+00 6.5430e-02 2.9473e+00 -4.4766e+00 2.2988e+00 -3.4199e+00 2.2717e-01 -1.5059e+00 -1.0752e+00 -7.6562e-01 -1.5234e+00 -5.9570e-01 3.3711e+00 2.5703e+00 -2.3223e+00 3.5840e+00 1.7207e+00 7.1387e-01 1.6436e+00 -2.8262e+00 8.2080e-01 5.8252e-01 -1.3647e-01 9.1260e-01 5.7266e+00 -2.3379e+00 2.4160e+00 -8.3447e-01 -2.8735e-01 -3.0723e+00 -2.0471e-01 1.1658e-01 1.6865e+00 3.9609e+00 2.5098e-01 2.8359e+00 -2.3027e+00 1.2236e+00 -3.3691e-02 3.8818e-02 -8.8135e-01 -5.2002e-01 -9.8389e-01 1.9824e-01 1.2188e+00 -5.7656e+00 -6.0430e+00 4.8203e+00 -1.8330e+00 -1.7354e+00 -2.7500e+00 4.6406e+00 -3.0518e-01 6.2158e-01 4.5078e+00 -2.6094e+00 -5.6406e+00 -4.5859e+00 1.4551e+00 1.2881e+00 -3.8125e+00 2.3086e+00 2.2207e+00 2.5078e+00 -2.2031e+00 7.8516e-01 -8.0176e-01 -9.1602e-01 -1.7393e+00 -3.1172e+00 -3.5801e+00 4.5820e+00 6.5186e-01 5.2773e+00 2.4375e+00 1.0264e+00] 32.443772890929615 32.434425335855806 -------------------------------------------------------------------------------- checking norm_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.41970342636108404 x = [-2.5000e+00 4.1406e+00 3.5352e+00 -1.6670e+00 -1.8262e-01 1.8623e+00 -3.9219e+00 -1.4141e+00 -3.3809e+00 -1.1211e+00 -1.7246e+00 7.2705e-01 -5.3711e+00 1.5400e+00 -3.5762e+00 9.0771e-01 -1.8262e+00 -1.1387e+00 3.5176e+00 -4.6445e+00 4.8657e-01 3.7817e-01 -4.0332e-01 -3.2207e+00 -1.3232e+00 6.1016e+00 -3.1836e+00 4.7852e+00 5.7734e+00 1.4785e+00 -4.0859e+00 1.9209e+00 -4.6417e-02 -2.1211e+00 -3.7246e+00 9.1406e-01 7.4280e-02 -3.7750e-02 -1.8567e-01 -6.6602e-01 2.7227e+00 3.2598e+00 -3.1426e+00 7.4414e-01 -2.8359e+00 1.0879e+00 -4.6387e-01 -1.0176e+00 -4.3652e-01 2.2620e-01 -1.3306e-01 4.3854e-02 -4.8281e+00 -3.5474e-01 -2.1348e+00 2.5415e-01 -5.3271e-01 -5.7861e-01 3.5801e+00 3.3906e+00 -3.0293e+00 -5.1758e-01 -2.0469e+00 2.7207e+00 3.5469e+00 -2.4844e+00 8.9111e-01 5.0664e+00 -2.4336e+00 1.3281e+00 3.5840e-01 -9.2621e-03 3.7012e+00 2.6562e+00 5.1875e+00 3.2349e-01 3.2090e+00 -2.0312e+00 2.0625e+00 1.6973e+00 -1.8125e+00 -2.0332e+00 3.7734e+00 1.1299e+00 1.5996e+00 -3.9948e-02 -1.9404e+00 -4.6445e+00 1.8447e+00 7.2705e-01 1.9941e+00 -2.6621e+00 -3.4033e-01 1.9824e+00 -4.3091e-01 1.5908e+00 -2.1641e+00 2.1758e+00 1.7314e+00 2.2285e+00 -1.8672e+00 -9.6826e-01 6.6016e-01 4.1211e+00 -5.1611e-01 2.8418e+00 1.9434e+00 9.9766e+00 -3.0723e+00 -2.6836e+00 -1.4873e+00 -2.6602e+00 -5.3662e-01 -3.6074e+00 
2.1523e+00 -1.0703e+00 4.0479e-01 -2.2734e+00 -2.2012e+00 2.3164e+00 4.2031e+00 -2.9219e+00 -6.3721e-01 -4.4062e+00 -3.6953e+00 -1.4590e+00 1.8164e+00 8.3643e-01 -3.7903e-02 4.3945e+00 2.7090e+00 8.0566e-02 -3.3783e-02 4.4375e+00 3.7134e-01 -9.8242e-01 -3.2344e+00 1.6865e+00 -1.1885e+00 -2.7637e+00 -1.0022e-01 -1.1797e+00 2.1914e+00 8.1982e-01 3.7422e+00 1.3799e+00 1.6299e+00 -2.1113e+00 -4.5781e+00 8.1787e-01 5.7343e-02 -3.2910e+00 -5.0391e-01 -1.4609e+00 -3.9766e+00 8.8574e-01 -2.5566e+00 -3.7930e+00 -3.5078e+00 4.3984e+00] y = [-2.4980e+00 4.1250e+00 3.5273e+00 -1.6611e+00 -1.8530e-01 1.8555e+00 -3.9238e+00 -1.4170e+00 -3.3594e+00 -1.1133e+00 -1.7207e+00 7.1582e-01 -5.3672e+00 1.5293e+00 -3.5781e+00 9.1064e-01 -1.8350e+00 -1.1455e+00 3.5312e+00 -4.6406e+00 4.8657e-01 3.8159e-01 -3.9771e-01 -3.2227e+00 -1.3252e+00 6.0977e+00 -3.1660e+00 4.7891e+00 5.7656e+00 1.4639e+00 -4.0859e+00 1.9277e+00 -4.7119e-02 -2.1250e+00 -3.7344e+00 9.1797e-01 7.1716e-02 -4.3701e-02 -1.7871e-01 -6.6846e-01 2.7148e+00 3.2695e+00 -3.1543e+00 7.5049e-01 -2.8281e+00 1.0820e+00 -4.7363e-01 -1.0264e+00 -4.3433e-01 2.2693e-01 -1.2512e-01 4.8126e-02 -4.8242e+00 -3.6328e-01 -2.1387e+00 2.4756e-01 -5.2539e-01 -5.8008e-01 3.5742e+00 3.3945e+00 -3.0352e+00 -5.1709e-01 -2.0508e+00 2.7227e+00 3.5645e+00 -2.4922e+00 8.9551e-01 5.0898e+00 -2.4238e+00 1.3271e+00 3.5425e-01 3.5763e-03 3.6855e+00 2.6621e+00 5.2031e+00 3.2129e-01 3.2070e+00 -2.0352e+00 2.0645e+00 1.6914e+00 -1.8105e+00 -2.0254e+00 3.7715e+00 1.1367e+00 1.5781e+00 -3.6194e-02 -1.9355e+00 -4.6641e+00 1.8418e+00 7.1924e-01 1.9941e+00 -2.6543e+00 -3.3252e-01 1.9824e+00 -4.1919e-01 1.5918e+00 -2.1777e+00 2.1797e+00 1.7402e+00 2.2246e+00 -1.8701e+00 -9.7559e-01 6.5674e-01 4.1211e+00 -5.1367e-01 2.8535e+00 1.9385e+00 9.9922e+00 -3.0801e+00 -2.6719e+00 -1.5098e+00 -2.6562e+00 -5.2588e-01 -3.6035e+00 2.1602e+00 -1.0713e+00 4.0186e-01 -2.2793e+00 -2.2051e+00 2.3145e+00 4.1992e+00 -2.9141e+00 -6.2402e-01 -4.4023e+00 -3.6836e+00 -1.4531e+00 1.8193e+00 8.2422e-01 -3.2501e-02 4.3906e+00 2.7012e+00 8.1909e-02 -2.8030e-02 4.4492e+00 3.7622e-01 -9.8145e-01 -3.2305e+00 1.6865e+00 -1.1826e+00 -2.7500e+00 -1.1414e-01 -1.1904e+00 2.1797e+00 8.2031e-01 3.7363e+00 1.3770e+00 1.6416e+00 -2.1270e+00 -4.5742e+00 8.1885e-01 5.6335e-02 -3.2793e+00 -4.9756e-01 -1.4502e+00 -3.9688e+00 8.8232e-01 -2.5645e+00 -3.8008e+00 -3.5176e+00 4.3984e+00] 33.26895718280028 33.27212421390332 -------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.40883615493774417 x = [-1.15 -3.023 -0.85 1.852 0.3843 1.107 -0.7236 -3.705 -3.756 -2.791 -1.646 2.447 1.403 -3.479 2.387 -1.23 3.729 -2.459 1.405 0.01991 0.2708 3.082 -2.188 0.02231 2.082 3.398 -2.514 6.266 5.758 2.02 -4.188 -0.461 -3.229 -3.451 -1.212 -3.146 1.297 -3.572 2.904 1.013 -1.699 3.73 3.254 2.705 -4.32 -1.462 -0.6675 4.215 0.3606 -0.9204 1.422 -2.594 0.5283 1.325 -1.1455 -2.236 -0.7637 -0.3384 -1.486 -2.227 2.254 0.743 0.7207 2.98 0.2642 -3.646 -0.725 -0.751 -1.909 4.496 -0.665 -0.825 1.371 -2.348 0.9814 -0.7935 -0.5156 1.246 1.987 3.795 0.6797 -2.318 -4.17 1.863 1.096 0.0635 2.875 -4.395 2.256 -3.37 0.1987 -1.486 -1.059 -0.741 -1.504 -0.584 3.299 2.508 -2.271 3.531 1.684 0.694 1.599 -2.781 0.8022 0.573 -0.1431 0.88 5.62 -2.297 2.357 -0.8022 -0.2874 -3.004 -0.215 0.11707 1.667 3.883 0.244 2.791 -2.271 1.211 -0.0319 0.0324 -0.855 -0.5156 -0.965 0.1996 1.19 -5.65 -5.934 4.746 -1.801 -1.698 -2.705 4.55 
-0.295 0.5947 4.426 -2.555 -5.55 -4.51 1.419 1.266 -3.748 2.246 2.164 2.473 -2.168 0.784 -0.7954 -0.8867 -1.717 -3.084 -3.523 4.516 0.6445 5.2 2.389 1.01 ] y = [-1.152 -3.016 -0.8457 1.85 0.3877 1.109 -0.726 -3.707 -3.758 -2.799 -1.642 2.441 1.407 -3.473 2.379 -1.235 3.732 -2.459 1.404 0.02034 0.2603 3.084 -2.19 0.02908 2.086 3.404 -2.512 6.266 5.754 2.02 -4.188 -0.459 -3.229 -3.455 -1.207 -3.152 1.296 -3.57 2.906 1.013 -1.695 3.73 3.25 2.707 -4.316 -1.465 -0.664 4.21 0.363 -0.9263 1.418 -2.59 0.5347 1.328 -1.154 -2.236 -0.763 -0.3403 -1.487 -2.223 2.258 0.742 0.718 2.982 0.263 -3.646 -0.7227 -0.7427 -1.91 4.49 -0.663 -0.828 1.376 -2.35 0.9873 -0.792 -0.525 1.239 1.987 3.799 0.681 -2.328 -4.176 1.865 1.099 0.0705 2.873 -4.39 2.256 -3.371 0.204 -1.485 -1.056 -0.7437 -1.504 -0.5854 3.305 2.51 -2.273 3.527 1.685 0.693 1.602 -2.785 0.8076 0.569 -0.1377 0.8813 5.625 -2.297 2.363 -0.8013 -0.2876 -3.01 -0.2151 0.1124 1.656 3.877 0.244 2.787 -2.275 1.2 -0.02858 0.02516 -0.849 -0.5146 -0.9624 0.2009 1.192 -5.652 -5.94 4.746 -1.804 -1.701 -2.705 4.547 -0.3027 0.597 4.414 -2.557 -5.55 -4.504 1.416 1.263 -3.748 2.254 2.164 2.465 -2.164 0.7803 -0.7974 -0.8877 -1.72 -3.084 -3.525 4.516 0.6396 5.203 2.393 1.004 ] 31.835674936613813 31.83609010864245 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05472901145974174 x = [-0.03403 -0.09045 -0.005608 ... 0.145 0.1098 -0.1343 ] y = [-0.03174 -0.0887 -0.0068 ... 0.1405 0.1103 -0.1335 ] 55.42179343894478 55.418071384243376 -------------------------------------------------------------------------------- checking int_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05928250312805176 x = [ 0.1028 -0.2603 0.258 0.1741 0.1754 -0.7676 -0.0672 -0.02275 -0.7188 0.7075 -0.3713 0.1272 -0.1615 -0.1159 0.3528 -0.0967 -0.3567 -0.2239 -0.0988 -0.7515 0.9385 0.0617 -0.1117 -0.285 -0.1312 0.228 -0.4814 -0.245 0.4534 -0.1454 -0.2554 -0.2358 -1.081 -0.3225 0.04288 0.5386 0.2192 -0.276 0.9004 -0.3762 -0.2314 -0.01206 -0.06052 -0.2825 0.01526 0.306 -0.1825 0.7734 -0.4888 -0.865 -0.5356 -0.4954 -0.3364 0.05328 -0.02266 -0.731 -0.2262 -0.2222 -0.1975 0.1346 -0.0982 -0.348 -0.2439 0.695 0.3108 0.2124 0.1553 0.02013 -0.1462 -0.6616 -0.2405 0.828 -0.01004 0.3958 -0.4194 0.286 -0.549 0.2917 0.1952 -0.3074 0.2472 -0.222 -0.2634 0.222 -0.1566 -0.1171 0.2185 0.355 0.287 -0.4836 0.3591 -0.005424 0.08264 -0.2993 0.1942 -0.0763 -0.2803 -0.04727 -0.3623 -0.1365 0.237 -0.1141 0.243 -0.45 -0.2222 -0.5596 -0.1536 0.2152 0.02403 -0.0886 -0.3906 -0.04373 -0.3953 -0.2346 0.63 -0.4382 -0.3306 -0.1344 0.3113 -0.765 -0.08673 -0.378 0.1655 0.1843 -0.006615 -0.576 0.4973 0.266 -0.02518 -0.2715 -0.396 0.4758 0.4155 0.05954 0.4802 0.5156 -0.3398 -0.7563 0.387 -0.0703 -0.2832 -0.3433 0.04776 0.1758 0.0541 -0.798 -0.1191 0.0886 0.269 -0.2109 0.2438 -0.007133 -0.01525 0.2421 0.253 0.1855 -0.05383 -0.2297 0.6816 0.5474 ] y = [ 0.1011 -0.26 0.258 0.1743 0.1755 -0.768 -0.0704 -0.02217 -0.719 0.7075 -0.3728 0.1274 -0.1624 -0.1164 0.3547 -0.0962 -0.355 -0.2258 -0.09845 -0.753 0.9385 0.06146 -0.1121 -0.2844 -0.1299 0.2294 -0.4824 -0.2441 0.4531 -0.147 -0.2559 -0.2366 -1.08 -0.3225 0.04254 0.5386 0.2185 -0.2747 0.9014 -0.377 -0.2316 -0.013115 -0.06134 -0.2834 0.01463 0.3052 -0.1844 0.774 -0.4893 -0.8643 -0.536 -0.4968 -0.3367 0.05374 -0.02484 -0.732 -0.2261 -0.2229 -0.1981 0.1322 -0.0982 -0.3486 
-0.2446 0.6963 0.3118 0.213 0.155 0.02032 -0.1455 -0.661 -0.24 0.829 -0.010544 0.3955 -0.42 0.2842 -0.5483 0.2927 0.1958 -0.3086 0.2465 -0.2233 -0.2634 0.2218 -0.1572 -0.11597 0.2181 0.3557 0.2861 -0.4844 0.3604 -0.00718 0.0839 -0.2993 0.1974 -0.0763 -0.2812 -0.0469 -0.3635 -0.136 0.2368 -0.115 0.2441 -0.4512 -0.2233 -0.56 -0.1542 0.2151 0.02538 -0.08887 -0.3918 -0.04294 -0.3926 -0.2323 0.631 -0.4387 -0.33 -0.1329 0.31 -0.764 -0.0869 -0.378 0.1661 0.1852 -0.00737 -0.5767 0.4973 0.2656 -0.02518 -0.2712 -0.397 0.4744 0.4153 0.05823 0.4805 0.5166 -0.3413 -0.7573 0.3867 -0.0693 -0.2827 -0.343 0.04712 0.1755 0.05475 -0.797 -0.12085 0.08923 0.2693 -0.2114 0.2428 -0.00825 -0.01507 0.2405 0.2546 0.1843 -0.05377 -0.2308 0.681 0.5474 ] 4.6972147077660305 4.699324738150734 -------------------------------------------------------------------------------- checking int_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05772237277077511 x = [ 0.8657 -0.1836 -0.8804 ... 0.418 -0.2566 0.576 ] y = [ 0.847 -0.1833 -0.8726 ... 0.4163 -0.2546 0.583 ] 58.07821475995028 58.067305176790335 -------------------------------------------------------------------------------- checking N2_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4090537786483765 x = [-1.1074e+00 -2.8301e+00 -8.0029e-01 1.8340e+00 4.1797e-01 1.1006e+00 -7.5781e-01 -3.8340e+00 -3.6680e+00 -2.7168e+00 -1.5576e+00 2.4883e+00 1.4375e+00 -3.4551e+00 2.2949e+00 -1.2334e+00 3.7676e+00 -2.2598e+00 1.4551e+00 -1.4917e-01 3.5645e-01 3.1836e+00 -2.1719e+00 -7.8308e-02 2.1816e+00 3.4980e+00 -2.2109e+00 6.4023e+00 5.7969e+00 1.9219e+00 -4.1797e+00 -5.9424e-01 -3.0938e+00 -3.4512e+00 -1.2021e+00 -3.0586e+00 1.2471e+00 -3.6250e+00 2.7500e+00 1.0742e+00 -1.6611e+00 3.8652e+00 3.3027e+00 2.7598e+00 -4.2656e+00 -1.3818e+00 -6.1572e-01 4.3008e+00 3.1519e-01 -7.5928e-01 1.5850e+00 -2.5996e+00 7.2314e-01 1.3809e+00 -1.0654e+00 -2.2148e+00 -6.3721e-01 -1.8921e-01 -1.4229e+00 -2.0977e+00 2.3242e+00 8.3105e-01 8.5400e-01 3.0742e+00 4.8332e-03 -3.6602e+00 -7.9443e-01 -5.8740e-01 -1.8994e+00 4.5273e+00 -6.8506e-01 -9.5117e-01 1.3418e+00 -2.3086e+00 1.0215e+00 -7.2314e-01 -4.3481e-01 1.3330e+00 2.1367e+00 3.8047e+00 6.6748e-01 -2.3145e+00 -4.1836e+00 1.9404e+00 1.1357e+00 -5.9662e-02 2.7773e+00 -4.2500e+00 2.3184e+00 -3.4629e+00 2.4963e-01 -1.3252e+00 -1.0908e+00 -5.2734e-01 -1.5283e+00 -5.1367e-01 3.4023e+00 2.4668e+00 -2.4453e+00 3.6641e+00 1.5703e+00 7.2998e-01 1.4678e+00 -2.9004e+00 7.1924e-01 7.2559e-01 -1.5833e-01 6.7041e-01 5.4961e+00 -2.2812e+00 2.4648e+00 -8.2520e-01 -2.8564e-01 -2.9961e+00 -1.2646e-01 1.4844e-01 1.7520e+00 3.8145e+00 2.9932e-01 2.9023e+00 -2.1875e+00 1.2549e+00 -2.0642e-01 3.8330e-02 -9.4922e-01 -4.2285e-01 -1.0361e+00 2.1436e-01 1.2070e+00 -5.6523e+00 -5.8242e+00 4.6680e+00 -1.7695e+00 -1.6953e+00 -2.8340e+00 4.5000e+00 -3.1519e-01 6.2012e-01 4.5000e+00 -2.6348e+00 -5.5898e+00 -4.6484e+00 1.4961e+00 1.2041e+00 -3.7168e+00 2.3105e+00 2.1641e+00 2.4297e+00 -2.2852e+00 7.9297e-01 -7.8760e-01 -1.0322e+00 -1.7607e+00 -3.0957e+00 -3.3320e+00 4.5781e+00 6.3379e-01 5.1172e+00 2.4414e+00 9.5508e-01] y = [-1.108e+00 -2.822e+00 -7.964e-01 1.833e+00 4.226e-01 1.104e+00 -7.607e-01 -3.836e+00 -3.672e+00 -2.725e+00 -1.552e+00 2.484e+00 1.438e+00 -3.449e+00 2.289e+00 -1.238e+00 3.770e+00 -2.260e+00 1.455e+00 -1.503e-01 3.464e-01 3.186e+00 -2.176e+00 -7.074e-02 2.186e+00 3.504e+00 -2.211e+00 6.406e+00 5.793e+00 1.921e+00 -4.180e+00 
-5.918e-01 -3.096e+00 -3.455e+00 -1.199e+00 -3.066e+00 1.245e+00 -3.623e+00 2.752e+00 1.075e+00 -1.658e+00 3.863e+00 3.299e+00 2.760e+00 -4.262e+00 -1.384e+00 -6.123e-01 4.297e+00 3.181e-01 -7.656e-01 1.582e+00 -2.594e+00 7.305e-01 1.385e+00 -1.074e+00 -2.215e+00 -6.357e-01 -1.907e-01 -1.425e+00 -2.094e+00 2.328e+00 8.306e-01 8.511e-01 3.076e+00 4.051e-03 -3.658e+00 -7.930e-01 -5.796e-01 -1.900e+00 4.520e+00 -6.821e-01 -9.541e-01 1.347e+00 -2.309e+00 1.026e+00 -7.212e-01 -4.441e-01 1.327e+00 2.137e+00 3.807e+00 6.689e-01 -2.322e+00 -4.184e+00 1.943e+00 1.139e+00 -5.191e-02 2.775e+00 -4.246e+00 2.318e+00 -3.463e+00 2.551e-01 -1.324e+00 -1.088e+00 -5.288e-01 -1.527e+00 -5.151e-01 3.408e+00 2.469e+00 -2.445e+00 3.660e+00 1.570e+00 7.285e-01 1.471e+00 -2.904e+00 7.261e-01 7.207e-01 -1.528e-01 6.714e-01 5.500e+00 -2.281e+00 2.471e+00 -8.252e-01 -2.866e-01 -3.002e+00 -1.259e-01 1.429e-01 1.741e+00 3.807e+00 3.003e-01 2.896e+00 -2.189e+00 1.245e+00 -2.039e-01 3.064e-02 -9.434e-01 -4.216e-01 -1.033e+00 2.150e-01 1.210e+00 -5.656e+00 -5.832e+00 4.668e+00 -1.771e+00 -1.699e+00 -2.834e+00 4.496e+00 -3.228e-01 6.230e-01 4.492e+00 -2.639e+00 -5.586e+00 -4.641e+00 1.490e+00 1.201e+00 -3.715e+00 2.316e+00 2.164e+00 2.422e+00 -2.283e+00 7.876e-01 -7.900e-01 -1.034e+00 -1.763e+00 -3.096e+00 -3.334e+00 4.578e+00 6.294e-01 5.117e+00 2.445e+00 9.478e-01] 31.86250959373151 31.860904300551045 -------------------------------------------------------------------------------- checking N2_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4163241291046143 x = [-2.4648e+00 3.8926e+00 3.4863e+00 -1.6045e+00 -7.8506e-03 1.9814e+00 -3.8047e+00 -1.2422e+00 -3.4355e+00 -1.0898e+00 -1.7256e+00 6.5283e-01 -5.3789e+00 1.3213e+00 -3.3418e+00 1.0391e+00 -1.8955e+00 -1.1191e+00 3.3516e+00 -4.6562e+00 5.0244e-01 6.0693e-01 -2.8394e-01 -3.0703e+00 -1.4824e+00 6.0977e+00 -3.0820e+00 4.5078e+00 5.5312e+00 1.4492e+00 -4.0977e+00 1.8330e+00 -8.7402e-02 -2.0332e+00 -3.7285e+00 1.0791e+00 -3.2227e-02 -8.4351e-02 -4.0503e-01 -8.8037e-01 2.8438e+00 3.2832e+00 -3.1934e+00 7.8027e-01 -2.7402e+00 1.0986e+00 -5.7373e-01 -1.2227e+00 -4.7144e-01 2.8467e-01 -9.2316e-03 1.2537e-01 -4.8828e+00 -4.4873e-01 -1.9697e+00 2.9907e-01 -6.3770e-01 -6.4111e-01 3.4805e+00 3.5000e+00 -2.6211e+00 -1.9751e-01 -2.1895e+00 2.5605e+00 3.7520e+00 -2.4590e+00 8.2568e-01 4.8984e+00 -2.3555e+00 1.2861e+00 3.2056e-01 -6.1859e-02 3.6738e+00 2.6992e+00 5.1797e+00 1.9373e-01 3.2305e+00 -1.9004e+00 1.9238e+00 1.5957e+00 -1.9336e+00 -2.3203e+00 3.6621e+00 1.1621e+00 1.6445e+00 -2.1838e-01 -1.8896e+00 -4.5195e+00 1.8486e+00 6.9287e-01 2.0020e+00 -2.5156e+00 -4.2505e-01 1.9141e+00 -4.6729e-01 1.8984e+00 -2.3184e+00 2.0625e+00 1.6787e+00 2.2383e+00 -1.8262e+00 -9.3066e-01 6.9482e-01 4.1797e+00 -4.4580e-01 2.8633e+00 2.0684e+00 9.7891e+00 -3.1953e+00 -2.5078e+00 -1.3955e+00 -2.7227e+00 -5.1709e-01 -3.3125e+00 2.1191e+00 -1.2500e+00 3.0542e-01 -2.1348e+00 -2.2422e+00 2.1816e+00 4.1250e+00 -2.8965e+00 -6.5137e-01 -4.3906e+00 -3.5840e+00 -1.3535e+00 1.9268e+00 8.2031e-01 -2.9346e-01 4.4766e+00 2.5371e+00 2.4124e-02 -1.3092e-02 4.4727e+00 2.6953e-01 -8.6035e-01 -3.2285e+00 1.8896e+00 -1.0195e+00 -2.6230e+00 -9.7229e-02 -1.2959e+00 2.4023e+00 6.6943e-01 3.6484e+00 1.3232e+00 1.5391e+00 -2.1738e+00 -4.6250e+00 9.6387e-01 -4.8309e-02 -3.4453e+00 -4.9414e-01 -1.5977e+00 -3.5742e+00 8.1494e-01 -2.6680e+00 -3.7930e+00 -3.3242e+00 4.2344e+00] y = [-2.4648e+00 3.8828e+00 3.4785e+00 -1.5967e+00 -1.1665e-02 1.9775e+00 
-3.8066e+00 -1.2441e+00 -3.4160e+00 -1.0840e+00 -1.7227e+00 6.4697e-01 -5.3789e+00 1.3115e+00 -3.3438e+00 1.0410e+00 -1.9033e+00 -1.1211e+00 3.3652e+00 -4.6484e+00 4.9927e-01 6.1523e-01 -2.7686e-01 -3.0723e+00 -1.4844e+00 6.0938e+00 -3.0645e+00 4.5078e+00 5.5234e+00 1.4385e+00 -4.0977e+00 1.8418e+00 -8.9111e-02 -2.0352e+00 -3.7402e+00 1.0820e+00 -3.2562e-02 -8.8562e-02 -3.9819e-01 -8.7988e-01 2.8340e+00 3.2930e+00 -3.2070e+00 7.8662e-01 -2.7324e+00 1.0938e+00 -5.8301e-01 -1.2324e+00 -4.6851e-01 2.8369e-01 -9.3365e-04 1.2500e-01 -4.8789e+00 -4.5776e-01 -1.9814e+00 2.9590e-01 -6.3037e-01 -6.4160e-01 3.4785e+00 3.5020e+00 -2.6270e+00 -1.9910e-01 -2.1895e+00 2.5625e+00 3.7676e+00 -2.4668e+00 8.3057e-01 4.9219e+00 -2.3477e+00 1.2812e+00 3.1445e-01 -4.9530e-02 3.6602e+00 2.7051e+00 5.1953e+00 1.9067e-01 3.2285e+00 -1.9082e+00 1.9258e+00 1.5898e+00 -1.9326e+00 -2.3145e+00 3.6582e+00 1.1709e+00 1.6240e+00 -2.1106e-01 -1.8828e+00 -4.5352e+00 1.8438e+00 6.8701e-01 2.0000e+00 -2.5078e+00 -4.1968e-01 1.9160e+00 -4.5728e-01 1.8994e+00 -2.3301e+00 2.0645e+00 1.6846e+00 2.2344e+00 -1.8301e+00 -9.3896e-01 6.8945e-01 4.1836e+00 -4.4727e-01 2.8730e+00 2.0625e+00 9.8047e+00 -3.2031e+00 -2.5020e+00 -1.4199e+00 -2.7188e+00 -5.0684e-01 -3.3105e+00 2.1250e+00 -1.2490e+00 3.0078e-01 -2.1406e+00 -2.2480e+00 2.1797e+00 4.1172e+00 -2.8906e+00 -6.3770e-01 -4.3906e+00 -3.5742e+00 -1.3467e+00 1.9297e+00 8.0762e-01 -2.8589e-01 4.4727e+00 2.5332e+00 2.6352e-02 -7.0305e-03 4.4844e+00 2.7637e-01 -8.5938e-01 -3.2207e+00 1.8906e+00 -1.0117e+00 -2.6094e+00 -1.1090e-01 -1.3066e+00 2.3926e+00 6.6846e-01 3.6426e+00 1.3223e+00 1.5498e+00 -2.1895e+00 -4.6211e+00 9.6143e-01 -4.8523e-02 -3.4316e+00 -4.8584e-01 -1.5879e+00 -3.5664e+00 8.1055e-01 -2.6777e+00 -3.8008e+00 -3.3359e+00 4.2383e+00] 32.863705704610005 32.86973879444469 -------------------------------------------------------------------------------- checking O_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4080454063415528 x = [-1.128 -2.844 -0.8193 1.815 0.4023 1.086 -0.776 -3.846 -3.686 -2.729 -1.571 2.455 1.404 -3.465 2.258 -1.257 3.73 -2.275 1.438 -0.169 0.337 3.158 -2.195 -0.08984 2.15 3.457 -2.225 6.355 5.76 1.9 -4.188 -0.616 -3.107 -3.467 -1.211 -3.07 1.231 -3.633 2.727 1.051 -1.677 3.842 3.268 2.729 -4.266 -1.402 -0.638 4.266 0.295 -0.7847 1.577 -2.621 0.7134 1.358 -1.077 -2.227 -0.6484 -0.2114 -1.431 -2.11 2.297 0.809 0.8296 3.05 -0.014755 -3.666 -0.811 -0.6025 -1.916 4.496 -0.7036 -0.969 1.311 -2.316 0.9927 -0.7456 -0.4604 1.311 2.105 3.77 0.6455 -2.33 -4.18 1.911 1.113 -0.0757 2.742 -4.258 2.285 -3.477 0.2366 -1.338 -1.1 -0.5522 -1.545 -0.5327 3.37 2.447 -2.473 3.633 1.546 0.706 1.442 -2.914 0.7007 0.7075 -0.176 0.651 5.46 -2.297 2.426 -0.8477 -0.3103 -3.016 -0.1566 0.1252 1.724 3.78 0.2805 2.88 -2.191 1.224 -0.2245 0.0246 -0.97 -0.4348 -1.048 0.1946 1.185 -5.66 -5.83 4.637 -1.781 -1.709 -2.844 4.46 -0.334 0.595 4.47 -2.645 -5.594 -4.652 1.478 1.184 -3.719 2.29 2.137 2.404 -2.291 0.776 -0.8013 -1.058 -1.775 -3.11 -3.33 4.555 0.6157 5.08 2.42 0.922 ] y = [-1.129 -2.836 -0.814 1.812 0.4094 1.088 -0.7773 -3.85 -3.686 -2.734 -1.563 2.451 1.409 -3.455 2.254 -1.26 3.732 -2.275 1.437 -0.1694 0.3267 3.16 -2.197 -0.0834 2.156 3.46 -2.22 6.36 5.758 1.899 -4.19 -0.6113 -3.107 -3.465 -1.208 -3.074 1.228 -3.627 2.727 1.052 -1.67 3.838 3.262 2.727 -4.26 -1.402 -0.6313 4.26 0.298 -0.7886 1.571 -2.615 0.7188 1.363 -1.086 -2.225 -0.6475 -0.2133 -1.432 -2.104 2.3 0.8057 0.826 3.055 -0.01511 -3.662 -0.8076 -0.592 
-1.915 4.492 -0.699 -0.9717 1.316 -2.316 0.997 -0.7446 -0.469 1.303 2.104 3.77 0.648 -2.338 -4.18 1.913 1.117 -0.0674 2.742 -4.254 2.287 -3.47 0.2445 -1.338 -1.098 -0.5537 -1.543 -0.5356 3.379 2.447 -2.47 3.63 1.546 0.707 1.447 -2.918 0.708 0.702 -0.1703 0.652 5.465 -2.297 2.432 -0.847 -0.31 -3.021 -0.1544 0.1217 1.714 3.773 0.2825 2.875 -2.193 1.214 -0.2203 0.014824 -0.968 -0.4355 -1.044 0.1987 1.188 -5.66 -5.836 4.64 -1.783 -1.71 -2.84 4.453 -0.3408 0.5957 4.457 -2.645 -5.59 -4.645 1.474 1.183 -3.719 2.295 2.137 2.395 -2.29 0.7715 -0.804 -1.059 -1.775 -3.105 -3.33 4.56 0.6113 5.08 2.422 0.916 ] 31.766118911034955 31.75844259074065 -------------------------------------------------------------------------------- checking O_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.02884192890767008 x = [ 0.0815 -0.2202 0.1494 ... -0.273 -0.319 -0.06494] y = [ 0.0826 -0.2195 0.15 ... -0.272 -0.3162 -0.0663] 29.28342458005214 29.2897007023768 -------------------------------------------------------------------------------- checking V_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.09806513547897339 x = [ 1.547 0.284 0.8706 0.569 0.1329 0.679 -0.503 0.3162 -0.09015 -0.3525 0.766 -0.2966 0.5386 -0.2454 -0.6523 -0.02632 -0.404 0.9854 -0.3232 -0.7524 -0.4045 -0.9297 0.664 0.774 -0.2393 -0.4465 -0.5254 0.2288 -0.2167 1.237 -0.1667 1.166 -0.1703 -0.3704 0.6475 -0.1927 0.3538 -0.99 1.61 0.308 -0.1315 -0.003256 0.2062 0.2421 -0.6816 -0.4382 0.02731 -0.6084 0.03574 -0.2834 -0.4495 0.2827 -0.3726 -0.897 0.5405 -0.01904 0.6167 0.4463 -0.4812 0.2393 0.03403 -0.3967 -1.865 -0.516 -0.903 0.307 0.353 -0.3635 -0.9546 0.3518 0.8286 0.746 0.4832 -0.1301 1.227 0.1492 -1.227 0.02693 0.1776 0.874 0.3367 0.439 0.8813 0.6445 0.3723 -0.931 -0.03018 0.06665 0.07043 -0.567 -0.594 -0.342 -0.292 -0.05756 -0.2861 0.575 0.4324 0.2189 -0.656 0.7383 -0.3833 -0.9033 -0.07916 0.4026 -0.5464 -1.449 0.5405 -0.562 1.226 -0.9434 -0.3354 0.1575 -0.7344 -0.486 -0.9985 0.2734 0.381 -0.7837 0.559 -0.8994 0.1 0.1702 -0.54 0.3782 -0.08344 -0.139 0.669 -0.06744 0.173 -1.47 -1.038 0.1422 -0.599 -0.07086 -0.2898 -0.323 -0.3179 0.2544 0.04645 0.282 -0.456 -1.835 -0.736 -0.02435 -0.2678 -0.867 0.3591 0.2068 -0.226 -0.02887 0.7554 -0.01395 0.5596 0.0955 -0.3696 0.1487 0.1368 -1.032 -0.1633 -0.9478 ] y = [ 1.5488e+00 2.8589e-01 8.7109e-01 5.6836e-01 1.3220e-01 6.7920e-01 -5.0732e-01 3.1641e-01 -8.8501e-02 -3.4985e-01 7.6416e-01 -2.9517e-01 5.3564e-01 -2.4512e-01 -6.5234e-01 -2.7161e-02 -4.0479e-01 9.8486e-01 -3.2251e-01 -7.5195e-01 -3.9990e-01 -9.2822e-01 6.6699e-01 7.7148e-01 -2.4109e-01 -4.4312e-01 -5.2637e-01 2.2864e-01 -2.1375e-01 1.2412e+00 -1.6614e-01 1.1631e+00 -1.7065e-01 -3.7109e-01 6.4600e-01 -1.9189e-01 3.5303e-01 -9.8926e-01 1.6123e+00 3.0713e-01 -1.3074e-01 -1.2684e-03 2.0776e-01 2.4280e-01 -6.8311e-01 -4.3970e-01 2.7679e-02 -6.0840e-01 3.3386e-02 -2.8369e-01 -4.4678e-01 2.8418e-01 -3.7256e-01 -8.9844e-01 5.4053e-01 -1.5884e-02 6.1670e-01 4.4434e-01 -4.8169e-01 2.4109e-01 3.5004e-02 -3.9722e-01 -1.8643e+00 -5.1660e-01 -9.0234e-01 3.0493e-01 3.5352e-01 -3.6377e-01 -9.5361e-01 3.5303e-01 8.3008e-01 7.4951e-01 4.8145e-01 -1.2952e-01 1.2256e+00 1.4868e-01 -1.2266e+00 2.7679e-02 1.7615e-01 8.7598e-01 3.3765e-01 4.3726e-01 8.8135e-01 6.4600e-01 3.7329e-01 -9.2822e-01 -3.0563e-02 7.0251e-02 6.9458e-02 -5.7031e-01 -5.9473e-01 -3.4155e-01 -2.9468e-01 -5.7617e-02 -2.8394e-01 5.7422e-01 4.3408e-01 2.1777e-01 -6.5625e-01 
7.3193e-01 -3.8428e-01 -9.0186e-01 -8.0811e-02 4.0308e-01 -5.4834e-01 -1.4482e+00 5.4004e-01 -5.6152e-01 1.2246e+00 -9.4141e-01 -3.3594e-01 1.5698e-01 -7.3633e-01 -4.8633e-01 -9.9756e-01 2.7319e-01 3.8037e-01 -7.8076e-01 5.5957e-01 -8.9844e-01 9.8877e-02 1.6931e-01 -5.3809e-01 3.7695e-01 -8.3191e-02 -1.4062e-01 6.6650e-01 -6.8604e-02 1.7383e-01 -1.4717e+00 -1.0391e+00 1.4600e-01 -6.0010e-01 -6.9153e-02 -2.8833e-01 -3.2520e-01 -3.1714e-01 2.5171e-01 4.5441e-02 2.8027e-01 -4.5605e-01 -1.8330e+00 -7.3730e-01 -2.2644e-02 -2.6685e-01 -8.6865e-01 3.6060e-01 2.0679e-01 -2.2522e-01 -2.9907e-02 7.5391e-01 -1.3016e-02 5.5811e-01 9.5154e-02 -3.6914e-01 1.4917e-01 1.3696e-01 -1.0332e+00 -1.6382e-01 -9.4971e-01] 7.8439417861507925 7.842749161825288 -------------------------------------------------------------------------------- checking V_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.028768823503050957 x = [-0.1787 0.2095 -0.01067 ... -0.4077 -0.03275 0.05927] y = [-0.1788 0.2135 -0.011536 ... -0.4124 -0.03152 0.05814 ] 29.431354173218313 29.44953311853899 -------------------------------------------------------------------------------- checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.00465886491886522 x = [-0.0122 0.01778 0.03528 ... 0.002499 -0.0001941 0.0277 ] y = [-0.01218 0.01776 0.0353 ... 0.002495 -0.0002124 0.0277 ] 33.43474253892601 33.43494272705062 -------------------------------------------------------------------------------- checking norm_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4153206157684326 x = [-1.022 -2.594 -0.859 1.83 0.513 1.178 -1.145 -3.969 -3.602 -2.902 -1.676 2.717 1.059 -3.547 2.568 -1.51 3.64 -2.46 1.162 -0.2542 0.3718 3.404 -2.018 -0.238 2.32 3.807 -2.18 6.29 5.668 1.82 -4.117 -0.8423 -2.717 -3.432 -1.114 -3.314 1.323 -3.64 2.824 1.064 -1.719 3.85 3.393 2.684 -4.69 -1.546 -0.57 4.11 0.4685 -0.5767 1.631 -2.66 0.782 1.207 -1.083 -2.414 -0.6094 -0.4573 -1.591 -1.864 2.348 0.635 0.588 3.266 0.00924 -3.766 -0.812 -0.5356 -1.742 4.676 -0.831 -0.8145 1.204 -2.244 0.8994 -0.7974 -0.5713 1.3 2.018 3.852 0.713 -2.582 -4.324 1.975 1.333 0.11804 3.162 -4.387 2.418 -3.857 0.3206 -1.478 -1.319 -0.7466 -1.59 -0.755 3.623 2.414 -2.496 3.348 1.366 0.841 1.061 -2.787 0.7007 0.7324 -0.1989 0.8667 5.664 -2.264 2.645 -0.9297 -0.4504 -2.994 -0.12366 -0.03906 1.75 3.877 0.102 3.102 -2.086 1.206 -0.3342 -0.2009 -0.493 -0.529 -0.937 0.2069 0.9175 -5.77 -5.785 4.88 -1.914 -1.596 -2.73 5.02 -0.273 0.568 4.555 -2.926 -5.664 -4.445 1.391 1.402 -3.707 2.47 2.273 2.24 -2.324 0.4841 -0.9067 -1.047 -1.46 -3.14 -3.244 4.645 0.517 5.145 2.762 1.087 ] y = [-1.024 -2.586 -0.8506 1.827 0.5205 1.179 -1.1455 -3.973 -3.604 -2.91 -1.67 2.713 1.062 -3.537 2.564 -1.513 3.646 -2.46 1.161 -0.2534 0.3616 3.404 -2.021 -0.2307 2.326 3.812 -2.176 6.293 5.664 1.82 -4.117 -0.838 -2.715 -3.432 -1.112 -3.322 1.318 -3.635 2.826 1.065 -1.714 3.844 3.387 2.684 -4.69 -1.547 -0.5645 4.105 0.4702 -0.5806 1.627 -2.652 0.7876 1.213 -1.095 -2.41 -0.609 -0.4607 -1.592 -1.857 2.35 0.633 0.584 3.268 0.01009 -3.762 -0.808 -0.5244 -1.74 4.67 -0.8267 -0.816 1.212 -2.246 0.904 -0.7974 -0.5796 1.291 2.018 3.854 0.716 -2.592 -4.324 1.978 1.335 0.1267 3.162 -4.383 2.42 -3.852 0.3298 -1.476 -1.319 -0.7495 -1.588 -0.758 3.633 2.416 -2.494 3.344 1.367 0.8423 1.065 -2.791 0.707 0.7256 -0.192 0.8677 5.668 -2.264 2.65 -0.9277 
-0.4521 -3.002 -0.1217 -0.04355 1.739 3.871 0.1028 3.094 -2.09 1.194 -0.3293 -0.2135 -0.4902 -0.529 -0.933 0.2102 0.9185 -5.77 -5.793 4.883 -1.917 -1.597 -2.729 5.016 -0.2812 0.5693 4.543 -2.926 -5.656 -4.434 1.388 1.399 -3.707 2.475 2.273 2.23 -2.318 0.4802 -0.91 -1.047 -1.457 -3.139 -3.244 4.645 0.5137 5.145 2.764 1.083 ] 32.34925167746383 32.342609056165486 -------------------------------------------------------------------------------- checking norm_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.41616420745849614 x = [-2.383 4. 3.467 -1.686 0.01889 1.986 -3.732 -1.214 -3.41 -0.9614 -1.724 0.6797 -5.36 1.288 -3.389 1.08 -1.881 -1.085 3.316 -4.65 0.4348 0.612 -0.2137 -3.082 -1.485 6.113 -3.096 4.535 5.5 1.395 -4.047 1.909 -0.0644 -1.95 -3.777 1.118 -0.09094 -0.0972 -0.422 -0.795 2.791 3.287 -3.27 0.8164 -2.588 0.9814 -0.6606 -1.303 -0.535 0.2406 -0.01457 0.148 -4.918 -0.4172 -2.025 0.2244 -0.571 -0.646 3.47 3.53 -2.658 -0.2502 -2.105 2.559 3.857 -2.402 0.8057 4.87 -2.35 1.263 0.3845 0.01143 3.547 2.766 5.207 0.10394 3.236 -1.931 1.953 1.668 -1.92 -2.26 3.58 1.108 1.655 -0.2976 -1.881 -4.574 1.826 0.685 2. -2.53 -0.4172 1.8955 -0.4846 1.94 -2.275 1.961 1.621 2.275 -1.909 -0.8696 0.79 4.21 -0.4656 2.934 2.016 9.76 -3.215 -2.525 -1.512 -2.686 -0.524 -3.156 2.023 -1.303 0.3333 -2.107 -2.182 2.26 4.043 -2.846 -0.6777 -4.387 -3.566 -1.408 1.991 0.833 -0.3765 4.47 2.604 -0.04953 -0.06088 4.527 0.3142 -0.7856 -3.166 1.867 -1.109 -2.56 -0.2563 -1.408 2.477 0.6973 3.637 1.369 1.628 -2.11 -4.645 0.9663 -0.0736 -3.553 -0.512 -1.485 -3.482 0.774 -2.639 -3.672 -3.318 4.21 ] y = [-2.3828e+00 3.9863e+00 3.4590e+00 -1.6797e+00 1.4076e-02 1.9775e+00 -3.7363e+00 -1.2148e+00 -3.3887e+00 -9.5410e-01 -1.7236e+00 6.7578e-01 -5.3555e+00 1.2783e+00 -3.3906e+00 1.0859e+00 -1.8848e+00 -1.0889e+00 3.3281e+00 -4.6406e+00 4.3359e-01 6.1426e-01 -2.0630e-01 -3.0840e+00 -1.4863e+00 6.1094e+00 -3.0820e+00 4.5430e+00 5.4961e+00 1.3828e+00 -4.0469e+00 1.9141e+00 -6.6345e-02 -1.9502e+00 -3.7891e+00 1.1182e+00 -9.2590e-02 -9.8694e-02 -4.1211e-01 -7.9688e-01 2.7852e+00 3.2969e+00 -3.2832e+00 8.2129e-01 -2.5781e+00 9.7559e-01 -6.6992e-01 -1.3154e+00 -5.3223e-01 2.4377e-01 -7.5111e-03 1.4954e-01 -4.9141e+00 -4.2529e-01 -2.0312e+00 2.2522e-01 -5.6396e-01 -6.5186e-01 3.4688e+00 3.5312e+00 -2.6621e+00 -2.5000e-01 -2.1074e+00 2.5605e+00 3.8770e+00 -2.4102e+00 8.1250e-01 4.8906e+00 -2.3398e+00 1.2617e+00 3.7769e-01 2.3270e-02 3.5332e+00 2.7695e+00 5.2188e+00 1.0004e-01 3.2324e+00 -1.9365e+00 1.9541e+00 1.6592e+00 -1.9170e+00 -2.2539e+00 3.5781e+00 1.1182e+00 1.6348e+00 -2.9150e-01 -1.8740e+00 -4.5898e+00 1.8213e+00 6.7920e-01 1.9961e+00 -2.5215e+00 -4.1406e-01 1.8965e+00 -4.7021e-01 1.9395e+00 -2.2852e+00 1.9609e+00 1.6260e+00 2.2676e+00 -1.9141e+00 -8.7646e-01 7.8271e-01 4.2070e+00 -4.6338e-01 2.9434e+00 2.0098e+00 9.7656e+00 -3.2227e+00 -2.5137e+00 -1.5342e+00 -2.6836e+00 -5.1074e-01 -3.1504e+00 2.0312e+00 -1.3027e+00 3.3179e-01 -2.1172e+00 -2.1855e+00 2.2598e+00 4.0352e+00 -2.8418e+00 -6.6504e-01 -4.3828e+00 -3.5566e+00 -1.4014e+00 1.9932e+00 8.2324e-01 -3.6963e-01 4.4609e+00 2.5977e+00 -4.9835e-02 -5.4138e-02 4.5352e+00 3.2178e-01 -7.8857e-01 -3.1582e+00 1.8672e+00 -1.0996e+00 -2.5488e+00 -2.7075e-01 -1.4170e+00 2.4688e+00 6.9629e-01 3.6289e+00 1.3682e+00 1.6396e+00 -2.1230e+00 -4.6406e+00 9.6387e-01 -6.9824e-02 -3.5371e+00 -5.0586e-01 -1.4766e+00 -3.4766e+00 7.7197e-01 -2.6465e+00 -3.6777e+00 -3.3320e+00 4.2148e+00] 32.81144996619825 32.810452762332424 
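[editor's note] The repeated pattern above — "checking <name>", an empty int64 index tensor, "tolerance is ...", the x and y arrays, and two trailing scalars that look like the norms of x and y — suggests an element-wise comparison of a baseline tensor (x) against the kernel output (y) under a per-tensor tolerance. The sketch below is a minimal, hypothetical reconstruction of such a check, not the actual DeepSpeed test code: the helper name `check_close`, the fixed `atol`, and the printed layout are all assumptions made for illustration.

```python
# Minimal sketch (assumption): a tolerance-based tensor comparison producing
# output similar in spirit to the "checking <name>" blocks in this log.
# The name `check_close`, the tolerance value, and the print layout are
# illustrative only and do not come from the DeepSpeed test suite.
import torch

def check_close(name, x, y, atol=5e-2):
    x = x.float().flatten()
    y = y.float().flatten()
    # Indices where the two tensors disagree beyond the tolerance.
    # An empty int64 tensor (as seen throughout the log) means all
    # elements matched within `atol`.
    mismatch = torch.nonzero(~torch.isclose(x, y, atol=atol, rtol=0.0)).flatten()
    print(f"checking {name} :", mismatch)
    print("tolerance is", atol)
    print("x =", x.cpu().numpy())
    print("y =", y.cpu().numpy())
    # The two trailing scalars in each block appear to be the norms of x and y.
    print(float(x.norm()), float(y.norm()))
    return mismatch.numel() == 0

if __name__ == "__main__":
    a = torch.randn(160)
    b = a + 0.01 * torch.randn(160)  # small perturbation, should pass
    assert check_close("out_B", a, b)
```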
-------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4076111698150635 x = [-0.99 -2.55 -0.845 1.821 0.4822 1.144 -1.128 -3.895 -3.549 -2.848 -1.649 2.645 1.029 -3.482 2.52 -1.482 3.566 -2.422 1.141 -0.2311 0.3533 3.338 -1.99 -0.2344 2.271 3.746 -2.16 6.168 5.547 1.785 -4.047 -0.825 -2.674 -3.357 -1.096 -3.242 1.312 -3.586 2.77 1.033 -1.69 3.77 3.35 2.629 -4.62 -1.529 -0.57 4.016 0.4512 -0.5776 1.595 -2.605 0.756 1.17 -1.074 -2.375 -0.6016 -0.4382 -1.56 -1.823 2.297 0.6143 0.569 3.207 0.00785 -3.684 -0.792 -0.543 -1.701 4.582 -0.817 -0.799 1.203 -2.195 0.891 -0.776 -0.5566 1.287 1.975 3.777 0.7085 -2.525 -4.246 1.935 1.29 0.11914 3.088 -4.31 2.373 -3.797 0.2969 -1.459 -1.298 -0.7246 -1.57 -0.741 3.553 2.357 -2.443 3.297 1.336 0.8184 1.031 -2.746 0.6895 0.7183 -0.2024 0.836 5.562 -2.227 2.588 -0.8926 -0.4475 -2.934 -0.1381 -0.04007 1.72 3.795 0.0979 3.05 -2.062 1.185 -0.3237 -0.2087 -0.4692 -0.524 -0.9155 0.2102 0.898 -5.652 -5.688 4.81 -1.883 -1.563 -2.686 4.926 -0.2737 0.5405 4.46 -2.867 -5.57 -4.363 1.356 1.376 -3.646 2.414 2.217 2.205 -2.285 0.4836 -0.902 -1.016 -1.443 -3.11 -3.193 4.58 0.5083 5.07 2.713 1.064 ] y = [-0.994 -2.545 -0.84 1.819 0.4849 1.145 -1.13 -3.9 -3.555 -2.855 -1.644 2.64 1.034 -3.475 2.516 -1.485 3.574 -2.422 1.141 -0.2311 0.343 3.34 -1.991 -0.2281 2.277 3.75 -2.158 6.17 5.543 1.784 -4.047 -0.823 -2.674 -3.361 -1.093 -3.25 1.305 -3.582 2.771 1.031 -1.687 3.768 3.342 2.63 -4.62 -1.532 -0.565 4.016 0.4556 -0.584 1.592 -2.598 0.7627 1.176 -1.085 -2.373 -0.601 -0.4421 -1.564 -1.818 2.3 0.6133 0.564 3.207 0.01029 -3.682 -0.7896 -0.534 -1.699 4.582 -0.8135 -0.801 1.21 -2.2 0.894 -0.776 -0.566 1.277 1.973 3.777 0.7085 -2.535 -4.246 1.934 1.295 0.1301 3.084 -4.31 2.373 -3.795 0.3057 -1.458 -1.296 -0.729 -1.568 -0.7476 3.562 2.357 -2.441 3.295 1.338 0.819 1.036 -2.75 0.6943 0.7124 -0.1936 0.8384 5.566 -2.227 2.594 -0.893 -0.449 -2.941 -0.1357 -0.04428 1.709 3.79 0.0962 3.043 -2.068 1.173 -0.319 -0.2225 -0.466 -0.522 -0.9136 0.2085 0.898 -5.656 -5.7 4.81 -1.889 -1.568 -2.684 4.918 -0.2793 0.5464 4.453 -2.871 -5.57 -4.355 1.353 1.373 -3.646 2.42 2.219 2.195 -2.281 0.4812 -0.903 -1.017 -1.441 -3.104 -3.195 4.582 0.5044 5.074 2.717 1.061 ] 31.758262255291015 31.761453330024494 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05387182788876817 x = [-0.0114 -0.0846 0.02495 ... 0.1688 0.11523 -0.1267 ] y = [-0.00824 -0.08923 0.02412 ... 
0.1682 0.1188 -0.12354] 54.515073836378676 54.50788491650227 -------------------------------------------------------------------------------- checking int_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.0593267971277237 x = [ 1.4600e-01 -2.6904e-01 2.1936e-01 1.6296e-01 2.0435e-01 -7.5732e-01 -4.4220e-02 -8.0299e-04 -7.4512e-01 7.1533e-01 -4.0015e-01 1.1127e-01 -1.3818e-01 -1.1829e-01 3.5620e-01 -1.1713e-01 -3.4570e-01 -2.2021e-01 -1.1615e-01 -7.4219e-01 9.0723e-01 2.8549e-02 -1.3220e-01 -2.7710e-01 -1.5869e-01 2.3938e-01 -4.4727e-01 -2.3352e-01 5.3955e-01 -1.2720e-01 -2.8174e-01 -1.9885e-01 -1.0840e+00 -3.4204e-01 5.5695e-02 5.1611e-01 2.3010e-01 -2.7246e-01 8.9307e-01 -4.2896e-01 -2.6440e-01 -7.1239e-04 -8.6121e-02 -2.7734e-01 6.1279e-02 2.6904e-01 -1.6431e-01 7.9102e-01 -4.7534e-01 -8.3301e-01 -5.3613e-01 -4.8096e-01 -3.6304e-01 7.9651e-02 -4.5441e-02 -7.7393e-01 -2.6318e-01 -2.5830e-01 -1.5942e-01 1.6443e-01 -8.7402e-02 -3.1396e-01 -2.5879e-01 7.0215e-01 2.7002e-01 2.2742e-01 1.7419e-01 3.4515e-02 -1.5039e-01 -6.3428e-01 -1.9934e-01 8.2275e-01 3.3512e-03 3.8623e-01 -3.9282e-01 2.5220e-01 -5.4199e-01 3.4229e-01 1.8396e-01 -2.7881e-01 2.4231e-01 -1.9250e-01 -2.3474e-01 2.4878e-01 -1.9885e-01 -1.0278e-01 2.0679e-01 3.2324e-01 2.6367e-01 -4.8267e-01 3.4961e-01 -7.7477e-03 7.4341e-02 -2.9639e-01 1.8530e-01 -8.2764e-02 -2.6489e-01 -3.3600e-02 -3.9941e-01 -1.6479e-01 2.1777e-01 -1.1023e-01 2.3035e-01 -4.5215e-01 -2.3267e-01 -5.7422e-01 -1.5833e-01 2.1313e-01 2.7832e-02 -7.5256e-02 -3.8379e-01 -5.6580e-02 -3.8745e-01 -2.2632e-01 6.5332e-01 -4.0210e-01 -3.0469e-01 -1.6931e-01 2.9199e-01 -7.8564e-01 -7.6050e-02 -3.7817e-01 2.0300e-01 2.0972e-01 -4.8737e-02 -5.7275e-01 5.3467e-01 2.7954e-01 -3.0289e-02 -3.0225e-01 -4.3652e-01 4.8975e-01 4.3115e-01 5.5237e-02 4.2334e-01 5.2441e-01 -3.3594e-01 -7.4951e-01 3.8379e-01 -7.7881e-02 -2.3438e-01 -3.2495e-01 6.7932e-02 1.5344e-01 4.3732e-02 -7.8369e-01 -9.3384e-02 1.2262e-01 2.5464e-01 -1.7346e-01 2.5146e-01 -6.4201e-03 4.1723e-04 2.5562e-01 2.7393e-01 1.7004e-01 -3.1921e-02 -2.3181e-01 6.5771e-01 5.5127e-01] y = [ 1.4478e-01 -2.7026e-01 2.2034e-01 1.6296e-01 2.0349e-01 -7.5879e-01 -4.5715e-02 -9.2793e-04 -7.4561e-01 7.1484e-01 -4.0161e-01 1.1115e-01 -1.3855e-01 -1.1902e-01 3.5767e-01 -1.1548e-01 -3.4497e-01 -2.2217e-01 -1.1700e-01 -7.4365e-01 9.0723e-01 2.8778e-02 -1.3196e-01 -2.7539e-01 -1.5771e-01 2.3926e-01 -4.4800e-01 -2.3022e-01 5.4004e-01 -1.2671e-01 -2.8247e-01 -1.9849e-01 -1.0850e+00 -3.4253e-01 5.5328e-02 5.1758e-01 2.2961e-01 -2.7075e-01 8.9453e-01 -4.2749e-01 -2.6489e-01 -1.4031e-04 -8.7036e-02 -2.7808e-01 6.1737e-02 2.6880e-01 -1.6431e-01 7.8955e-01 -4.7485e-01 -8.3154e-01 -5.3564e-01 -4.8169e-01 -3.6206e-01 7.8918e-02 -4.6204e-02 -7.7637e-01 -2.6245e-01 -2.5952e-01 -1.5979e-01 1.6235e-01 -8.7280e-02 -3.1274e-01 -2.5879e-01 7.0264e-01 2.7051e-01 2.2852e-01 1.7395e-01 3.4393e-02 -1.4966e-01 -6.3281e-01 -1.9885e-01 8.2275e-01 3.8528e-03 3.8672e-01 -3.9136e-01 2.5098e-01 -5.4199e-01 3.4204e-01 1.8494e-01 -2.7930e-01 2.4316e-01 -1.9409e-01 -2.3511e-01 2.4890e-01 -1.9922e-01 -1.0022e-01 2.0630e-01 3.2300e-01 2.6343e-01 -4.8291e-01 3.5010e-01 -7.4005e-03 7.5012e-02 -2.9565e-01 1.8701e-01 -8.3130e-02 -2.6611e-01 -3.6102e-02 -3.9941e-01 -1.6296e-01 2.1790e-01 -1.0962e-01 2.3206e-01 -4.5166e-01 -2.3242e-01 -5.7520e-01 -1.6064e-01 2.1179e-01 2.9938e-02 -7.5806e-02 -3.8477e-01 -5.4413e-02 -3.8550e-01 -2.2546e-01 6.5381e-01 -4.0283e-01 -3.0469e-01 -1.6797e-01 2.9175e-01 -7.8662e-01 
-7.5012e-02 -3.7842e-01 2.0300e-01 2.1106e-01 -4.7760e-02 -5.7373e-01 5.3369e-01 2.7930e-01 -3.1830e-02 -3.0225e-01 -4.3701e-01 4.8926e-01 4.3018e-01 5.6488e-02 4.2480e-01 5.2539e-01 -3.3594e-01 -7.4951e-01 3.8501e-01 -7.6904e-02 -2.3438e-01 -3.2422e-01 6.7444e-02 1.5381e-01 4.4098e-02 -7.8369e-01 -9.5154e-02 1.2231e-01 2.5586e-01 -1.7395e-01 2.5146e-01 -7.0724e-03 -1.9956e-04 2.5439e-01 2.7490e-01 1.6907e-01 -3.2349e-02 -2.3291e-01 6.5674e-01 5.4980e-01] 4.693048368111672 4.694186832687191 -------------------------------------------------------------------------------- checking int_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05672340871719644 x = [ 0.809 -0.1863 -0.8345 ... 0.4211 -0.2434 0.542 ] y = [ 0.808 -0.1892 -0.8413 ... 0.42 -0.2396 0.54 ] 57.065366244229594 57.06082171967528 -------------------------------------------------------------------------------- checking N2_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4081954300403595 x = [-9.4482e-01 -2.3516e+00 -7.9346e-01 1.8076e+00 5.1514e-01 1.1455e+00 -1.1660e+00 -4.0195e+00 -3.4551e+00 -2.7793e+00 -1.5537e+00 2.6855e+00 1.0596e+00 -3.4492e+00 2.4219e+00 -1.4785e+00 3.6055e+00 -2.2285e+00 1.1885e+00 -3.9917e-01 4.2944e-01 3.4355e+00 -1.9668e+00 -3.4595e-01 2.3672e+00 3.8477e+00 -1.8545e+00 6.3008e+00 5.5742e+00 1.6846e+00 -4.0391e+00 -9.6631e-01 -2.5430e+00 -3.3438e+00 -1.1006e+00 -3.1523e+00 1.2568e+00 -3.6406e+00 2.6133e+00 1.0918e+00 -1.6592e+00 3.9141e+00 3.3945e+00 2.6816e+00 -4.5625e+00 -1.4453e+00 -5.1709e-01 4.1016e+00 4.1113e-01 -4.1992e-01 1.7500e+00 -2.6230e+00 9.5166e-01 1.2285e+00 -1.0000e+00 -2.3477e+00 -4.6533e-01 -2.8882e-01 -1.4990e+00 -1.6914e+00 2.3691e+00 6.9531e-01 7.0557e-01 3.2930e+00 -2.4377e-01 -3.7012e+00 -8.5742e-01 -3.8354e-01 -1.6895e+00 4.6133e+00 -8.3447e-01 -9.2725e-01 1.1748e+00 -2.1641e+00 9.2627e-01 -7.0361e-01 -4.7803e-01 1.3770e+00 2.1270e+00 3.7832e+00 6.8750e-01 -2.5195e+00 -4.2461e+00 2.0176e+00 1.3281e+00 3.5191e-04 2.9922e+00 -4.1562e+00 2.4277e+00 -3.8926e+00 3.4766e-01 -1.3037e+00 -1.3271e+00 -5.0342e-01 -1.5840e+00 -6.6748e-01 3.6621e+00 2.3164e+00 -2.6289e+00 3.4336e+00 1.2178e+00 8.5400e-01 8.9600e-01 -2.8613e+00 5.9277e-01 8.6133e-01 -1.9995e-01 6.0986e-01 5.4414e+00 -2.2109e+00 2.6992e+00 -9.2676e-01 -4.5703e-01 -2.9199e+00 -5.9692e-02 -1.0582e-02 1.8018e+00 3.7305e+00 1.5540e-01 3.1562e+00 -1.9775e+00 1.2236e+00 -5.0928e-01 -2.0569e-01 -5.5518e-01 -4.2261e-01 -9.8193e-01 2.3096e-01 9.1602e-01 -5.6797e+00 -5.5859e+00 4.7266e+00 -1.8564e+00 -1.5566e+00 -2.8164e+00 4.8828e+00 -2.9492e-01 5.6299e-01 4.5312e+00 -2.9473e+00 -5.6055e+00 -4.5156e+00 1.4326e+00 1.3154e+00 -3.6270e+00 2.4727e+00 2.2070e+00 2.1699e+00 -2.4023e+00 4.9292e-01 -8.8770e-01 -1.1592e+00 -1.4932e+00 -3.1191e+00 -3.0000e+00 4.6445e+00 5.0488e-01 4.9766e+00 2.7539e+00 1.0059e+00] y = [-0.9487 -2.346 -0.789 1.805 0.5186 1.147 -1.168 -4.023 -3.46 -2.787 -1.547 2.682 1.064 -3.443 2.416 -1.482 3.611 -2.229 1.189 -0.4004 0.4197 3.438 -1.968 -0.34 2.371 3.854 -1.853 6.31 5.57 1.686 -4.043 -0.9644 -2.541 -3.348 -1.097 -3.16 1.248 -3.637 2.613 1.09 -1.654 3.91 3.389 2.684 -4.562 -1.448 -0.512 4.1 0.4153 -0.4258 1.747 -2.615 0.9585 1.235 -1.011 -2.344 -0.4644 -0.2922 -1.503 -1.687 2.373 0.695 0.702 3.293 -0.2417 -3.7 -0.856 -0.3765 -1.6875 4.613 -0.8306 -0.9297 1.182 -2.168 0.9297 -0.704 -0.4875 1.368 2.125 3.785 0.6885 -2.527 -4.246 2.018 1.335 0.01073 2.988 -4.156 2.428 -3.89 0.3562 -1.303 
-1.325 -0.507 -1.583 -0.6743 3.672 2.316 -2.627 3.432 1.219 0.8555 0.901 -2.867 0.598 0.8555 -0.1906 0.6123 5.445 -2.21 2.707 -0.9263 -0.459 -2.93 -0.0567 -0.01493 1.793 3.725 0.1544 3.146 -1.982 1.211 -0.5034 -0.2195 -0.5522 -0.4202 -0.9795 0.229 0.9146 -5.68 -5.594 4.727 -1.861 -1.562 -2.814 4.87 -0.2998 0.57 4.523 -2.951 -5.6 -4.508 1.43 1.3125 -3.627 2.48 2.207 2.156 -2.398 0.4907 -0.8887 -1.16 -1.491 -3.113 -3.002 4.65 0.5005 4.98 2.758 1.003 ] 31.820021901496197 31.821109066958147 -------------------------------------------------------------------------------- checking N2_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.414099235534668 x = [-2.352 3.752 3.41 -1.619 0.1747 2.104 -3.61 -1.052 -3.463 -0.922 -1.726 0.62 -5.367 1.072 -3.166 1.211 -1.96 -1.0625 3.154 -4.66 0.448 0.84 -0.0999 -2.94 -1.6455 6.105 -2.99 4.258 5.26 1.374 -4.055 1.824 -0.11176 -1.866 -3.775 1.285 -0.1925 -0.1384 -0.6284 -1.001 2.912 3.307 -3.32 0.8584 -2.49 0.9956 -0.7773 -1.516 -0.568 0.3015 0.10913 0.2267 -4.973 -0.512 -1.8545 0.2825 -0.668 -0.712 3.355 3.635 -2.266 0.0631 -2.236 2.412 4.062 -2.377 0.744 4.707 -2.271 1.223 0.3496 -0.04385 3.531 2.795 5.2 -0.010254 3.254 -1.81 1.82 1.572 -2.045 -2.54 3.469 1.146 1.703 -0.4585 -1.824 -4.453 1.833 0.6475 2.004 -2.385 -0.5156 1.83 -0.506 2.242 -2.422 1.848 1.573 2.283 -1.89 -0.8257 0.81 4.266 -0.3967 2.951 2.115 9.56 -3.334 -2.348 -1.415 -2.738 -0.51 -2.879 1.991 -1.482 0.2302 -1.974 -2.223 2.121 3.973 -2.822 -0.6943 -4.37 -3.457 -1.298 2.096 0.8184 -0.6187 4.54 2.438 -0.09436 -0.03705 4.555 0.2231 -0.6743 -3.152 2.059 -0.946 -2.422 -0.2578 -1.52 2.678 0.548 3.545 1.315 1.537 -2.174 -4.69 1.103 -0.1678 -3.701 -0.498 -1.625 -3.102 0.705 -2.748 -3.672 -3.146 4.055 ] y = [-2.354 3.74 3.406 -1.61 0.1691 2.094 -3.611 -1.05 -3.441 -0.914 -1.726 0.6133 -5.367 1.064 -3.164 1.213 -1.963 -1.063 3.168 -4.656 0.4478 0.8467 -0.0926 -2.943 -1.646 6.1 -2.975 4.266 5.258 1.364 -4.055 1.826 -0.1107 -1.864 -3.79 1.284 -0.1918 -0.1372 -0.623 -1. 
2.906 3.316 -3.332 0.86 -2.484 0.9917 -0.7866 -1.524 -0.564 0.3044 0.1141 0.229 -4.97 -0.521 -1.863 0.283 -0.662 -0.716 3.354 3.635 -2.27 0.0624 -2.232 2.416 4.08 -2.385 0.745 4.727 -2.262 1.219 0.3396 -0.03235 3.518 2.8 5.21 -0.01775 3.248 -1.815 1.82 1.562 -2.043 -2.535 3.47 1.157 1.682 -0.453 -1.817 -4.465 1.827 0.6426 1.999 -2.373 -0.511 1.828 -0.4946 2.238 -2.432 1.85 1.577 2.28 -1.893 -0.8335 0.8047 4.266 -0.399 2.965 2.111 9.58 -3.344 -2.34 -1.44 -2.736 -0.497 -2.873 1.997 -1.483 0.2267 -1.987 -2.227 2.117 3.969 -2.82 -0.6797 -4.37 -3.45 -1.29 2.096 0.807 -0.613 4.535 2.432 -0.0909 -0.03284 4.562 0.2329 -0.6763 -3.145 2.06 -0.9395 -2.408 -0.271 -1.528 2.668 0.5464 3.533 1.313 1.547 -2.188 -4.684 1.101 -0.1674 -3.686 -0.4907 -1.618 -3.096 0.705 -2.756 -3.678 -3.158 4.055 ] 32.48386981779379 32.48606236297071 -------------------------------------------------------------------------------- checking O_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.40739505767822265 x = [-0.9644 -2.37 -0.8105 1.788 0.502 1.131 -1.181 -4.035 -3.47 -2.791 -1.566 2.654 1.033 -3.459 2.387 -1.5 3.57 -2.242 1.172 -0.4211 0.4092 3.41 -1.988 -0.359 2.336 3.807 -1.868 6.258 5.543 1.664 -4.05 -0.9863 -2.557 -3.36 -1.108 -3.16 1.238 -3.65 2.59 1.072 -1.673 3.89 3.361 2.654 -4.562 -1.464 -0.5386 4.074 0.3896 -0.4438 1.74 -2.645 0.9395 1.21 -1.012 -2.36 -0.4775 -0.3086 -1.508 -1.703 2.344 0.671 0.6816 3.271 -0.2642 -3.709 -0.8687 -0.3982 -1.705 4.59 -0.852 -0.9443 1.145 -2.174 0.9 -0.7256 -0.502 1.3545 2.098 3.748 0.6694 -2.533 -4.246 1.987 1.309 -0.01651 2.959 -4.168 2.393 -3.902 0.3357 -1.32 -1.339 -0.528 -1.603 -0.686 3.635 2.293 -2.65 3.408 1.198 0.833 0.8755 -2.879 0.5757 0.8423 -0.2169 0.5923 5.406 -2.229 2.664 -0.9473 -0.4773 -2.943 -0.0863 -0.03137 1.776 3.7 0.1365 3.133 -1.982 1.192 -0.5244 -0.22 -0.579 -0.4375 -0.9937 0.2117 0.896 -5.684 -5.59 4.695 -1.871 -1.567 -2.826 4.84 -0.312 0.5415 4.5 -2.955 -5.61 -4.52 1.416 1.297 -3.629 2.453 2.18 2.143 -2.404 0.475 -0.8994 -1.184 -1.509 -3.13 -3. 
4.63 0.489 4.94 2.729 0.977 ] y = [-9.692e-01 -2.361e+00 -8.047e-01 1.784e+00 5.049e-01 1.131e+00 -1.186e+00 -4.039e+00 -3.477e+00 -2.797e+00 -1.559e+00 2.650e+00 1.037e+00 -3.451e+00 2.383e+00 -1.501e+00 3.576e+00 -2.242e+00 1.172e+00 -4.182e-01 4.001e-01 3.412e+00 -1.991e+00 -3.525e-01 2.340e+00 3.812e+00 -1.863e+00 6.262e+00 5.543e+00 1.665e+00 -4.051e+00 -9.824e-01 -2.557e+00 -3.357e+00 -1.105e+00 -3.168e+00 1.230e+00 -3.645e+00 2.590e+00 1.068e+00 -1.668e+00 3.887e+00 3.352e+00 2.654e+00 -4.562e+00 -1.467e+00 -5.308e-01 4.070e+00 3.955e-01 -4.495e-01 1.736e+00 -2.633e+00 9.458e-01 1.215e+00 -1.022e+00 -2.355e+00 -4.766e-01 -3.127e-01 -1.509e+00 -1.700e+00 2.346e+00 6.719e-01 6.807e-01 3.270e+00 -2.610e-01 -3.703e+00 -8.687e-01 -3.901e-01 -1.706e+00 4.582e+00 -8.472e-01 -9.458e-01 1.152e+00 -2.176e+00 9.028e-01 -7.275e-01 -5.107e-01 1.345e+00 2.096e+00 3.750e+00 6.704e-01 -2.541e+00 -4.246e+00 1.988e+00 1.313e+00 -4.562e-03 2.955e+00 -4.164e+00 2.396e+00 -3.898e+00 3.438e-01 -1.316e+00 -1.337e+00 -5.327e-01 -1.601e+00 -6.929e-01 3.643e+00 2.295e+00 -2.650e+00 3.406e+00 1.197e+00 8.354e-01 8.799e-01 -2.881e+00 5.811e-01 8.369e-01 -2.061e-01 5.942e-01 5.414e+00 -2.229e+00 2.668e+00 -9.482e-01 -4.810e-01 -2.949e+00 -8.514e-02 -3.537e-02 1.766e+00 3.691e+00 1.364e-01 3.123e+00 -1.986e+00 1.181e+00 -5.195e-01 -2.329e-01 -5.771e-01 -4.341e-01 -9.902e-01 2.125e-01 8.945e-01 -5.680e+00 -5.598e+00 4.699e+00 -1.874e+00 -1.572e+00 -2.820e+00 4.832e+00 -3.174e-01 5.449e-01 4.492e+00 -2.957e+00 -5.605e+00 -4.508e+00 1.415e+00 1.297e+00 -3.629e+00 2.459e+00 2.182e+00 2.133e+00 -2.402e+00 4.758e-01 -9.028e-01 -1.184e+00 -1.507e+00 -3.125e+00 -3.000e+00 4.629e+00 4.841e-01 4.941e+00 2.732e+00 9.727e-01] 31.73298875258407 31.72575066271583 -------------------------------------------------------------------------------- checking O_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.027653335798997432 x = [ 0.0861 -0.2449 0.1599 ... -0.2551 -0.3354 -0.0451] y = [ 0.0859 -0.2426 0.1594 ... 
-0.2527 -0.3284 -0.04343] 28.059971487544296 28.064574450048895 -------------------------------------------------------------------------------- checking V_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.09884565591812133 x = [ 1.514 0.237 0.968 0.501 0.0782 0.692 -0.4612 0.408 -0.10657 -0.3948 0.816 -0.315 0.545 -0.2148 -0.663 0.01686 -0.514 0.952 -0.3718 -0.74 -0.3325 -0.8696 0.6533 0.748 -0.1526 -0.431 -0.5767 0.2427 -0.1475 1.269 -0.16 1.11 -0.1119 -0.3843 0.6475 -0.1759 0.3604 -1.01 1.643 0.2244 -0.1322 0.0688 0.262 0.3208 -0.702 -0.4556 0.00636 -0.596 0.04663 -0.3245 -0.4465 0.2307 -0.3635 -0.9653 0.5386 0.0494 0.601 0.4197 -0.4966 0.27 0.11835 -0.4524 -1.916 -0.55 -0.92 0.3115 0.3882 -0.3442 -0.992 0.354 0.904 0.676 0.452 -0.08496 1.263 0.1987 -1.265 0.05615 0.1632 0.909 0.4016 0.5625 0.905 0.702 0.3691 -0.8677 0.00992 0.0927 0.0659 -0.6055 -0.6187 -0.3477 -0.3018 0.0396 -0.3848 0.5796 0.4019 0.2573 -0.6187 0.7227 -0.297 -0.9023 -0.1199 0.459 -0.6016 -1.351 0.4841 -0.607 1.281 -0.924 -0.3247 0.02698 -0.743 -0.3882 -0.968 0.2454 0.328 -0.7896 0.51 -0.919 0.1152 0.137 -0.5347 0.4485 -0.107 -0.154 0.6694 -0.1311 0.1302 -1.478 -1.002 0.1993 -0.624 -0.1003 -0.3003 -0.3208 -0.2732 0.1265 0.1581 0.2585 -0.406 -1.992 -0.8193 -0.01372 -0.2103 -0.7554 0.4053 0.1871 -0.2335 0.02785 0.7964 -0.002157 0.573 0.0702 -0.3538 0.06476 0.2778 -1.019 -0.0927 -1.013 ] y = [ 1.514 0.2369 0.967 0.5005 0.0766 0.691 -0.4658 0.4094 -0.10785 -0.3933 0.815 -0.3152 0.5444 -0.2175 -0.6655 0.0136 -0.512 0.949 -0.3696 -0.7407 -0.3298 -0.8657 0.6562 0.749 -0.1526 -0.4275 -0.5776 0.24 -0.1482 1.271 -0.1582 1.107 -0.1119 -0.386 0.6455 -0.175 0.3591 -1.011 1.643 0.2233 -0.1337 0.0696 0.2625 0.321 -0.704 -0.4563 0.005714 -0.5986 0.0436 -0.3247 -0.4429 0.2317 -0.3613 -0.964 0.5386 0.05066 0.601 0.421 -0.4983 0.2717 0.11774 -0.4514 -1.916 -0.5493 -0.921 0.3108 0.389 -0.3433 -0.9917 0.3516 0.9062 0.6787 0.4521 -0.08545 1.265 0.1997 -1.263 0.05557 0.1636 0.9087 0.4036 0.565 0.9033 0.702 0.3699 -0.8687 0.01096 0.09576 0.0636 -0.604 -0.62 -0.3516 -0.3018 0.03928 -0.3828 0.579 0.405 0.2566 -0.622 0.7183 -0.299 -0.901 -0.12177 0.4592 -0.6006 -1.349 0.4846 -0.6045 1.278 -0.9233 -0.3254 0.02805 -0.74 -0.3882 -0.967 0.2482 0.3257 -0.788 0.509 -0.919 0.11395 0.1384 -0.533 0.448 -0.1067 -0.1532 0.6685 -0.134 0.1311 -1.477 -1.003 0.2052 -0.6274 -0.0991 -0.2979 -0.3235 -0.2737 0.1254 0.1581 0.2578 -0.4048 -1.992 -0.821 -0.01201 -0.2092 -0.7583 0.406 0.1885 -0.2325 0.02519 0.795 -0.003475 0.5723 0.07007 -0.3533 0.0647 0.2773 -1.019 -0.0925 -1.014 ] 7.938319997033882 7.937124810177468 -------------------------------------------------------------------------------- checking V_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.02764177138451487 x = [-0.142 0.2075 -0.02162 ... -0.4077 -0.04352 0.0456 ] y = [-0.1418 0.2039 -0.02083 ... -0.408 -0.04297 0.0446 ] 28.288297762877264 28.305589978205457 -------------------------------------------------------------------------------- checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.004581627779243718 x = [-0.011734 0.01884 0.03503 ... 0.002705 0.0001593 0.02634 ] y = [-0.0117 0.01883 0.03506 ... 
0.002707 0.0001407 0.02634 ] 32.88236819140935 32.882753651699524 -------------------------------------------------------------------------------- checking norm_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4154906463623047 x = [-0.867 -2.11 -0.834 1.793 0.6235 1.206 -1.542 -4.152 -3.398 -2.967 -1.682 2.934 0.6655 -3.549 2.725 -1.744 3.494 -2.414 0.9033 -0.503 0.451 3.654 -1.802 -0.5293 2.51 4.176 -1.803 6.17 5.434 1.592 -3.977 -1.243 -2.154 -3.318 -1.002 -3.406 1.346 -3.64 2.684 1.073 -1.703 3.918 3.5 2.604 -4.984 -1.61 -0.46 3.912 0.5635 -0.2242 1.787 -2.678 1. 1.066 -1.005 -2.555 -0.4111 -0.5723 -1.672 -1.4795 2.387 0.481 0.4329 3.482 -0.2493 -3.828 -0.8735 -0.3389 -1.496 4.785 -0.9917 -0.7993 1.044 -2.107 0.7847 -0.774 -0.587 1.348 2.014 3.824 0.739 -2.787 -4.41 2.074 1.516 0.1842 3.387 -4.316 2.547 -4.3 0.4421 -1.487 -1.563 -0.709 -1.638 -0.914 3.893 2.244 -2.67 3.082 1.031 0.992 0.4697 -2.746 0.5693 0.8667 -0.2314 0.8125 5.6 -2.21 2.873 -1.032 -0.5903 -2.904 -0.0647 -0.1895 1.783 3.78 -0.05786 3.35 -1.889 1.19 -0.6426 -0.4404 -0.0881 -0.544 -0.8726 0.2324 0.6113 -5.81 -5.543 4.95 -1.992 -1.47 -2.736 5.414 -0.2515 0.529 4.598 -3.227 -5.69 -4.31 1.34 1.519 -3.615 2.67 2.309 1.981 -2.44 0.1946 -1.007 -1.184 -1.213 -3.145 -2.918 4.707 0.3923 5.03 3.092 1.146 ] y = [-0.8716 -2.104 -0.8267 1.789 0.626 1.207 -1.548 -4.156 -3.404 -2.975 -1.675 2.93 0.6675 -3.543 2.72 -1.747 3.502 -2.414 0.902 -0.4993 0.442 3.658 -1.807 -0.5215 2.516 4.18 -1.8 6.176 5.438 1.592 -3.979 -1.238 -2.15 -3.316 -1.002 -3.414 1.339 -3.633 2.686 1.068 -1.697 3.912 3.492 2.605 -4.984 -1.614 -0.452 3.908 0.5693 -0.2308 1.784 -2.666 1.007 1.07 -1.017 -2.55 -0.4124 -0.5776 -1.673 -1.476 2.387 0.4832 0.4316 3.48 -0.244 -3.824 -0.872 -0.3303 -1.496 4.78 -0.9854 -0.8003 1.053 -2.111 0.787 -0.778 -0.5977 1.337 2.01 3.828 0.7397 -2.795 -4.414 2.074 1.521 0.1979 3.383 -4.312 2.55 -4.297 0.4517 -1.481 -1.562 -0.7173 -1.637 -0.923 3.902 2.246 -2.672 3.08 1.033 0.994 0.474 -2.75 0.5737 0.8594 -0.22 0.8145 5.61 -2.21 2.877 -1.032 -0.5923 -2.91 -0.06204 -0.1946 1.773 3.773 -0.05737 3.34 -1.893 1.176 -0.6377 -0.4558 -0.085 -0.539 -0.8687 0.2336 0.6094 -5.805 -5.55 4.953 -1.994 -1.475 -2.729 5.406 -0.257 0.5327 4.586 -3.229 -5.69 -4.297 1.339 1.519 -3.615 2.676 2.309 1.972 -2.432 0.1954 -1.009 -1.183 -1.209 -3.139 -2.918 4.71 0.3882 5.035 3.094 1.142 ] 32.47361509376382 32.47103737343579 -------------------------------------------------------------------------------- checking norm_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4140817070007325 x = [-2.268 3.857 3.387 -1.696 0.1992 2.096 -3.535 -1.026 -3.436 -0.8027 -1.728 0.642 -5.336 1.043 -3.217 1.255 -1.944 -1.044 3.135 -4.64 0.3853 0.8467 -0.03558 -2.953 -1.649 6.125 -3.002 4.293 5.227 1.325 -4.004 1.89 -0.0933 -1.78 -3.828 1.318 -0.2512 -0.1443 -0.651 -0.922 2.854 3.314 -3.38 0.8945 -2.352 0.8853 -0.8594 -1.597 -0.63 0.267 0.10046 0.2471 -5. 
-0.4824 -1.905 0.2208 -0.611 -0.7227 3.34 3.652 -2.3 0.01343 -2.156 2.404 4.152 -2.318 0.733 4.68 -2.254 1.204 0.4155 0.02415 3.42 2.861 5.223 -0.09076 3.258 -1.837 1.844 1.6455 -2.041 -2.479 3.39 1.092 1.71 -0.538 -1.826 -4.496 1.808 0.6436 1.986 -2.4 -0.5015 1.818 -0.516 2.275 -2.379 1.75 1.523 2.312 -1.968 -0.7695 0.9004 4.293 -0.411 3.023 2.068 9.54 -3.357 -2.361 -1.526 -2.707 -0.517 -2.732 1.907 -1.534 0.2603 -1.944 -2.16 2.193 3.898 -2.773 -0.7236 -4.383 -3.44 -1.347 2.152 0.8237 -0.7017 4.535 2.5 -0.1536 -0.07074 4.61 0.2751 -0.5913 -3.096 2.04 -1.021 -2.365 -0.4104 -1.621 2.744 0.576 3.537 1.369 1.621 -2.107 -4.707 1.108 -0.1908 -3.797 -0.5176 -1.509 -3.018 0.666 -2.719 -3.553 -3.148 4.027 ] y = [-2.268 3.842 3.383 -1.687 0.1971 2.086 -3.535 -1.025 -3.416 -0.7964 -1.728 0.636 -5.332 1.035 -3.219 1.258 -1.943 -1.044 3.15 -4.633 0.3833 0.8506 -0.03026 -2.957 -1.651 6.12 -2.99 4.3 5.223 1.313 -4.004 1.894 -0.09247 -1.776 -3.84 1.315 -0.251 -0.1442 -0.6426 -0.92 2.848 3.326 -3.396 0.8926 -2.342 0.8784 -0.8687 -1.605 -0.629 0.2712 0.10706 0.2478 -4.992 -0.4922 -1.913 0.2211 -0.6025 -0.729 3.342 3.652 -2.307 0.01103 -2.152 2.408 4.176 -2.326 0.738 4.7 -2.242 1.201 0.4062 0.03348 3.4 2.865 5.23 -0.09863 3.252 -1.844 1.842 1.636 -2.037 -2.475 3.389 1.101 1.688 -0.5337 -1.819 -4.508 1.801 0.6396 1.981 -2.396 -0.4985 1.817 -0.5034 2.273 -2.39 1.747 1.524 2.31 -1.972 -0.775 0.8936 4.29 -0.4119 3.035 2.062 9.55 -3.367 -2.354 -1.548 -2.709 -0.502 -2.727 1.913 -1.535 0.2588 -1.961 -2.164 2.191 3.895 -2.775 -0.7114 -4.38 -3.432 -1.34 2.148 0.814 -0.6978 4.53 2.494 -0.1544 -0.068 4.613 0.2827 -0.5957 -3.086 2.041 -1.015 -2.35 -0.4243 -1.629 2.734 0.575 3.523 1.365 1.633 -2.121 -4.7 1.106 -0.1885 -3.781 -0.5127 -1.5 -3.018 0.665 -2.729 -3.557 -3.16 4.03 ] 32.4478265293333 32.44531085190211 -------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.40807239532470707 x = [-0.84 -2.076 -0.823 1.788 0.588 1.172 -1.522 -4.08 -3.352 -2.916 -1.654 2.857 0.6436 -3.488 2.676 -1.715 3.426 -2.377 0.888 -0.4758 0.4302 3.588 -1.777 -0.5195 2.459 4.11 -1.789 6.055 5.32 1.56 -3.91 -1.219 -2.125 -3.248 -0.986 -3.336 1.333 -3.588 2.633 1.04 -1.679 3.842 3.453 2.553 -4.91 -1.596 -0.4617 3.826 0.5444 -0.2332 1.748 -2.623 0.968 1.033 -0.9985 -2.514 -0.4075 -0.5503 -1.643 -1.448 2.336 0.4658 0.417 3.42 -0.2449 -3.748 -0.8516 -0.3499 -1.463 4.695 -0.9746 -0.7837 1.046 -2.064 0.778 -0.753 -0.5747 1.334 1.972 3.75 0.73 -2.729 -4.336 2.031 1.469 0.1843 3.307 -4.242 2.502 -4.24 0.4146 -1.469 -1.539 -0.689 -1.618 -0.899 3.818 2.193 -2.617 3.037 1.008 0.9673 0.4497 -2.709 0.562 0.8525 -0.234 0.784 5.508 -2.178 2.816 -0.9946 -0.5884 -2.844 -0.0788 -0.188 1.755 3.701 -0.05954 3.297 -1.869 1.171 -0.627 -0.4443 -0.0708 -0.538 -0.8564 0.2316 0.5986 -5.695 -5.457 4.88 -1.96 -1.443 -2.693 5.316 -0.253 0.5044 4.508 -3.168 -5.6 -4.234 1.308 1.491 -3.559 2.611 2.252 1.951 -2.402 0.2019 -0.9976 -1.152 -1.199 -3.113 -2.877 4.645 0.3828 4.965 3.04 1.122 ] y = [-0.8433 -2.074 -0.816 1.783 0.5884 1.172 -1.526 -4.082 -3.361 -2.924 -1.65 2.854 0.648 -3.482 2.668 -1.719 3.432 -2.379 0.8877 -0.4746 0.4216 3.592 -1.785 -0.513 2.463 4.113 -1.786 6.062 5.324 1.56 -3.914 -1.216 -2.125 -3.248 -0.9844 -3.344 1.325 -3.584 2.633 1.035 -1.673 3.836 3.441 2.555 -4.914 -1.6 -0.4543 3.822 0.549 -0.2421 1.745 -2.613 0.9775 1.037 -1.011 -2.512 -0.4087 -0.558 -1.645 -1.444 2.338 0.4648 0.4148 3.418 -0.2421 -3.746 
-0.8516 -0.3425 -1.462 4.688 -0.969 -0.7886 1.054 -2.07 0.781 -0.7593 -0.586 1.323 1.968 3.752 0.7324 -2.738 -4.336 2.033 1.477 0.1974 3.305 -4.242 2.506 -4.234 0.4243 -1.464 -1.536 -0.6978 -1.618 -0.91 3.828 2.193 -2.62 3.035 1.009 0.9673 0.4543 -2.713 0.5645 0.844 -0.2229 0.7856 5.51 -2.178 2.816 -0.994 -0.5874 -2.854 -0.07715 -0.1941 1.744 3.695 -0.0606 3.285 -1.875 1.152 -0.623 -0.4592 -0.06726 -0.534 -0.8525 0.234 0.5933 -5.695 -5.465 4.88 -1.964 -1.45 -2.69 5.305 -0.2573 0.5093 4.5 -3.17 -5.605 -4.227 1.304 1.491 -3.559 2.62 2.254 1.94 -2.393 0.2007 -1. -1.154 -1.196 -3.11 -2.879 4.645 0.379 4.965 3.043 1.115 ] 31.903260834580752 31.90386606450421 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05313741839956492 x = [ 0.013695 -0.0926 0.0529 ... 0.1973 0.1287 -0.1114 ] y = [ 0.01308 -0.0901 0.05374 ... 0.1965 0.1255 -0.11316] 53.73575919935149 53.73732041540662 -------------------------------------------------------------------------------- checking int_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.0595818042755127 x = [ 0.1865 -0.2834 0.1843 0.1499 0.236 -0.748 -0.02357 0.02089 -0.771 0.724 -0.4294 0.0962 -0.11475 -0.1198 0.3574 -0.134 -0.3372 -0.2161 -0.1315 -0.7334 0.882 -0.001638 -0.1528 -0.273 -0.1875 0.2517 -0.4124 -0.2214 0.626 -0.1035 -0.3108 -0.1626 -1.089 -0.366 0.07007 0.4934 0.2422 -0.2695 0.889 -0.478 -0.2993 0.01569 -0.1102 -0.27 0.1088 0.2327 -0.145 0.8057 -0.4626 -0.8037 -0.533 -0.47 -0.3865 0.10284 -0.0692 -0.817 -0.3005 -0.2935 -0.12317 0.1932 -0.08026 -0.2737 -0.27 0.7085 0.2333 0.2443 0.1945 0.04575 -0.1532 -0.603 -0.1572 0.8135 0.0166 0.381 -0.3657 0.2231 -0.5337 0.3926 0.1738 -0.251 0.2393 -0.1646 -0.2075 0.2761 -0.2424 -0.0877 0.1937 0.289 0.2424 -0.4834 0.3406 -0.003702 0.06573 -0.2937 0.1764 -0.08704 -0.2512 -0.02713 -0.437 -0.1908 0.1951 -0.108 0.2169 -0.4531 -0.2433 -0.592 -0.1677 0.2126 0.02911 -0.06027 -0.3787 -0.06775 -0.3784 -0.2162 0.6787 -0.3643 -0.2786 -0.2003 0.2754 -0.8086 -0.06555 -0.381 0.2396 0.2383 -0.0894 -0.572 0.5693 0.2932 -0.03757 -0.3364 -0.475 0.507 0.4453 0.0519 0.367 0.536 -0.33 -0.7446 0.3809 -0.0868 -0.1862 -0.307 0.0876 0.1318 0.0326 -0.777 -0.068 0.1533 0.243 -0.1334 0.2605 -0.008125 0.01422 0.2734 0.2927 0.1575 -0.01111 -0.2325 0.636 0.551 ] y = [ 0.1858 -0.284 0.1837 0.1493 0.2355 -0.749 -0.0248 0.02151 -0.7705 0.7236 -0.429 0.0953 -0.1152 -0.1205 0.3577 -0.1339 -0.336 -0.2191 -0.1317 -0.735 0.8813 -0.002 -0.154 -0.2703 -0.1869 0.2517 -0.414 -0.2195 0.625 -0.10315 -0.3108 -0.1632 -1.089 -0.3652 0.0692 0.4946 0.244 -0.269 0.8906 -0.4775 -0.3005 0.015526 -0.1129 -0.2732 0.1088 0.2334 -0.1443 0.805 -0.4626 -0.803 -0.5337 -0.4707 -0.3865 0.102 -0.0703 -0.8193 -0.3005 -0.2942 -0.1241 0.1913 -0.07947 -0.274 -0.2688 0.7095 0.2334 0.2446 0.1937 0.0476 -0.153 -0.603 -0.1584 0.816 0.0166 0.3806 -0.366 0.2218 -0.533 0.3923 0.174 -0.2527 0.2395 -0.1653 -0.2075 0.2761 -0.2412 -0.08624 0.1945 0.2898 0.2407 -0.4844 0.341 -0.00549 0.06604 -0.2937 0.1774 -0.0872 -0.2524 -0.02736 -0.438 -0.1891 0.1959 -0.107 0.2179 -0.452 -0.244 -0.5933 -0.1705 0.2113 0.03065 -0.06122 -0.38 -0.0666 -0.3765 -0.2162 0.678 -0.3645 -0.2795 -0.2002 0.276 -0.8096 -0.0642 -0.3809 0.2402 0.2393 -0.08844 -0.5728 0.569 0.2922 -0.03946 -0.3357 -0.4739 0.5063 0.4456 0.05334 0.3674 0.537 -0.3289 -0.7446 0.381 -0.08704 -0.1869 -0.3057 0.08746 0.1317 
0.03418 -0.7754 -0.0678 0.1527 0.2438 -0.1332 0.26 -0.009445 0.01376 0.2725 0.2935 0.1558 -0.010666 -0.2341 0.634 0.5503 ] 4.713868930630692 4.71514781418146 -------------------------------------------------------------------------------- checking int_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05587906054686756 x = [ 0.7803 -0.1953 -0.8364 ... 0.421 -0.2261 0.501 ] y = [ 0.771 -0.1969 -0.8115 ... 0.4216 -0.2245 0.4956] 56.197978430831846 56.19122480223955 -------------------------------------------------------------------------------- checking N2_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4084033620357514 x = [-7.9346e-01 -1.8711e+00 -7.7051e-01 1.7754e+00 6.1963e-01 1.1846e+00 -1.5674e+00 -4.1992e+00 -3.2520e+00 -2.8516e+00 -1.5508e+00 2.8984e+00 6.6797e-01 -3.4473e+00 2.5664e+00 -1.7031e+00 3.4648e+00 -2.1875e+00 9.3408e-01 -6.4355e-01 4.9902e-01 3.6816e+00 -1.7451e+00 -6.4209e-01 2.5488e+00 4.2148e+00 -1.4805e+00 6.1875e+00 5.3398e+00 1.4570e+00 -3.9102e+00 -1.3682e+00 -1.9932e+00 -3.2188e+00 -1.0039e+00 -3.2422e+00 1.2734e+00 -3.6465e+00 2.4727e+00 1.0967e+00 -1.6523e+00 3.9961e+00 3.4980e+00 2.6016e+00 -4.8438e+00 -1.5078e+00 -4.0967e-01 3.9062e+00 5.0830e-01 -7.7148e-02 1.8955e+00 -2.6543e+00 1.1650e+00 1.0947e+00 -9.2969e-01 -2.4824e+00 -2.6123e-01 -3.9722e-01 -1.5830e+00 -1.3154e+00 2.4102e+00 5.4102e-01 5.5762e-01 3.5000e+00 -4.8730e-01 -3.7676e+00 -9.1455e-01 -1.9666e-01 -1.4463e+00 4.7266e+00 -9.9316e-01 -9.1553e-01 1.0195e+00 -2.0410e+00 8.0908e-01 -6.7822e-01 -4.9829e-01 1.4287e+00 2.1270e+00 3.7559e+00 7.0020e-01 -2.7168e+00 -4.3281e+00 2.1172e+00 1.5068e+00 6.9336e-02 3.2109e+00 -4.0781e+00 2.5469e+00 -4.3320e+00 4.6436e-01 -1.3164e+00 -1.5674e+00 -4.6118e-01 -1.6230e+00 -8.2324e-01 3.9355e+00 2.1543e+00 -2.8145e+00 3.1797e+00 8.8525e-01 1.0039e+00 3.0884e-01 -2.8242e+00 4.5142e-01 9.8926e-01 -2.1277e-01 5.4297e-01 5.3945e+00 -2.1621e+00 2.9336e+00 -1.0371e+00 -6.0889e-01 -2.8281e+00 -8.2550e-03 -1.5942e-01 1.8359e+00 3.6426e+00 9.3365e-04 3.3926e+00 -1.7812e+00 1.2041e+00 -8.1934e-01 -4.4556e-01 -1.4832e-01 -4.2847e-01 -9.1943e-01 2.5952e-01 6.1621e-01 -5.7383e+00 -5.3555e+00 4.7930e+00 -1.9385e+00 -1.4316e+00 -2.8281e+00 5.2773e+00 -2.7441e-01 5.2393e-01 4.5703e+00 -3.2461e+00 -5.6367e+00 -4.3984e+00 1.3848e+00 1.4326e+00 -3.5488e+00 2.6680e+00 2.2324e+00 1.9189e+00 -2.5195e+00 2.1558e-01 -9.7852e-01 -1.2939e+00 -1.2578e+00 -3.1211e+00 -2.6816e+00 4.7227e+00 3.8574e-01 4.8633e+00 3.0684e+00 1.0615e+00] y = [-7.9688e-01 -1.8682e+00 -7.6270e-01 1.7715e+00 6.2012e-01 1.1846e+00 -1.5713e+00 -4.2070e+00 -3.2637e+00 -2.8594e+00 -1.5479e+00 2.8945e+00 6.7285e-01 -3.4414e+00 2.5586e+00 -1.7070e+00 3.4688e+00 -2.1895e+00 9.3213e-01 -6.4209e-01 4.9023e-01 3.6855e+00 -1.7539e+00 -6.3721e-01 2.5527e+00 4.2188e+00 -1.4775e+00 6.1953e+00 5.3398e+00 1.4561e+00 -3.9141e+00 -1.3662e+00 -1.9922e+00 -3.2188e+00 -1.0039e+00 -3.2500e+00 1.2656e+00 -3.6426e+00 2.4727e+00 1.0908e+00 -1.6455e+00 3.9922e+00 3.4863e+00 2.6035e+00 -4.8477e+00 -1.5127e+00 -4.0161e-01 3.9023e+00 5.1270e-01 -8.5083e-02 1.8936e+00 -2.6445e+00 1.1738e+00 1.0986e+00 -9.4238e-01 -2.4805e+00 -2.6221e-01 -4.0527e-01 -1.5850e+00 -1.3115e+00 2.4102e+00 5.4053e-01 5.5518e-01 3.4980e+00 -4.8535e-01 -3.7656e+00 -9.1553e-01 -1.8848e-01 -1.4463e+00 4.7188e+00 -9.8779e-01 -9.2090e-01 1.0273e+00 -2.0469e+00 8.1152e-01 -6.8506e-01 -5.0977e-01 1.4180e+00 2.1230e+00 3.7559e+00 7.0312e-01 -2.7285e+00 
-4.3281e+00 2.1211e+00 1.5156e+00 8.2581e-02 3.2109e+00 -4.0820e+00 2.5508e+00 -4.3320e+00 4.7461e-01 -1.3125e+00 -1.5635e+00 -4.6948e-01 -1.6230e+00 -8.3447e-01 3.9453e+00 2.1543e+00 -2.8164e+00 3.1777e+00 8.8623e-01 1.0049e+00 3.1348e-01 -2.8281e+00 4.5459e-01 9.8047e-01 -2.0081e-01 5.4492e-01 5.3945e+00 -2.1621e+00 2.9355e+00 -1.0361e+00 -6.0791e-01 -2.8379e+00 -7.2098e-03 -1.6602e-01 1.8242e+00 3.6367e+00 4.1938e-04 3.3828e+00 -1.7871e+00 1.1846e+00 -8.1592e-01 -4.6118e-01 -1.4490e-01 -4.2480e-01 -9.1553e-01 2.6196e-01 6.1084e-01 -5.7383e+00 -5.3672e+00 4.7930e+00 -1.9414e+00 -1.4395e+00 -2.8242e+00 5.2656e+00 -2.7954e-01 5.2979e-01 4.5625e+00 -3.2480e+00 -5.6367e+00 -4.3867e+00 1.3809e+00 1.4326e+00 -3.5508e+00 2.6758e+00 2.2344e+00 1.9082e+00 -2.5117e+00 2.1399e-01 -9.8145e-01 -1.2949e+00 -1.2549e+00 -3.1172e+00 -2.6836e+00 4.7227e+00 3.8281e-01 4.8633e+00 3.0703e+00 1.0537e+00] 32.00020962015037 32.000991126175215 -------------------------------------------------------------------------------- checking N2_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.41297767639160154 x = [-2.242 3.607 3.334 -1.623 0.3376 2.209 -3.406 -0.871 -3.484 -0.754 -1.73 0.5913 -5.344 0.8306 -3.002 1.382 -2.037 -1.02 2.979 -4.66 0.3992 1.078 0.07513 -2.818 -1.809 6.113 -2.893 4.02 4.992 1.308 -4.01 1.803 -0.1417 -1.699 -3.826 1.48 -0.348 -0.1755 -0.8467 -1.119 2.973 3.334 -3.428 0.9346 -2.26 0.899 -0.9805 -1.813 -0.667 0.3342 0.2201 0.3303 -5.05 -0.585 -1.735 0.287 -0.699 -0.7896 3.207 3.754 -1.923 0.3245 -2.275 2.271 4.35 -2.295 0.666 4.516 -2.178 1.17 0.3772 -0.03204 3.412 2.88 5.21 -0.1929 3.268 -1.721 1.715 1.555 -2.17 -2.75 3.283 1.132 1.756 -0.6846 -1.769 -4.375 1.816 0.6055 1.986 -2.254 -0.611 1.754 -0.5244 2.572 -2.516 1.637 1.485 2.32 -1.968 -0.7207 0.913 4.34 -0.342 3.045 2.145 9.336 -3.479 -2.184 -1.427 -2.752 -0.506 -2.473 1.878 -1.712 0.1547 -1.82 -2.201 2.041 3.846 -2.754 -0.734 -4.367 -3.33 -1.234 2.242 0.807 -0.9297 4.6 2.34 -0.188 -0.04974 4.63 0.1938 -0.4863 -3.078 2.219 -0.869 -2.229 -0.4119 -1.726 2.934 0.4304 3.441 1.315 1.525 -2.174 -4.754 1.234 -0.2788 -3.941 -0.4995 -1.652 -2.658 0.602 -2.822 -3.55 -2.982 3.871 ] y = [-2.242 3.594 3.326 -1.613 0.336 2.2 -3.404 -0.8706 -3.463 -0.7466 -1.732 0.586 -5.344 0.8237 -3.002 1.385 -2.033 -1.017 2.996 -4.652 0.3948 1.081 0.07776 -2.82 -1.8125 6.11 -2.88 4.027 4.99 1.297 -4.016 1.809 -0.1389 -1.696 -3.838 1.4795 -0.3455 -0.1721 -0.8413 -1.116 2.969 3.344 -3.445 0.9365 -2.248 0.8926 -0.99 -1.823 -0.6636 0.336 0.2225 0.3289 -5.047 -0.593 -1.74 0.2888 -0.6914 -0.794 3.209 3.756 -1.929 0.3218 -2.268 2.277 4.37 -2.303 0.6714 4.535 -2.17 1.166 0.368 -0.02411 3.396 2.885 5.223 -0.1997 3.266 -1.728 1.711 1.546 -2.164 -2.746 3.283 1.144 1.739 -0.679 -1.764 -4.387 1.811 0.5996 1.982 -2.248 -0.609 1.753 -0.5137 2.57 -2.53 1.635 1.486 2.32 -1.971 -0.725 0.9033 4.34 -0.347 3.059 2.14 9.35 -3.486 -2.176 -1.447 -2.754 -0.4954 -2.469 1.882 -1.713 0.1498 -1.835 -2.201 2.04 3.838 -2.754 -0.7217 -4.363 -3.324 -1.228 2.24 0.7983 -0.927 4.598 2.334 -0.1833 -0.0464 4.633 0.2023 -0.4937 -3.07 2.223 -0.864 -2.21 -0.4238 -1.735 2.926 0.4292 3.43 1.315 1.536 -2.186 -4.746 1.238 -0.277 -3.924 -0.4944 -1.6455 -2.654 0.601 -2.834 -3.555 -2.992 3.875 ] 32.19186730842149 32.19502961739838 -------------------------------------------------------------------------------- checking O_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4077250289916992 x = 
[-0.8135 -1.889 -0.79 1.753 0.6055 1.17 -1.58 -4.215 -3.268 -2.861 -1.564 2.871 0.645 -3.455 2.533 -1.724 3.432 -2.201 0.916 -0.662 0.4797 3.656 -1.769 -0.653 2.518 4.176 -1.495 6.15 5.31 1.438 -3.92 -1.386 -2.012 -3.236 -1.014 -3.25 1.257 -3.656 2.45 1.073 -1.665 3.975 3.46 2.576 -4.848 -1.525 -0.4307 3.88 0.4895 -0.1028 1.885 -2.676 1.152 1.078 -0.942 -2.496 -0.2732 -0.4177 -1.595 -1.329 2.383 0.5205 0.5376 3.479 -0.5054 -3.777 -0.929 -0.2109 -1.465 4.7 -1.01 -0.934 0.993 -2.053 0.784 -0.7036 -0.5215 1.405 2.098 3.72 0.6816 -2.732 -4.33 2.09 1.487 0.05136 3.178 -4.09 2.516 -4.344 0.451 -1.334 -1.579 -0.487 -1.641 -0.8394 3.906 2.135 -2.834 3.154 0.8643 0.9824 0.29 -2.838 0.433 0.97 -0.2325 0.5244 5.36 -2.18 2.896 -1.055 -0.63 -2.852 -0.03452 -0.1818 1.81 3.613 -0.01639 3.373 -1.788 1.176 -0.835 -0.4595 -0.1736 -0.4417 -0.932 0.2416 0.598 -5.742 -5.367 4.766 -1.953 -1.444 -2.838 5.24 -0.293 0.4998 4.543 -3.254 -5.64 -4.402 1.368 1.417 -3.555 2.646 2.209 1.895 -2.525 0.199 -0.991 -1.317 -1.272 -3.135 -2.684 4.703 0.3708 4.83 3.043 1.033 ] y = [-0.8174 -1.885 -0.7812 1.752 0.6064 1.167 -1.586 -4.215 -3.28 -2.867 -1.56 2.863 0.646 -3.451 2.527 -1.727 3.438 -2.203 0.9146 -0.657 0.4697 3.66 -1.775 -0.6484 2.521 4.18 -1.489 6.152 5.312 1.4375 -3.922 -1.382 -2.008 -3.232 -1.014 -3.258 1.247 -3.648 2.451 1.07 -1.656 3.965 3.451 2.578 -4.848 -1.53 -0.4216 3.875 0.4941 -0.1086 1.883 -2.66 1.159 1.081 -0.9546 -2.49 -0.2766 -0.4236 -1.594 -1.325 2.385 0.5205 0.5356 3.477 -0.504 -3.773 -0.929 -0.2028 -1.463 4.69 -1.004 -0.9346 0.9995 -2.057 0.787 -0.7085 -0.5317 1.395 2.096 3.723 0.6846 -2.742 -4.33 2.092 1.495 0.06744 3.18 -4.09 2.518 -4.34 0.4624 -1.327 -1.572 -0.4922 -1.642 -0.8516 3.916 2.135 -2.838 3.154 0.867 0.984 0.2954 -2.842 0.4382 0.9634 -0.2175 0.5273 5.367 -2.178 2.898 -1.056 -0.6265 -2.86 -0.03363 -0.1854 1.799 3.605 -0.0172 3.361 -1.79 1.155 -0.8325 -0.4746 -0.1698 -0.4365 -0.9272 0.2452 0.5938 -5.742 -5.375 4.766 -1.954 -1.449 -2.832 5.227 -0.297 0.506 4.535 -3.252 -5.64 -4.39 1.366 1.417 -3.553 2.656 2.207 1.884 -2.516 0.2008 -0.993 -1.317 -1.27 -3.127 -2.684 4.703 0.3687 4.83 3.049 1.025 ] 31.920893038358486 31.91338059889018 -------------------------------------------------------------------------------- checking O_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.02665174110652879 x = [ 0.0879 -0.2634 0.1681 ... -0.2334 -0.346 -0.02592] y = [ 0.08997 -0.2612 0.1688 ... 
-0.2334 -0.3386 -0.02393] 27.04356087453492 27.041891114108015 -------------------------------------------------------------------------------- checking V_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.10045158386230468 x = [ 1.475 0.1869 1.071 0.439 0.02496 0.704 -0.4177 0.501 -0.1305 -0.4373 0.8604 -0.3318 0.557 -0.189 -0.679 0.0569 -0.623 0.9185 -0.4238 -0.7383 -0.2617 -0.7993 0.651 0.73 -0.05994 -0.4102 -0.6196 0.2505 -0.0837 1.301 -0.158 1.062 -0.0558 -0.4062 0.649 -0.1672 0.361 -1.033 1.679 0.1406 -0.1345 0.141 0.315 0.3965 -0.7246 -0.472 -0.01642 -0.585 0.0545 -0.3687 -0.4395 0.1761 -0.3506 -1.029 0.5356 0.1185 0.582 0.3977 -0.5093 0.298 0.1995 -0.5034 -1.964 -0.5767 -0.9434 0.3171 0.4233 -0.3254 -1.032 0.3496 0.98 0.611 0.4128 -0.041 1.306 0.2445 -1.299 0.07733 0.1555 0.9414 0.4678 0.6924 0.925 0.751 0.3682 -0.8174 0.0498 0.1167 0.05795 -0.637 -0.6436 -0.3606 -0.305 0.1366 -0.4758 0.579 0.3713 0.2986 -0.584 0.712 -0.2146 -0.901 -0.1637 0.516 -0.655 -1.248 0.422 -0.6484 1.338 -0.908 -0.319 -0.1078 -0.747 -0.2925 -0.933 0.2126 0.2778 -0.8 0.4575 -0.9414 0.128 0.10565 -0.5234 0.5176 -0.1345 -0.168 0.6685 -0.1992 0.0915 -1.4795 -0.9644 0.2678 -0.668 -0.1345 -0.308 -0.3213 -0.2322 -0.00562 0.2637 0.233 -0.3518 -2.152 -0.9033 -0.00687 -0.1504 -0.6426 0.454 0.169 -0.2441 0.07605 0.8345 0.008156 0.5923 0.04556 -0.335 -0.02045 0.419 -1.002 -0.02388 -1.072 ] y = [ 1.478 0.1868 1.068 0.4382 0.02295 0.704 -0.4197 0.5015 -0.1334 -0.435 0.8633 -0.3323 0.555 -0.1926 -0.681 0.055 -0.622 0.916 -0.4207 -0.7397 -0.259 -0.7983 0.6523 0.7295 -0.06146 -0.408 -0.621 0.2491 -0.0854 1.302 -0.1578 1.06 -0.05496 -0.4058 0.6504 -0.1665 0.3584 -1.032 1.68 0.1376 -0.1339 0.1377 0.315 0.3953 -0.7266 -0.4714 -0.0148 -0.5854 0.0539 -0.3684 -0.4358 0.1786 -0.3496 -1.027 0.535 0.11884 0.583 0.4014 -0.511 0.3008 0.1986 -0.5034 -1.964 -0.58 -0.944 0.3174 0.4236 -0.3247 -1.029 0.347 0.9824 0.613 0.4133 -0.0406 1.306 0.247 -1.297 0.07733 0.1525 0.9414 0.4697 0.6963 0.922 0.7505 0.371 -0.816 0.04944 0.1206 0.05643 -0.6377 -0.6465 -0.3618 -0.307 0.138 -0.475 0.58 0.3738 0.297 -0.5864 0.708 -0.2169 -0.9014 -0.1649 0.5156 -0.655 -1.25 0.426 -0.6465 1.335 -0.908 -0.316 -0.1062 -0.7456 -0.2935 -0.932 0.2151 0.2766 -0.8 0.4543 -0.943 0.1273 0.107 -0.525 0.5195 -0.1315 -0.1653 0.667 -0.1992 0.0896 -1.4795 -0.9663 0.2705 -0.6655 -0.132 -0.31 -0.3245 -0.232 -0.00792 0.2654 0.2329 -0.3508 -2.152 -0.9067 -0.005814 -0.15 -0.6445 0.4517 0.1692 -0.2423 0.075 0.834 0.00874 0.591 0.04443 -0.333 -0.01988 0.4197 -1. -0.02408 -1.072 ] 8.08672852511747 8.086925146638654 -------------------------------------------------------------------------------- checking V_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.026712438351009046 x = [-0.1066 0.194 -0.03214 ... -0.4004 -0.051 0.0311 ] y = [-0.1075 0.1951 -0.02904 ... -0.401 -0.0524 0.0313 ] 27.340230045625518 27.353107125831908 -------------------------------------------------------------------------------- checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.004514702800261148 x = [-0.0113 0.01985 0.03482 ... 0.00296 0.0004764 0.02498 ] y = [-0.01126 0.01988 0.03485 ... 
0.002962 0.0004592 0.02502 ] 32.404937946367916 32.40559308415013 -------------------------------------------------------------------------------- checking norm_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.41812374114990236 x = [-0.723 -1.619 -0.799 1.751 0.738 1.2295 -1.9375 -4.324 -3.207 -3.04 -1.689 3.166 0.2578 -3.557 2.898 -1.961 3.37 -2.361 0.6553 -0.741 0.53 3.904 -1.573 -0.8447 2.697 4.562 -1.407 6.05 5.195 1.375 -3.846 -1.672 -1.594 -3.184 -0.9023 -3.494 1.378 -3.627 2.54 1.06 -1.681 4.02 3.613 2.521 -5.258 -1.671 -0.3389 3.715 0.6646 0.1298 1.926 -2.707 1.205 0.9414 -0.922 -2.701 -0.1814 -0.7 -1.761 -1.128 2.418 0.3135 0.2798 3.684 -0.4963 -3.918 -0.9355 -0.1588 -1.219 4.914 -1.16 -0.7993 0.8975 -1.993 0.647 -0.75 -0.583 1.401 2.014 3.791 0.752 -2.986 -4.52 2.201 1.685 0.2595 3.611 -4.258 2.693 -4.758 0.579 -1.527 -1.808 -0.658 -1.667 -1.074 4.168 2.064 -2.852 2.787 0.7134 1.166 -0.1395 -2.703 0.4175 0.993 -0.2389 0.7495 5.55 -2.184 3.096 -1.142 -0.7134 -2.795 -0.02322 -0.338 1.801 3.676 -0.2244 3.584 -1.706 1.188 -0.9604 -0.6763 0.3337 -0.5625 -0.8 0.2727 0.2979 -5.88 -5.316 5.027 -2.062 -1.361 -2.764 5.832 -0.2322 0.502 4.65 -3.518 -5.734 -4.19 1.303 1.645 -3.54 2.902 2.328 1.736 -2.555 -0.06903 -1.101 -1.33 -0.997 -3.129 -2.607 4.777 0.278 4.94 3.422 1.203 ] y = [-0.7256 -1.613 -0.789 1.75 0.739 1.227 -1.945 -4.324 -3.219 -3.047 -1.684 3.16 0.258 -3.553 2.893 -1.963 3.375 -2.363 0.6533 -0.7373 0.5215 3.908 -1.583 -0.8403 2.7 4.566 -1.403 6.055 5.2 1.373 -3.85 -1.667 -1.59 -3.182 -0.903 -3.502 1.369 -3.617 2.541 1.059 -1.671 4.008 3.604 2.523 -5.26 -1.679 -0.3306 3.709 0.6675 0.1232 1.926 -2.691 1.214 0.944 -0.935 -2.693 -0.1863 -0.7065 -1.761 -1.125 2.418 0.3123 0.28 3.684 -0.495 -3.916 -0.935 -0.1495 -1.217 4.906 -1.154 -0.7993 0.9062 -1.997 0.6494 -0.7554 -0.593 1.391 2.012 3.795 0.757 -2.996 -4.52 2.201 1.693 0.277 3.615 -4.258 2.697 -4.754 0.5913 -1.52 -1.802 -0.6646 -1.669 -1.089 4.18 2.064 -2.857 2.79 0.718 1.167 -0.1354 -2.707 0.4229 0.9854 -0.2244 0.753 5.555 -2.182 3.098 -1.143 -0.7095 -2.8 -0.02211 -0.3423 1.789 3.668 -0.2245 3.574 -1.706 1.166 -0.9575 -0.693 0.3386 -0.557 -0.794 0.276 0.2937 -5.88 -5.324 5.027 -2.062 -1.366 -2.758 5.82 -0.2367 0.509 4.637 -3.514 -5.734 -4.18 1.3 1.645 -3.537 2.908 2.326 1.728 -2.543 -0.0688 -1.101 -1.329 -0.9927 -3.121 -2.605 4.777 0.2778 4.94 3.43 1.196 ] 32.826708833748896 32.82044679431405 -------------------------------------------------------------------------------- checking norm_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.413134765625 x = [-2.154 3.709 3.303 -1.694 0.3635 2.193 -3.324 -0.851 -3.453 -0.647 -1.735 0.6084 -5.305 0.809 -3.062 1.428 -2.016 -1.011 2.973 -4.633 0.34 1.083 0.1313 -2.832 -1.817 6.137 -2.904 4.06 4.95 1.263 -3.96 1.867 -0.1315 -1.612 -3.88 1.508 -0.4011 -0.1786 -0.874 -1.045 2.908 3.344 -3.482 0.973 -2.125 0.794 -1.061 -1.895 -0.7324 0.3018 0.2078 0.3474 -5.062 -0.5576 -1.783 0.2339 -0.6484 -0.805 3.19 3.762 -1.961 0.2773 -2.195 2.256 4.434 -2.236 0.6646 4.496 -2.15 1.154 0.4458 0.0279 3.312 2.945 5.23 -0.2654 3.271 -1.746 1.7295 1.629 -2.176 -2.691 3.205 1.074 1.76 -0.7627 -1.784 -4.406 1.79 0.6 1.958 -2.275 -0.5933 1.751 -0.537 2.596 -2.477 1.543 1.438 2.348 -2.04 -0.6733 0.996 4.36 -0.3481 3.121 2.104 9.32 -3.506 -2.201 -1.532 -2.73 -0.5127 -2.338 1.804 -1.764 0.188 -1.785 -2.139 2.11 3.78 -2.707 -0.769 -4.387 -3.314 -1.28 2.287 0.8066 -1.015 4.6 2.398 -0.236 
-0.0726 4.68 0.249 -0.4038 -3.021 2.203 -0.9307 -2.176 -0.5586 -1.822 2.996 0.4597 3.436 1.374 1.607 -2.105 -4.758 1.247 -0.301 -4.027 -0.519 -1.537 -2.586 0.5664 -2.795 -3.436 -2.99 3.846 ] y = [-2.154 3.691 3.297 -1.684 0.363 2.184 -3.324 -0.849 -3.432 -0.6387 -1.739 0.6055 -5.3 0.8037 -3.06 1.432 -2.016 -1.009 2.99 -4.625 0.335 1.089 0.1313 -2.838 -1.818 6.13 -2.89 4.062 4.945 1.25 -3.965 1.867 -0.1287 -1.609 -3.889 1.505 -0.4001 -0.1764 -0.8687 -1.04 2.908 3.355 -3.5 0.9717 -2.117 0.787 -1.069 -1.905 -0.7266 0.304 0.2109 0.345 -5.055 -0.5664 -1.787 0.2361 -0.639 -0.805 3.191 3.758 -1.967 0.2725 -2.191 2.264 4.45 -2.244 0.6724 4.516 -2.139 1.15 0.4358 0.0353 3.295 2.95 5.24 -0.2754 3.271 -1.754 1.723 1.618 -2.168 -2.688 3.203 1.086 1.741 -0.757 -1.779 -4.418 1.784 0.6006 1.952 -2.271 -0.589 1.745 -0.5244 2.596 -2.49 1.54 1.4375 2.348 -2.041 -0.674 0.9897 4.36 -0.3489 3.135 2.098 9.33 -3.512 -2.19 -1.554 -2.732 -0.502 -2.328 1.808 -1.761 0.1847 -1.804 -2.135 2.111 3.77 -2.709 -0.7583 -4.387 -3.309 -1.273 2.287 0.7954 -1.014 4.598 2.393 -0.2335 -0.0709 4.68 0.255 -0.4124 -3.012 2.201 -0.924 -2.162 -0.57 -1.83 2.988 0.4563 3.426 1.372 1.615 -2.121 -4.754 1.246 -0.2986 -4.01 -0.515 -1.527 -2.584 0.564 -2.805 -3.44 -3. 3.85 ] 32.16995226414111 32.16520685953384 -------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.41093524932861336 x = [-0.699 -1.602 -0.7886 1.747 0.701 1.194 -1.911 -4.254 -3.168 -2.99 -1.663 3.09 0.2421 -3.496 2.846 -1.929 3.303 -2.328 0.6465 -0.7114 0.508 3.838 -1.556 -0.8354 2.643 4.492 -1.401 5.938 5.09 1.344 -3.787 -1.641 -1.575 -3.123 -0.8887 -3.428 1.361 -3.578 2.492 1.028 -1.657 3.941 3.564 2.47 -5.188 -1.655 -0.345 3.633 0.643 0.11584 1.887 -2.654 1.175 0.9106 -0.9175 -2.658 -0.1843 -0.677 -1.73 -1.101 2.371 0.2998 0.2678 3.623 -0.4863 -3.838 -0.914 -0.1754 -1.191 4.824 -1.142 -0.7856 0.899 -1.953 0.642 -0.734 -0.57 1.387 1.972 3.717 0.742 -2.932 -4.44 2.156 1.636 0.2595 3.53 -4.184 2.645 -4.69 0.5503 -1.508 -1.778 -0.64 -1.65 -1.06 4.09 2.016 -2.799 2.752 0.694 1.141 -0.1503 -2.67 0.4124 0.9775 -0.2394 0.7246 5.457 -2.152 3.033 -1.103 -0.7085 -2.74 -0.03842 -0.3328 1.773 3.604 -0.2236 3.53 -1.693 1.167 -0.9375 -0.6777 0.3423 -0.5576 -0.787 0.2705 0.289 -5.773 -5.23 4.957 -2.031 -1.34 -2.727 5.723 -0.2332 0.4785 4.562 -3.457 -5.652 -4.125 1.271 1.613 -3.484 2.84 2.273 1.708 -2.518 -0.06073 -1.091 -1.297 -0.987 -3.1 -2.574 4.715 0.2698 4.875 3.365 1.178 ] y = [-0.703 -1.597 -0.778 1.743 0.6997 1.191 -1.919 -4.258 -3.18 -2.998 -1.659 3.086 0.244 -3.496 2.84 -1.935 3.309 -2.328 0.643 -0.71 0.4988 3.84 -1.567 -0.8286 2.645 4.492 -1.397 5.94 5.09 1.343 -3.791 -1.638 -1.572 -3.123 -0.8896 -3.436 1.354 -3.57 2.494 1.024 -1.647 3.932 3.553 2.477 -5.188 -1.664 -0.337 3.627 0.645 0.1078 1.886 -2.643 1.181 0.914 -0.93 -2.654 -0.1871 -0.685 -1.734 -1.099 2.37 0.2986 0.266 3.621 -0.4905 -3.838 -0.915 -0.1671 -1.191 4.816 -1.137 -0.787 0.91 -1.959 0.644 -0.7407 -0.5796 1.377 1.968 3.72 0.7466 -2.941 -4.445 2.158 1.6455 0.274 3.533 -4.188 2.65 -4.688 0.5615 -1.504 -1.774 -0.6484 -1.651 -1.075 4.1 2.016 -2.805 2.75 0.699 1.138 -0.146 -2.674 0.4167 0.9697 -0.2264 0.7246 5.465 -2.15 3.037 -1.106 -0.7075 -2.75 -0.03873 -0.3408 1.761 3.592 -0.2252 3.518 -1.694 1.144 -0.9375 -0.6924 0.3464 -0.5522 -0.781 0.271 0.2834 -5.77 -5.242 4.953 -2.031 -1.346 -2.719 5.71 -0.2391 0.4863 4.55 -3.455 -5.652 -4.113 1.267 1.616 -3.484 2.846 2.271 1.7 
-2.506 -0.06058 -1.09 -1.299 -0.9844 -3.094 -2.572 4.715 0.2688 4.88 3.371 1.17 ] 32.26857736736692 32.26568461881976 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05252783313114197 x = [ 0.02803 -0.097 0.0825 ... 0.2238 0.1324 -0.1027 ] y = [ 0.03091 -0.0912 0.0816 ... 0.2246 0.1311 -0.10205] 53.10149666194258 53.099606981870735 -------------------------------------------------------------------------------- checking int_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.06004666924476624 x = [ 0.2253 -0.301 0.1486 0.1338 0.2725 -0.7393 -0.004585 0.04437 -0.7925 0.735 -0.4563 0.0804 -0.09033 -0.1202 0.3535 -0.1506 -0.3286 -0.2131 -0.1426 -0.723 0.8613 -0.0289 -0.1772 -0.2708 -0.2197 0.2668 -0.3782 -0.2131 0.7095 -0.0761 -0.341 -0.1293 -1.094 -0.3926 0.0851 0.4675 0.2583 -0.2717 0.889 -0.527 -0.3384 0.03348 -0.1377 -0.2659 0.1545 0.197 -0.1247 0.8213 -0.4504 -0.7817 -0.53 -0.4617 -0.4119 0.1238 -0.0974 -0.8584 -0.34 -0.3252 -0.0892 0.2197 -0.07556 -0.234 -0.2754 0.7153 0.2003 0.261 0.2162 0.0589 -0.1573 -0.574 -0.11896 0.8066 0.02644 0.3765 -0.3425 0.1976 -0.524 0.4436 0.163 -0.2257 0.2362 -0.1357 -0.1805 0.3022 -0.2825 -0.0748 0.1821 0.2559 0.2218 -0.487 0.3335 0.001225 0.05582 -0.2935 0.168 -0.0886 -0.2389 -0.01935 -0.4775 -0.2147 0.1697 -0.1066 0.1989 -0.4539 -0.258 -0.612 -0.1823 0.2147 0.02687 -0.04465 -0.3755 -0.08154 -0.3667 -0.2064 0.7046 -0.3235 -0.2527 -0.2308 0.2612 -0.8335 -0.0551 -0.383 0.277 0.269 -0.1316 -0.5737 0.6035 0.306 -0.04623 -0.372 -0.5083 0.526 0.462 0.0486 0.3096 0.5513 -0.3193 -0.743 0.3755 -0.1001 -0.1406 -0.2888 0.1079 0.1082 0.02493 -0.7725 -0.0375 0.1819 0.2335 -0.0893 0.27 -0.01375 0.02681 0.2947 0.3086 0.1466 0.01004 -0.2344 0.6143 0.5503 ] y = [ 2.2424e-01 -3.0151e-01 1.4807e-01 1.3416e-01 2.7148e-01 -7.3975e-01 -7.5302e-03 4.4464e-02 -7.9395e-01 7.3438e-01 -4.5532e-01 7.9163e-02 -9.2163e-02 -1.2018e-01 3.5571e-01 -1.5125e-01 -3.2739e-01 -2.1606e-01 -1.4319e-01 -7.2510e-01 8.6084e-01 -3.0334e-02 -1.7725e-01 -2.7002e-01 -2.1838e-01 2.6685e-01 -3.8013e-01 -2.1228e-01 7.1045e-01 -7.5500e-02 -3.4009e-01 -1.2939e-01 -1.0947e+00 -3.9136e-01 8.4473e-02 4.6851e-01 2.6050e-01 -2.7026e-01 8.9014e-01 -5.2637e-01 -3.3911e-01 3.3173e-02 -1.3892e-01 -2.6855e-01 1.5613e-01 1.9922e-01 -1.2347e-01 8.2080e-01 -4.5142e-01 -7.7930e-01 -5.2979e-01 -4.6313e-01 -4.1089e-01 1.2372e-01 -9.6680e-02 -8.6084e-01 -3.3911e-01 -3.2690e-01 -9.0576e-02 2.1912e-01 -7.5073e-02 -2.3230e-01 -2.7417e-01 7.1582e-01 1.9971e-01 2.6196e-01 2.1509e-01 6.0181e-02 -1.5503e-01 -5.7275e-01 -1.1896e-01 8.0811e-01 2.7649e-02 3.7695e-01 -3.4302e-01 1.9653e-01 -5.2246e-01 4.4385e-01 1.6235e-01 -2.2717e-01 2.3584e-01 -1.3672e-01 -1.7981e-01 3.0347e-01 -2.8296e-01 -7.3669e-02 1.8274e-01 2.5610e-01 2.1863e-01 -4.8779e-01 3.3423e-01 -8.9312e-04 5.6885e-02 -2.9346e-01 1.6846e-01 -8.8928e-02 -2.3975e-01 -2.0645e-02 -4.7876e-01 -2.1436e-01 1.6956e-01 -1.0699e-01 1.9995e-01 -4.5239e-01 -2.5781e-01 -6.1279e-01 -1.8396e-01 2.1204e-01 2.9099e-02 -4.5258e-02 -3.7646e-01 -7.9407e-02 -3.6523e-01 -2.0422e-01 7.0459e-01 -3.2471e-01 -2.5391e-01 -2.2998e-01 2.6123e-01 -8.3203e-01 -5.4749e-02 -3.8525e-01 2.7759e-01 2.6978e-01 -1.2903e-01 -5.7471e-01 6.0352e-01 3.0566e-01 -4.7699e-02 -3.7231e-01 -5.0928e-01 5.2539e-01 4.6240e-01 4.8401e-02 3.0981e-01 5.5078e-01 -3.1836e-01 -7.4219e-01 3.7573e-01 -9.9060e-02 
-1.3989e-01 -2.8760e-01 1.0730e-01 1.0980e-01 2.5589e-02 -7.7295e-01 -3.7811e-02 1.8054e-01 2.3254e-01 -8.8867e-02 2.6904e-01 -1.5282e-02 2.5925e-02 2.9395e-01 3.0981e-01 1.4392e-01 1.0689e-02 -2.3499e-01 6.1279e-01 5.4980e-01] 4.761150708750266 4.762173680171811 -------------------------------------------------------------------------------- checking int_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.05514505815086887 x = [ 0.7466 -0.21 -0.788 ... 0.4272 -0.206 0.458 ] y = [ 0.7373 -0.2072 -0.781 ... 0.421 -0.2095 0.4502] 55.44837514266085 55.451459230757614 -------------------------------------------------------------------------------- checking N2_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4117816734313965 x = [-0.652 -1.39 -0.7324 1.737 0.731 1.218 -1.964 -4.37 -3.06 -2.93 -1.553 3.129 0.264 -3.447 2.729 -1.91 3.344 -2.145 0.6875 -0.8794 0.568 3.928 -1.515 -0.969 2.727 4.6 -1.089 6.062 5.098 1.238 -3.79 -1.8 -1.444 -3.08 -0.9233 -3.332 1.297 -3.639 2.328 1.083 -1.637 4.105 3.607 2.52 -5.117 -1.565 -0.2927 3.71 0.6123 0.2703 2.03 -2.697 1.371 0.9746 -0.8535 -2.621 -0.02702 -0.5215 -1.674 -0.9644 2.447 0.3713 0.4124 3.697 -0.7207 -3.861 -0.9756 -0.02646 -1.175 4.855 -1.159 -0.9214 0.874 -1.938 0.6694 -0.6587 -0.4958 1.484 2.129 3.72 0.7036 -2.916 -4.426 2.248 1.672 0.1472 3.44 -4.016 2.682 -4.79 0.599 -1.359 -1.804 -0.4038 -1.6455 -0.9805 4.215 1.976 -3.01 2.898 0.569 1.181 -0.2954 -2.781 0.2908 1.107 -0.2012 0.4685 5.344 -2.137 3.16 -1.155 -0.739 -2.72 0.02283 -0.3054 1.854 3.55 -0.1598 3.62 -1.603 1.195 -1.138 -0.6826 0.2732 -0.4392 -0.8457 0.306 0.3088 -5.832 -5.137 4.867 -2.016 -1.323 -2.863 5.695 -0.255 0.4954 4.625 -3.537 -5.684 -4.297 1.349 1.556 -3.486 2.895 2.242 1.682 -2.635 -0.0452 -1.065 -1.435 -1.055 -3.104 -2.379 4.8 0.2825 4.766 3.383 1.113 ] y = [-0.6553 -1.385 -0.722 1.734 0.7305 1.215 -1.97 -4.375 -3.074 -2.94 -1.549 3.125 0.2664 -3.45 2.72 -1.916 3.35 -2.145 0.684 -0.8784 0.559 3.93 -1.525 -0.964 2.729 4.605 -1.084 6.066 5.1 1.235 -3.793 -1.797 -1.441 -3.08 -0.9243 -3.34 1.289 -3.63 2.33 1.079 -1.626 4.098 3.596 2.525 -5.117 -1.574 -0.285 3.703 0.614 0.2625 2.027 -2.686 1.378 0.978 -0.8647 -2.615 -0.02917 -0.5303 -1.678 -0.9624 2.447 0.3696 0.4106 3.695 -0.725 -3.861 -0.9775 -0.01797 -1.174 4.848 -1.155 -0.9224 0.8857 -1.943 0.672 -0.666 -0.507 1.475 2.125 3.723 0.708 -2.926 -4.426 2.25 1.682 0.1621 3.441 -4.02 2.688 -4.785 0.61 -1.3545 -1.8 -0.4114 -1.6455 -0.997 4.223 1.977 -3.016 2.895 0.5737 1.176 -0.2915 -2.787 0.2957 1.1 -0.1875 0.4685 5.348 -2.135 3.162 -1.159 -0.7383 -2.73 0.0226 -0.3135 1.841 3.54 -0.1611 3.607 -1.603 1.172 -1.138 -0.6973 0.2769 -0.434 -0.8394 0.307 0.3025 -5.832 -5.152 4.867 -2.016 -1.329 -2.86 5.684 -0.2612 0.503 4.613 -3.533 -5.684 -4.285 1.344 1.558 -3.486 2.902 2.24 1.673 -2.625 -0.04626 -1.066 -1.437 -1.052 -3.1 -2.377 4.8 0.281 4.77 3.389 1.105 ] 32.40440421207437 32.40234615972695 -------------------------------------------------------------------------------- checking N2_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4119118690490723 x = [-2.135 3.46 3.244 -1.613 0.4841 2.299 -3.19 -0.702 -3.494 -0.588 -1.737 0.5693 -5.312 0.6035 -2.854 1.553 -2.12 -0.985 2.822 -4.652 0.352 1.311 0.2361 -2.701 -1.978 6.12 -2.79 3.785 4.72 1.251 -3.967 1.781 -0.1791 -1.529 -3.873 1.675 -0.489 -0.2064 -1.061 -1.234 3.027 3.361 -3.53 1.02 -2.035 0.813 
-1.185 -2.115 -0.7676 0.3728 0.3237 0.4314 -5.113 -0.658 -1.6045 0.3088 -0.7266 -0.87 3.041 3.861 -1.592 0.5845 -2.3 2.139 4.63 -2.213 0.5957 4.332 -2.074 1.121 0.4065 -0.02943 3.314 2.953 5.215 -0.3604 3.283 -1.632 1.604 1.544 -2.307 -2.955 3.102 1.121 1.81 -0.8936 -1.725 -4.28 1.805 0.558 1.955 -2.127 -0.709 1.687 -0.5312 2.889 -2.605 1.436 1.403 2.354 -2.057 -0.61 0.998 4.402 -0.2827 3.14 2.162 9.11 -3.625 -2.02 -1.429 -2.762 -0.502 -2.096 1.776 -1.9375 0.07733 -1.667 -2.18 1.949 3.736 -2.688 -0.78 -4.37 -3.21 -1.163 2.371 0.7944 -1.229 4.66 2.244 -0.257 -0.052 4.69 0.1774 -0.3105 -3.002 2.37 -0.7866 -2.04 -0.5586 -1.92 3.178 0.319 3.338 1.328 1.511 -2.176 -4.812 1.367 -0.379 -4.164 -0.4985 -1.6875 -2.24 0.507 -2.89 -3.434 -2.824 3.691 ] y = [-2.135 3.445 3.238 -1.6045 0.483 2.29 -3.19 -0.699 -3.477 -0.581 -1.743 0.5645 -5.312 0.6 -2.852 1.556 -2.12 -0.9795 2.84 -4.645 0.349 1.318 0.233 -2.707 -1.977 6.113 -2.775 3.793 4.72 1.241 -3.973 1.781 -0.1768 -1.526 -3.883 1.671 -0.4856 -0.1997 -1.056 -1.229 3.027 3.373 -3.549 1.02 -2.027 0.8047 -1.191 -2.13 -0.7607 0.3728 0.3264 0.4275 -5.11 -0.6685 -1.61 0.3132 -0.7188 -0.873 3.043 3.86 -1.602 0.581 -2.295 2.145 4.65 -2.223 0.6045 4.35 -2.064 1.114 0.396 -0.02415 3.297 2.96 5.227 -0.365 3.281 -1.642 1.599 1.534 -2.299 -2.953 3.1 1.135 1.795 -0.89 -1.72 -4.293 1.8 0.5566 1.95 -2.125 -0.705 1.685 -0.521 2.887 -2.621 1.432 1.404 2.355 -2.06 -0.617 0.99 4.402 -0.285 3.158 2.154 9.12 -3.627 -2.006 -1.452 -2.766 -0.4958 -2.084 1.78 -1.9375 0.0748 -1.686 -2.176 1.95 3.727 -2.69 -0.771 -4.37 -3.203 -1.156 2.371 0.7817 -1.229 4.656 2.24 -0.2517 -0.05087 4.69 0.1821 -0.3135 -2.992 2.373 -0.778 -2.025 -0.5723 -1.93 3.174 0.318 3.33 1.324 1.519 -2.188 -4.805 1.369 -0.3787 -4.145 -0.4946 -1.678 -2.24 0.5044 -2.9 -3.438 -2.836 3.697 ] 31.976714530776253 31.9796942422391 -------------------------------------------------------------------------------- checking O_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.4110921907424927 x = [-6.7432e-01 -1.4082e+00 -7.5195e-01 1.7178e+00 7.1533e-01 1.1992e+00 -1.9785e+00 -4.3828e+00 -3.0781e+00 -2.9395e+00 -1.5654e+00 3.0977e+00 2.4084e-01 -3.4590e+00 2.6973e+00 -1.9297e+00 3.3105e+00 -2.1582e+00 6.6846e-01 -8.9697e-01 5.4688e-01 3.9004e+00 -1.5371e+00 -9.7754e-01 2.6992e+00 4.5625e+00 -1.1035e+00 6.0273e+00 5.0742e+00 1.2227e+00 -3.8008e+00 -1.8135e+00 -1.4658e+00 -3.0977e+00 -9.3262e-01 -3.3379e+00 1.2793e+00 -3.6465e+00 2.3086e+00 1.0625e+00 -1.6514e+00 4.0820e+00 3.5723e+00 2.4961e+00 -5.1172e+00 -1.5820e+00 -3.1299e-01 3.6855e+00 5.9473e-01 2.4329e-01 2.0137e+00 -2.7188e+00 1.3535e+00 9.5752e-01 -8.6475e-01 -2.6328e+00 -3.9825e-02 -5.4199e-01 -1.6846e+00 -9.8047e-01 2.4219e+00 3.5156e-01 3.9185e-01 3.6758e+00 -7.4121e-01 -3.8711e+00 -9.9170e-01 -4.1809e-02 -1.1943e+00 4.8281e+00 -1.1738e+00 -9.3848e-01 8.4863e-01 -1.9482e+00 6.4258e-01 -6.8359e-01 -5.2100e-01 1.4609e+00 2.1035e+00 3.6895e+00 6.8506e-01 -2.9277e+00 -4.4297e+00 2.2207e+00 1.6533e+00 1.2988e-01 3.4082e+00 -4.0273e+00 2.6543e+00 -4.8008e+00 5.8594e-01 -1.3770e+00 -1.8154e+00 -4.2700e-01 -1.6641e+00 -9.9756e-01 4.1875e+00 1.9580e+00 -3.0312e+00 2.8730e+00 5.5225e-01 1.1582e+00 -3.1348e-01 -2.8008e+00 2.7466e-01 1.0898e+00 -2.2144e-01 4.4897e-01 5.3125e+00 -2.1523e+00 3.1250e+00 -1.1738e+00 -7.5781e-01 -2.7441e+00 -4.6730e-03 -3.2666e-01 1.8301e+00 3.5234e+00 -1.7810e-01 3.5996e+00 -1.6084e+00 1.1660e+00 -1.1562e+00 -6.9629e-01 2.4646e-01 -4.5581e-01 -8.5791e-01 2.8955e-01 2.9443e-01 
-5.8359e+00 -5.1445e+00 4.8438e+00 -2.0312e+00 -1.3379e+00 -2.8711e+00 5.6562e+00 -2.7637e-01 4.7363e-01 4.5977e+00 -3.5410e+00 -5.6875e+00 -4.3047e+00 1.3330e+00 1.5391e+00 -3.4922e+00 2.8750e+00 2.2188e+00 1.6592e+00 -2.6387e+00 -5.9174e-02 -1.0762e+00 -1.4580e+00 -1.0713e+00 -3.1191e+00 -2.3828e+00 4.7812e+00 2.6782e-01 4.7305e+00 3.3594e+00 1.0869e+00] y = [-6.7383e-01 -1.4023e+00 -7.3975e-01 1.7158e+00 7.1631e-01 1.1982e+00 -1.9854e+00 -4.3867e+00 -3.0898e+00 -2.9473e+00 -1.5615e+00 3.0938e+00 2.4402e-01 -3.4570e+00 2.6895e+00 -1.9346e+00 3.3184e+00 -2.1562e+00 6.6699e-01 -8.9453e-01 5.4053e-01 3.9043e+00 -1.5469e+00 -9.7314e-01 2.6992e+00 4.5703e+00 -1.0996e+00 6.0312e+00 5.0742e+00 1.2188e+00 -3.8008e+00 -1.8105e+00 -1.4590e+00 -3.0938e+00 -9.3408e-01 -3.3457e+00 1.2705e+00 -3.6367e+00 2.3086e+00 1.0596e+00 -1.6406e+00 4.0742e+00 3.5605e+00 2.5020e+00 -5.1172e+00 -1.5918e+00 -3.0420e-01 3.6797e+00 5.9326e-01 2.3792e-01 2.0156e+00 -2.7031e+00 1.3633e+00 9.5898e-01 -8.7598e-01 -2.6289e+00 -4.2480e-02 -5.4883e-01 -1.6875e+00 -9.7754e-01 2.4219e+00 3.5034e-01 3.9136e-01 3.6777e+00 -7.4316e-01 -3.8691e+00 -9.8877e-01 -3.2043e-02 -1.1914e+00 4.8242e+00 -1.1719e+00 -9.3848e-01 8.5986e-01 -1.9531e+00 6.4795e-01 -6.8750e-01 -5.2832e-01 1.4541e+00 2.0996e+00 3.6914e+00 6.8994e-01 -2.9375e+00 -4.4258e+00 2.2246e+00 1.6631e+00 1.4673e-01 3.4102e+00 -4.0273e+00 2.6562e+00 -4.7930e+00 5.9717e-01 -1.3711e+00 -1.8086e+00 -4.3262e-01 -1.6641e+00 -1.0117e+00 4.1953e+00 1.9580e+00 -3.0332e+00 2.8730e+00 5.5566e-01 1.1553e+00 -3.0933e-01 -2.8027e+00 2.8149e-01 1.0811e+00 -2.0422e-01 4.5190e-01 5.3203e+00 -2.1523e+00 3.1289e+00 -1.1758e+00 -7.5537e-01 -2.7500e+00 -2.5482e-03 -3.3203e-01 1.8164e+00 3.5098e+00 -1.7871e-01 3.5879e+00 -1.6064e+00 1.1445e+00 -1.1533e+00 -7.1045e-01 2.5098e-01 -4.4727e-01 -8.5010e-01 2.9126e-01 2.8687e-01 -5.8320e+00 -5.1562e+00 4.8398e+00 -2.0273e+00 -1.3369e+00 -2.8652e+00 5.6445e+00 -2.7979e-01 4.8193e-01 4.5859e+00 -3.5391e+00 -5.6875e+00 -4.2891e+00 1.3291e+00 1.5420e+00 -3.4902e+00 2.8828e+00 2.2148e+00 1.6494e+00 -2.6289e+00 -5.9021e-02 -1.0771e+00 -1.4600e+00 -1.0654e+00 -3.1133e+00 -2.3789e+00 4.7812e+00 2.6807e-01 4.7344e+00 3.3652e+00 1.0801e+00] 32.33044527625307 32.32124280184641 -------------------------------------------------------------------------------- checking O_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.02581151763442904 x = [ 0.09595 -0.2742 0.1761 ... -0.2106 -0.3486 -0.007236] y = [ 0.0945 -0.2756 0.177 ... 
-0.2146 -0.347 -0.007706] 26.193660705032386 26.196761200334766 -------------------------------------------------------------------------------- checking V_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.10295522451400757 x = [ 1.438 0.1332 1.178 0.3782 -0.02638 0.721 -0.3682 0.5903 -0.1636 -0.4756 0.9067 -0.348 0.568 -0.1714 -0.6973 0.0998 -0.737 0.8877 -0.4736 -0.75 -0.1921 -0.7324 0.6514 0.714 0.03668 -0.3887 -0.656 0.2603 -0.02373 1.328 -0.1648 1.0205 -0.003046 -0.4307 0.6587 -0.1649 0.3555 -1.056 1.721 0.05057 -0.1345 0.2041 0.3674 0.467 -0.748 -0.484 -0.03555 -0.5693 0.0673 -0.4138 -0.4324 0.1228 -0.3372 -1.092 0.5312 0.1912 0.562 0.3843 -0.52 0.3228 0.278 -0.5557 -2.012 -0.611 -0.9717 0.3228 0.4573 -0.3074 -1.071 0.3423 1.058 0.5483 0.3652 0.005447 1.346 0.2886 -1.335 0.0927 0.1451 0.973 0.5347 0.8296 0.942 0.791 0.3743 -0.769 0.08466 0.1414 0.04724 -0.6704 -0.67 -0.3716 -0.3064 0.2374 -0.5615 0.5776 0.3376 0.3425 -0.5454 0.707 -0.1361 -0.9067 -0.2083 0.5713 -0.7095 -1.148 0.3613 -0.687 1.397 -0.8994 -0.3062 -0.2466 -0.7515 -0.2012 -0.895 0.1718 0.2316 -0.8193 0.3977 -0.971 0.1399 0.07477 -0.511 0.5894 -0.1606 -0.1783 0.662 -0.2632 0.05005 -1.482 -0.924 0.3362 -0.714 -0.1692 -0.326 -0.327 -0.1912 -0.1423 0.366 0.204 -0.2957 -2.316 -0.9917 -0.004444 -0.0879 -0.527 0.4985 0.1492 -0.2542 0.1207 0.873 0.02208 0.614 0.01883 -0.312 -0.1044 0.562 -0.98 0.04312 -1.127 ] y = [ 1.4395e+00 1.3562e-01 1.1787e+00 3.7793e-01 -2.8244e-02 7.2070e-01 -3.7183e-01 5.9131e-01 -1.6541e-01 -4.7510e-01 9.0918e-01 -3.4766e-01 5.6738e-01 -1.7261e-01 -6.9775e-01 9.8877e-02 -7.3389e-01 8.8623e-01 -4.7559e-01 -7.4805e-01 -1.8799e-01 -7.2852e-01 6.5332e-01 7.1240e-01 3.2837e-02 -3.8599e-01 -6.5820e-01 2.5708e-01 -2.4673e-02 1.3330e+00 -1.6553e-01 1.0186e+00 2.6155e-04 -4.3091e-01 6.5967e-01 -1.6553e-01 3.5303e-01 -1.0527e+00 1.7227e+00 4.9103e-02 -1.3440e-01 2.0239e-01 3.6621e-01 4.6655e-01 -7.5000e-01 -4.8291e-01 -3.5156e-02 -5.6934e-01 6.6956e-02 -4.1504e-01 -4.2822e-01 1.2225e-01 -3.3740e-01 -1.0889e+00 5.2832e-01 1.9019e-01 5.6299e-01 3.8477e-01 -5.2002e-01 3.2666e-01 2.7783e-01 -5.5273e-01 -2.0098e+00 -6.0938e-01 -9.7168e-01 3.2373e-01 4.5825e-01 -3.0908e-01 -1.0674e+00 3.3911e-01 1.0576e+00 5.5078e-01 3.6621e-01 5.7182e-03 1.3477e+00 2.8979e-01 -1.3311e+00 9.3628e-02 1.4294e-01 9.7412e-01 5.3467e-01 8.3105e-01 9.3945e-01 7.9297e-01 3.7622e-01 -7.6807e-01 8.6182e-02 1.4368e-01 4.7485e-02 -6.7188e-01 -6.7188e-01 -3.7207e-01 -3.0981e-01 2.3828e-01 -5.6250e-01 5.7812e-01 3.3911e-01 3.3911e-01 -5.4785e-01 7.0361e-01 -1.3867e-01 -9.0479e-01 -2.0911e-01 5.7275e-01 -7.0898e-01 -1.1514e+00 3.6255e-01 -6.8750e-01 1.3945e+00 -8.9697e-01 -3.0713e-01 -2.4561e-01 -7.5000e-01 -2.0117e-01 -8.9551e-01 1.7310e-01 2.3120e-01 -8.1836e-01 3.9722e-01 -9.7021e-01 1.4026e-01 7.4219e-02 -5.1123e-01 5.9033e-01 -1.5759e-01 -1.7664e-01 6.6162e-01 -2.6465e-01 4.9652e-02 -1.4814e+00 -9.2725e-01 3.4180e-01 -7.1484e-01 -1.6846e-01 -3.2495e-01 -3.2739e-01 -1.9189e-01 -1.4734e-01 3.6816e-01 2.0361e-01 -2.9492e-01 -2.3164e+00 -9.9365e-01 -3.6793e-03 -8.9905e-02 -5.2734e-01 4.9902e-01 1.5002e-01 -2.5439e-01 1.1914e-01 8.7207e-01 2.3773e-02 6.1377e-01 1.7288e-02 -3.0957e-01 -1.0468e-01 5.6299e-01 -9.7754e-01 4.1718e-02 -1.1279e+00] 8.292709943670728 8.291894923319504 -------------------------------------------------------------------------------- checking V_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 
0.025948193471413108 x = [-0.07227 0.1855 -0.03473 ... -0.3845 -0.05835 0.0185 ] y = [-0.07526 0.1868 -0.03607 ... -0.3909 -0.05936 0.01842] 26.553571595373274 26.56560934658646 -------------------------------------------------------------------------------- ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:32:03.830333 507326 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:32:03.830323 507197 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:32:03.830878 507197 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-1: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 284, in test_backward run_backward(ds_config, seq_len, atol=atol, verbose=True) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 243, in run_backward check_equal(base_grads, ds_grads, atol=atol, verbose=verbose) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 82, in check_equal np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i), atol=tolerance) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1531, in assert_allclose verbose=verbose, header=header, equal_nan=equal_nan) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=0.0259482 Index: 0 Mismatched elements: 1 / 25600 (0.00391%) Max absolute difference: 0.0271 Max relative difference: 232.2 x: array([-0.07227, 0.1855 , -0.03473, ..., -0.3845 , -0.05835, 0.0185 ], dtype=float16) y: array([-0.07526, 0.1868 , -0.03607, ..., -0.3909 , -0.05936, 0.01842], dtype=float16) ________ TestCUDABackward.test_backward[8-1600-128-25-3-True-True-0.05] ________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. [2023-05-27 03:32:15,332] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl layer #0 is created with date type [half]. layer #1 is created with date type [half]. layer #2 is created with date type [half]. 
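For reference, the tolerance check that fails above for V_W reduces to the numpy.testing.assert_allclose call visible in the traceback: rtol stays at numpy's default of 1e-07 and atol carries the per-tensor tolerance printed next to each "checking ..." line. A minimal standalone sketch using the first three V_W elements printed above (the full arrays have 25600 elements, of which only one exceeds the tolerance):

    import numpy as np

    # First three elements of the V_W gradient pair printed above (fp16, truncated).
    base = np.array([-0.07227, 0.1855, -0.03473], dtype=np.float16)
    ds   = np.array([-0.07526, 0.1868, -0.03607], dtype=np.float16)

    # Same call as check_equal in test_accelerator_backward.py: default rtol=1e-07,
    # per-tensor atol. These three elements pass; in the failing run one element out
    # of 25600 has an absolute difference of 0.0271, just above atol=0.0259482.
    np.testing.assert_allclose(base, ds, atol=0.0259482, err_msg="Index: 0")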
DeepSpeed Transformer config is {'layer_id': 0, 'batch_size': 8, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 25, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 1, 'batch_size': 8, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 25, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 2, 'batch_size': 8, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 25, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} hidden_state hidden_state out_B V_W out_W V_B int_B N2_B int_W norm_B N2_B norm_W N2_W out_B O_B out_W O_W int_B V_B N2_W V_W O_W norm_B int_W norm_W O_B hidden_state hidden_state out_B V_W out_W V_B int_B N2_B int_W norm_B N2_B norm_W N2_W out_B O_B out_W O_W int_B V_B N2_W V_W O_W norm_B int_W norm_W O_B hidden_state hidden_state out_B V_B out_W N2_B int_B norm_B int_W norm_W N2_B out_B N2_W out_W O_B int_B O_W N2_W V_B int_W V_W O_B norm_B V_W norm_W O_W (0, 'hidden_state') (0, 'hidden_state') (0, 'out_B') (0, 'V_W') (0, 'out_W') (0, 'V_B') (0, 'int_B') (0, 'N2_B') (0, 'int_W') (0, 'norm_B') (0, 'N2_B') (0, 'norm_W') (0, 'N2_W') (0, 'out_B') (0, 'O_B') (0, 'out_W') (0, 'O_W') (0, 'int_B') (0, 'V_B') (0, 'N2_W') (0, 'V_W') (0, 'O_W') (0, 'norm_B') (0, 'int_W') (0, 'norm_W') (0, 'O_B') (1, 'hidden_state') (1, 'hidden_state') (1, 'out_B') (1, 'V_W') (1, 'out_W') (1, 'V_B') (1, 'int_B') (1, 'N2_B') (1, 'int_W') (1, 'norm_B') (1, 'N2_B') (1, 'norm_W') (1, 'N2_W') (1, 'out_B') (1, 'O_B') (1, 'out_W') (1, 'O_W') (1, 'int_B') (1, 'V_B') (1, 'N2_W') (1, 'V_W') (1, 'O_W') (1, 'norm_B') (1, 'int_W') (1, 'norm_W') (1, 'O_B') (2, 'hidden_state') (2, 'hidden_state') (2, 'out_B') (2, 'V_B') (2, 'out_W') (2, 'N2_B') (2, 'int_B') (2, 'norm_B') (2, 'int_W') (2, 'norm_W') (2, 'N2_B') (2, 'out_B') (2, 'N2_W') (2, 'out_W') (2, 'O_B') (2, 'int_B') (2, 'O_W') (2, 'N2_W') (2, 'V_B') (2, 'int_W') (2, 'V_W') (2, 'O_B') (2, 'norm_B') (2, 'V_W') (2, 'norm_W') (2, 'O_W') checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.000783791016887335 x = [-0.011 0.00818 0.02704 ... 0.03043 0.012245 -0.008644] y = [-0.010994 0.00819 0.02707 ... 
0.03041 0.01224 -0.00864 ] 25.15958084640982 25.15932676072564 -------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.024776211731135846 x = [ 0.3064 -0.4238 0.889 ... -0.04202 -0.7153 0.1147 ] y = [ 0.3066 -0.4238 0.8896 ... -0.0414 -0.7163 0.115 ] 24.857993948643553 24.857537714379134 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.012196058948566207 x = [ 0.008354 0.0524 0.167 ... -0.2688 -0.01044 0.0944 ] y = [ 0.009445 0.0485 0.166 ... -0.27 -0.01367 0.0955 ] 506.04771436909897 506.0263123910311 -------------------------------------------------------------------------------- ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:32:15.334950 507604 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:32:15.334944 507475 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:32:15.335264 507475 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-2: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 284, in test_backward run_backward(ds_config, seq_len, atol=atol, verbose=True) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 243, in run_backward check_equal(base_grads, ds_grads, atol=atol, verbose=verbose) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 82, in check_equal np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i), atol=tolerance) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1531, in assert_allclose verbose=verbose, header=header, equal_nan=equal_nan) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=0.0121961 Index: 0 Mismatched elements: 2833 / 2560000 (0.111%) Max absolute difference: 0.02832 Max relative difference: 8248. 
x: array([ 0.008354, 0.0524 , 0.167 , ..., -0.2688 , -0.01044 , 0.0944 ], dtype=float16) y: array([ 0.009445, 0.0485 , 0.166 , ..., -0.27 , -0.01367 , 0.0955 ], dtype=float16) ________ TestCUDABackward.test_backward[8-1600-128-2-3-True-True-0.05] _________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. [2023-05-27 03:32:34,708] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl layer #0 is created with date type [half]. layer #1 is created with date type [half]. layer #2 is created with date type [half]. DeepSpeed Transformer config is {'layer_id': 0, 'batch_size': 8, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 1, 'batch_size': 8, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 2, 'batch_size': 8, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} hidden_state hidden_state out_B V_W out_W V_B int_B N2_B int_W norm_B N2_B norm_W N2_W out_B O_B out_W O_W int_B V_B N2_W V_W O_W norm_B int_W norm_W O_B hidden_state hidden_state out_B V_W out_W V_B int_B N2_B int_W norm_B N2_B norm_W N2_W out_B O_B out_W O_W int_B V_B N2_W V_W O_W norm_B int_W norm_W O_B hidden_state hidden_state out_B V_B out_W N2_B int_B norm_B int_W norm_W N2_B out_B N2_W out_W O_B int_B O_W N2_W V_B int_W V_W O_B norm_B V_W norm_W O_W (0, 'hidden_state') (0, 'hidden_state') (0, 'out_B') (0, 'V_W') (0, 'out_W') (0, 'V_B') (0, 'int_B') (0, 'N2_B') (0, 'int_W') (0, 'norm_B') (0, 'N2_B') (0, 'norm_W') (0, 'N2_W') (0, 'out_B') (0, 'O_B') (0, 'out_W') (0, 'O_W') (0, 'int_B') (0, 'V_B') (0, 'N2_W') (0, 'V_W') (0, 'O_W') (0, 'norm_B') (0, 'int_W') (0, 'norm_W') (0, 'O_B') (1, 'hidden_state') (1, 'hidden_state') (1, 'out_B') (1, 'V_W') (1, 'out_W') (1, 'V_B') (1, 'int_B') (1, 'N2_B') (1, 'int_W') (1, 'norm_B') (1, 'N2_B') (1, 'norm_W') (1, 'N2_W') (1, 'out_B') (1, 'O_B') (1, 'out_W') (1, 'O_W') (1, 'int_B') (1, 'V_B') (1, 'N2_W') (1, 'V_W') (1, 'O_W') (1, 
'norm_B') (1, 'int_W') (1, 'norm_W') (1, 'O_B') (2, 'hidden_state') (2, 'hidden_state') (2, 'out_B') (2, 'V_B') (2, 'out_W') (2, 'N2_B') (2, 'int_B') (2, 'norm_B') (2, 'int_W') (2, 'norm_W') (2, 'N2_B') (2, 'out_B') (2, 'N2_W') (2, 'out_W') (2, 'O_B') (2, 'int_B') (2, 'O_W') (2, 'N2_W') (2, 'V_B') (2, 'int_W') (2, 'V_W') (2, 'O_B') (2, 'norm_B') (2, 'V_W') (2, 'norm_W') (2, 'O_W') checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.000783190738875419 x = [-0.01097 0.00828 0.02718 ... 0.03038 0.0121 -0.00865] y = [-0.01096 0.00828 0.02716 ... 0.03038 0.0121 -0.00865] 25.141463138878077 25.140946688218005 -------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.024754941791296007 x = [ 0.3125 -0.4236 0.896 ... -0.03555 -0.7163 0.111 ] y = [ 0.3127 -0.4236 0.8955 ... -0.0366 -0.716 0.11066] 24.83937926513994 24.838126981502835 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.012197172586770031 x = [-0.00677 0.02844 0.1903 ... -0.2766 -0.0748 0.06195] y = [-0.00771 0.02774 0.1925 ... -0.2769 -0.0718 0.06082] 506.0907217639851 506.06327243912375 -------------------------------------------------------------------------------- ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:32:34.710397 508031 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:32:34.710415 508160 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:32:34.710745 508031 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
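The ProcessGroupNCCL notice that ends the captured stderr above can be avoided by telling the barrier which GPU each rank owns. A minimal sketch, assuming a machine with one CUDA GPU and hypothetical master address/port values; in these single-GPU test workers the local rank is 0:

    import os
    import torch
    import torch.distributed as dist

    # Minimal single-process NCCL setup (hypothetical address/port) to make the call runnable.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=0, world_size=1)

    local_rank = 0
    torch.cuda.set_device(local_rank)

    # Passing device_ids pins the barrier to this rank's GPU, so NCCL does not have to
    # guess the rank-to-GPU mapping that the watchdog message above warns about.
    dist.barrier(device_ids=[local_rank])

    dist.destroy_process_group()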
Process Process-4: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 284, in test_backward run_backward(ds_config, seq_len, atol=atol, verbose=True) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 243, in run_backward check_equal(base_grads, ds_grads, atol=atol, verbose=verbose) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 82, in check_equal np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i), atol=tolerance) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1531, in assert_allclose verbose=verbose, header=header, equal_nan=equal_nan) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=0.0121972 Index: 0 Mismatched elements: 2770 / 2560000 (0.108%) Max absolute difference: 0.0293 Max relative difference: 6076. x: array([-0.00677, 0.02844, 0.1903 , ..., -0.2766 , -0.0748 , 0.06195], dtype=float16) y: array([-0.00771, 0.02774, 0.1925 , ..., -0.2769 , -0.0718 , 0.06082], dtype=float16) ________ TestCUDABackward.test_backward[64-1600-128-2-4-False-True-0.2] ________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. [2023-05-27 03:32:45,419] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl layer #0 is created with date type [half]. layer #1 is created with date type [half]. layer #2 is created with date type [half]. layer #3 is created with date type [half]. 
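The "DeepSpeed Transformer config is {...}" dicts that follow are the printed attributes of the config object each test builds before constructing its transformer layers. A minimal sketch of the same construction, assuming the DeepSpeedTransformerConfig class from deepspeed.ops.transformer and keyword names that mirror the printed fields (unlisted fields keep their defaults):

    from deepspeed.ops.transformer import DeepSpeedTransformerConfig

    # Mirrors the [64-1600-128-2-4-False-True-0.2] parametrization printed below.
    config = DeepSpeedTransformerConfig(
        batch_size=64,
        hidden_size=1600,
        intermediate_size=1600,
        heads=2,
        attn_dropout_ratio=0.0,
        hidden_dropout_ratio=0.0,
        num_hidden_layers=4,
        initializer_range=0.02,
        fp16=True,
        pre_layer_norm=False,   # the False in this parametrization selects post-LayerNorm
    )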
DeepSpeed Transformer config is {'layer_id': 0, 'batch_size': 64, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 4, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 1, 'batch_size': 64, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 4, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 2, 'batch_size': 64, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 4, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 3, 'batch_size': 64, 'hidden_size': 1600, 'intermediate_size': 1600, 'heads': 2, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 4, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} hidden_state hidden_state norm_B V_W norm_W V_B out_B N2_B out_W norm_B int_B norm_W int_W out_B N2_B out_W N2_W int_B O_B N2_W O_W O_W V_B int_W V_W O_B hidden_state hidden_state norm_B V_W norm_W V_B out_B N2_B out_W norm_B int_B norm_W int_W out_B N2_B out_W N2_W int_B O_B N2_W O_W O_W V_B int_W V_W O_B hidden_state hidden_state norm_B V_W norm_W V_B out_B N2_B out_W norm_B int_B norm_W int_W out_B N2_B out_W N2_W int_B O_B N2_W O_W O_W V_B int_W V_W O_B hidden_state hidden_state norm_B V_B norm_W N2_B out_B norm_B out_W norm_W int_B out_B int_W out_W N2_B int_B N2_W N2_W O_B int_W O_W O_B V_B V_W V_W O_W (0, 'hidden_state') (0, 'hidden_state') (0, 'norm_B') (0, 'V_W') (0, 'norm_W') (0, 'V_B') (0, 'out_B') (0, 'N2_B') (0, 'out_W') (0, 'norm_B') (0, 'int_B') (0, 'norm_W') (0, 'int_W') (0, 'out_B') (0, 'N2_B') (0, 'out_W') (0, 'N2_W') (0, 'int_B') (0, 'O_B') (0, 'N2_W') (0, 'O_W') (0, 'O_W') (0, 'V_B') (0, 'int_W') (0, 'V_W') (0, 'O_B') (1, 'hidden_state') (1, 'hidden_state') (1, 'norm_B') (1, 'V_W') (1, 'norm_W') (1, 'V_B') (1, 'out_B') (1, 'N2_B') (1, 'out_W') (1, 'norm_B') (1, 'int_B') (1, 'norm_W') (1, 'int_W') (1, 'out_B') (1, 'N2_B') (1, 'out_W') (1, 'N2_W') (1, 'int_B') (1, 'O_B') (1, 'N2_W') (1, 'O_W') (1, 'O_W') (1, 'V_B') (1, 'int_W') (1, 'V_W') (1, 'O_B') (2, 'hidden_state') (2, 
'hidden_state') (2, 'norm_B') (2, 'V_W') (2, 'norm_W') (2, 'V_B') (2, 'out_B') (2, 'N2_B') (2, 'out_W') (2, 'norm_B') (2, 'int_B') (2, 'norm_W') (2, 'int_W') (2, 'out_B') (2, 'N2_B') (2, 'out_W') (2, 'N2_W') (2, 'int_B') (2, 'O_B') (2, 'N2_W') (2, 'O_W') (2, 'O_W') (2, 'V_B') (2, 'int_W') (2, 'V_W') (2, 'O_B') (3, 'hidden_state') (3, 'hidden_state') (3, 'norm_B') (3, 'V_B') (3, 'norm_W') (3, 'N2_B') (3, 'out_B') (3, 'norm_B') (3, 'out_W') (3, 'norm_W') (3, 'int_B') (3, 'out_B') (3, 'int_W') (3, 'out_W') (3, 'N2_B') (3, 'int_B') (3, 'N2_W') (3, 'N2_W') (3, 'O_B') (3, 'int_W') (3, 'O_W') (3, 'O_B') (3, 'V_B') (3, 'V_W') (3, 'V_W') (3, 'O_W') checking hidden_state : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.007053470332695724 x = [-0.033 -0.06775 -0.0403 ... -0.03784 0.0542 0.004234] y = [-0.03326 -0.0679 -0.04016 ... -0.0382 0.05423 0.004383] 160.00163143285042 160.00067853389154 -------------------------------------------------------------------------------- checking norm_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 20.93143768310547 x = [-361.2 -172.1 102.44 ... -177.2 24.77 122.5 ] y = [-361. -172.1 102.44 ... -177.2 24.84 122.5 ] 5226.907391443802 5225.958379654771 -------------------------------------------------------------------------------- checking norm_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 51.19789062500001 x = [690.5 306.8 234.9 ... 309.8 194.1 249.9] y = [690. 306.8 235. ... 309.8 194.2 249.9] 10922.390728882345 10921.50617331465 -------------------------------------------------------------------------------- checking out_B : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.34591917777061465 x = [-2.523 0.689 -0.883 ... -4.66 -0.506 -2.338] y = [-2.525 0.6904 -0.8857 ... -4.656 -0.506 -2.338 ] 86.89474827757213 86.89665554886183 -------------------------------------------------------------------------------- checking out_W : tensor([], device='cuda:0', dtype=torch.int64) tensor([], device='cuda:0', dtype=torch.int64) tolerance is 0.1648080616052775 x = [-0.581 -0.01941 -1.994 ... -0.3518 -0.3577 0.1224 ] y = [-0.578 -0.01047 -2.016 ... -0.3584 -0.3677 0.1184 ] 1790.757210015001 1790.7617294409267 -------------------------------------------------------------------------------- ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:32:45.421241 508438 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:32:45.421236 508309 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:32:45.421564 508309 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-5:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init
    raise e
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init
    self.run(**self._fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run
    self._current_test(**fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 284, in test_backward
    run_backward(ds_config, seq_len, atol=atol, verbose=True)
  File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 243, in run_backward
    check_equal(base_grads, ds_grads, atol=atol, verbose=verbose)
  File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_backward.py", line 82, in check_equal
    np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i), atol=tolerance)
  File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1531, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0.164808
Index: 0
Mismatched elements: 132 / 2560000 (0.00516%)
Max absolute difference: 0.2695
Max relative difference: 64288.
 x: array([-0.581 , -0.01941, -1.994 , ..., -0.3518 , -0.3577 , 0.1224 ], dtype=float16)
 y: array([-0.578 , -0.01047, -2.016 , ..., -0.3584 , -0.3677 , 0.1184 ], dtype=float16)
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
  /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
    from collections import deque

unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2]
  /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
    "Running test without verifying torch version, please provide an expected torch version with --torch_ver")

unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2]
  /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
    "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
16.83s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-1600-128-2-4-False-True-0.2]
11.99s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2]
10.72s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-2-3-True-True-0.05]
10.53s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-25-3-True-True-0.05]
8.82s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-160-128-2-3-True-True-0.1]
(10 durations < 1s hidden. Use -vv to show these durations.)
=========================== short test summary info ============================
FAILED unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2]
FAILED unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-25-3-True-True-0.05]
FAILED unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-2-3-True-True-0.05]
FAILED unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-1600-128-2-4-False-True-0.2]
=================== 4 failed, 1 passed, 3 warnings in 59.97s ===================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=1186562750
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ...
collected 40 items / 35 deselected / 5 selected unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardStochastic::test_forward_stochastic[batch_size0-hidden_size0-seq_len0-heads0-num_layers0-is_preln0-use_fp160] SKIPPED [ 20%] unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-False-False] PASSED [ 40%] unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-True-True] FAILED [ 60%] unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-True-False] PASSED [ 80%] unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-False-True] FAILED [100%] =================================== FAILURES =================================== _ TestCUDAForwardSmallBatchSize.test_forward_with_small_bsz[8-7-1024-512-16-3-True-True] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. [2023-05-27 03:33:15,311] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl layer #0 is created with date type [half]. layer #1 is created with date type [half]. layer #2 is created with date type [half]. DeepSpeed Transformer config is {'layer_id': 0, 'batch_size': 8, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 1, 'batch_size': 8, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 2, 'batch_size': 8, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:33:15.314390 509034 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:33:15.314390 508905 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:33:15.314916 508905 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-2: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_forward.py", line 278, in test_forward_with_small_bsz run_forward(ds_config, seq_len, atol=3e-2, test_bsz=small_bsz) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_forward.py", line 181, in run_forward check_equal(base_results, ds_results, atol=atol, verbose=verbose) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_forward.py", line 29, in check_equal np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i), atol=atol) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1531, in assert_allclose verbose=verbose, header=header, equal_nan=equal_nan) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=0.03 Index: 0 Mismatched elements: 244 / 524288 (0.0465%) Max absolute difference: 0.05078 Max relative difference: 725. x: array([[-0.517 , -0.2764 , -0.432 , ..., -0.323 , -1.129 , 0.8687 ], [ 0.521 , -1.953 , 0.7583 , ..., 0.12384, -0.6025 , 0.03528], [-1.25 , -0.7314 , 2.188 , ..., 1.869 , 0.5596 , -1.695 ],... y: array([[-0.53 , -0.2673 , -0.4304 , ..., -0.3267 , -1.132 , 0.8594 ], [ 0.5176 , -1.9375 , 0.765 , ..., 0.1267 , -0.5977 , 0.0337 ], [-1.255 , -0.7354 , 2.18 , ..., 1.86 , 0.567 , -1.696 ],... _ TestCUDAForwardSmallBatchSize.test_forward_with_small_bsz[8-7-1024-512-16-3-False-True] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex. [2023-05-27 03:33:32,771] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl layer #0 is created with date type [half]. layer #1 is created with date type [half]. layer #2 is created with date type [half]. 
DeepSpeed Transformer config is {'layer_id': 0, 'batch_size': 8, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 1, 'batch_size': 8, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} DeepSpeed Transformer config is {'layer_id': 2, 'batch_size': 8, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 3, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': False, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'layer_norm_eps': 1e-12, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': False, 'return_tuple': False} ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:33:32.775861 509448 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:33:32.775852 509319 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:33:32.776753 509319 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-4: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_forward.py", line 278, in test_forward_with_small_bsz run_forward(ds_config, seq_len, atol=3e-2, test_bsz=small_bsz) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_forward.py", line 181, in run_forward check_equal(base_results, ds_results, atol=atol, verbose=verbose) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/accelerators/test_accelerator_forward.py", line 29, in check_equal np.testing.assert_allclose(x, y, err_msg="Index: {}".format(i), atol=atol) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1531, in assert_allclose verbose=verbose, header=header, equal_nan=equal_nan) File "/usr/local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=0.03 Index: 0 Mismatched elements: 373 / 524288 (0.0711%) Max absolute difference: 0.0586 Max relative difference: 399.5 x: array([[-0.5894 , -0.2544 , -0.294 , ..., -0.2957 , -1.12 , 0.852 ], [ 0.5103 , -1.8955 , 0.6772 , ..., 0.1709 , -0.519 , -0.1586 ], [-1.063 , -0.699 , 2.176 , ..., 1.813 , 0.5283 , -1.787 ],... y: array([[-0.597 , -0.246 , -0.304 , ..., -0.2893 , -1.119 , 0.849 ], [ 0.5103 , -1.894 , 0.6826 , ..., 0.1744 , -0.511 , -0.1643 ], [-1.0625 , -0.7075 , 2.168 , ..., 1.816 , 0.5264 , -1.794 ],... =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-False-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-False-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 9.31s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-False-False] 8.92s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-False-True] 8.72s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-True-False] 8.72s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-True-True] (10 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-True-True] FAILED unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-False-True] ====== 2 failed, 2 passed, 1 skipped, 35 deselected, 3 warnings in 36.72s ====== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=313298420 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
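The TestCUDABackward failures earlier and the two TestCUDAForwardSmallBatchSize failures above are numerical-tolerance mismatches rather than crashes: only the fp16 parametrizations fail, and in the forward case roughly 0.05-0.07% of the 524288 compared elements exceed the test's atol of 3e-2, with a maximum absolute difference of about 0.059 against the PyTorch baseline. A minimal sketch of the comparison pattern visible in the traceback (check_equal delegating to np.testing.assert_allclose with an absolute tolerance), using synthetic data as a stand-in for the kernels' real outputs:

    # Sketch of the tolerance check shown in the traceback; the arrays are
    # synthetic stand-ins for the fp32 baseline and the fp16 kernel output.
    import numpy as np

    rng = np.random.default_rng(0)
    baseline = rng.standard_normal((1024, 512)).astype(np.float32)
    candidate = baseline.astype(np.float16).astype(np.float32)  # fp16 rounding
    # Inject a few larger deviations (~0.05) to mimic the mismatch counts above.
    idx = rng.choice(baseline.size, size=250, replace=False)
    candidate.ravel()[idx] += 0.05

    atol = 3e-2
    abs_diff = np.abs(baseline - candidate)
    print(f"mismatched: {(abs_diff > atol).sum()} / {abs_diff.size}, "
          f"max abs diff: {abs_diff.max():.4g}")

    try:
        np.testing.assert_allclose(candidate, baseline, atol=atol, err_msg="Index: 0")
    except AssertionError as err:
        print(err)  # same style of report as in the log above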
collected 11 items unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagradGPUError::test_cpu_adagrad_gpu_error FAILED [ 9%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[32-64-16] FAILED [ 18%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[22] FAILED [ 27%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[4096-262144-16] FAILED [ 36%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[127] FAILED [ 45%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[512-4096-16] FAILED [ 54%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[1048576] FAILED [ 63%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[30000000] FAILED [ 72%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[1024] FAILED [ 81%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[64] FAILED [ 90%] unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[55] FAILED [100%] =================================== FAILURES =================================== ______________ TestCPUAdagradGPUError.test_cpu_adagrad_gpu_error _______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:33:45,842] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:33:46.748474 509638 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:33:46.748492 509893 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:33:46.753865 509895 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:33:46.753861 509637 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:33:46.754478 509637 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:33:46.759402 509638 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-1: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 121, in test_cpu_adagrad_gpu_error optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' Process Process-2: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 121, in test_cpu_adagrad_gpu_error optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most 
recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' ________ TestCPUAdagrad.test_cpu_adagrad_opt_sparse_embedding[32-64-16] ________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:33:51,059] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:33:51.061327 510050 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:33:51.061318 509921 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:33:51.061908 509921 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-3: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 101, in test_cpu_adagrad_opt_sparse_embedding optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' ___________________ TestCPUAdagrad.test_cpu_adagrad_opt[22] ____________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:33:55,296] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 
----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:33:55.298342 510193 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:33:55.298342 510064 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:33:55.298848 510064 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-4: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 55, in test_cpu_adagrad_opt optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' _____ TestCPUAdagrad.test_cpu_adagrad_opt_sparse_embedding[4096-262144-16] _____ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:33:59,482] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:33:59.484747 510336 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:33:59.484728 510207 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:33:59.485163 510207 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. Process Process-5: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 101, in test_cpu_adagrad_opt_sparse_embedding optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' ___________________ TestCPUAdagrad.test_cpu_adagrad_opt[127] ___________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:34:03,835] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:34:03.837939 510479 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:34:03.837936 510350 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:34:03.838388 510350 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-6: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 55, in test_cpu_adagrad_opt optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' ______ TestCPUAdagrad.test_cpu_adagrad_opt_sparse_embedding[512-4096-16] _______ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:34:08,034] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:34:08.036928 510622 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:34:08.036928 510493 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:34:08.037411 510493 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-7: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 101, in test_cpu_adagrad_opt_sparse_embedding optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' _________________ TestCPUAdagrad.test_cpu_adagrad_opt[1048576] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:34:12,355] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:34:12.357167 510765 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:34:12.357151 510636 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:34:12.357631 510636 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-8: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 55, in test_cpu_adagrad_opt optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' ________________ TestCPUAdagrad.test_cpu_adagrad_opt[30000000] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:34:16,526] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:34:16.527958 510779 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:34:16.527971 510908 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:34:16.528316 510779 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-9: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 55, in test_cpu_adagrad_opt optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' __________________ TestCPUAdagrad.test_cpu_adagrad_opt[1024] ___________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:34:21,399] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:34:21.401851 511051 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:34:21.401851 510922 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:34:21.402338 510922 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-10: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 55, in test_cpu_adagrad_opt optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' ___________________ TestCPUAdagrad.test_cpu_adagrad_opt[64] ____________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:34:25,506] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:34:25.507804 511194 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:34:25.507797 511065 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:34:25.508117 511065 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-11: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 55, in test_cpu_adagrad_opt optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' ___________________ TestCPUAdagrad.test_cpu_adagrad_opt[55] ____________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:34:29,739] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:34:29.741122 511208 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:34:29.741135 511337 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:34:29.741436 511208 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
Process Process-12: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/ops/adagrad/test_cpu_adagrad.py", line 55, in test_cpu_adagrad_opt optimizer = DeepSpeedCPUAdagrad([param]) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 22, in __init__ self.ds_opt_adagrad = CPUAdagradBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/cpu_adagrad.py", line 29, in __del__ self.ds_opt_adagrad.destroy_adagrad(self.opt_id) AttributeError: 'DeepSpeedCPUAdagrad' object has no attribute 'ds_opt_adagrad' =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagradGPUError::test_cpu_adagrad_gpu_error /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagradGPUError::test_cpu_adagrad_gpu_error /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.42s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagradGPUError::test_cpu_adagrad_gpu_error 4.81s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[30000000] 4.31s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[512-4096-16] 4.31s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[4096-262144-16] 4.21s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[32-64-16] 4.21s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[64] 4.21s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[1048576] 4.21s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[127] 4.21s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[55] 4.21s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[1024] 4.21s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[22] (22 durations < 1s hidden. Use -vv to show these durations.) 
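All eleven TestCPUAdagrad / TestCPUAdagradGPUError failures in this session share one root cause: the prebuilt cpu_adagrad_op extension fails to import because the symbol __kmpc_for_static_fini is unresolved. That symbol belongs to the Intel/LLVM OpenMP runtime (libiomp5/libomp), not to GNU libgomp, so the extension was evidently built with that OpenMP toolchain while no matching runtime is loaded at import time; the follow-on AttributeError in __del__ is only a consequence of __init__ never completing. A small diagnostic sketch (the .so path is copied from the traceback; preloading an OpenMP runtime this way is a workaround assumption, not a documented DeepSpeed fix):

    # Reproduce the import failure in isolation and check whether preloading an
    # Intel/LLVM OpenMP runtime resolves __kmpc_for_static_fini.
    import ctypes

    OP_SO = ("/usr/local/lib/python3.7/site-packages/deepspeed/ops/adagrad/"
             "cpu_adagrad_op.cpython-37m-x86_64-linux-gnu.so")

    def try_load(preload=None):
        try:
            if preload:
                # RTLD_GLOBAL makes the runtime's symbols visible to the op library.
                ctypes.CDLL(preload, mode=ctypes.RTLD_GLOBAL)
            ctypes.CDLL(OP_SO)
            return "loaded OK"
        except OSError as e:
            return f"failed: {e}"

    print("plain load        :", try_load())
    # Candidate runtime names; which one (if any) exists depends on the machine.
    for lib in ("libiomp5.so", "libomp.so", "libomp.so.5"):
        print(f"with {lib:12s}  :", try_load(preload=lib))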
=========================== short test summary info ============================ FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagradGPUError::test_cpu_adagrad_gpu_error FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[32-64-16] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[22] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[4096-262144-16] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[127] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding[512-4096-16] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[1048576] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[30000000] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[1024] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[64] FAILED unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt[55] ======================= 11 failed, 3 warnings in 49.81s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=4010334153 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 26 items unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype1-sdd-False-True] SKIPPED [ 3%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[32-dtype14-dds-False-False] SKIPPED [ 7%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype5-dsd-True-False] SKIPPED [ 11%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype9-sdd-False-False] SKIPPED [ 15%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype2-dds-False-False] SKIPPED [ 19%] unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype1-256-16] SKIPPED [ 23%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[32-dtype12-sdd-False-False] SKIPPED [ 26%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype4-dsd-False-False] SKIPPED [ 30%] unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype0-256-16] SKIPPED [ 34%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype11-dds-False-False] SKIPPED [ 38%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype10-dsd-False-False] SKIPPED [ 42%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype6-sdd-False-False] SKIPPED [ 46%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[64-dtype15-sdd-False-False] SKIPPED [ 50%] unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype1-576-32] SKIPPED [ 53%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype8-dds-False-False] SKIPPED [ 57%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[32-dtype13-dsd-False-False] SKIPPED [ 61%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype0-sdd-False-False] SKIPPED [ 65%] unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype1-256-32] SKIPPED [ 69%] unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype0-576-16] SKIPPED [ 73%] 
unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype1-576-16] SKIPPED [ 76%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[64-dtype16-dsd-False-False] SKIPPED [ 80%] unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype0-576-32] SKIPPED [ 84%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype7-dsd-False-False] SKIPPED [ 88%] unit/ops/sparse_attention/test_sparse_attention.py::test_softmax[dtype0-256-32] SKIPPED [ 92%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[64-dtype17-dds-False-False] SKIPPED [ 96%] unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype3-dds-False-True] SKIPPED [100%] =============================== warnings summary =============================== unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype1-sdd-False-True] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/ops/sparse_attention/test_sparse_attention.py::test_matmul[16-dtype1-sdd-False-True] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (78 durations < 1s hidden. Use -vv to show these durations.) ======================= 26 skipped, 2 warnings in 1.34s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1845239665 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 0 items / 1 skipped =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ======================== 1 skipped, 1 warning in 1.55s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3712586262 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
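The 26 sparse-attention tests in the session above are all skipped, but at this verbosity the skip reason is not printed (commonly an unsupported GPU/compiler setup or a missing optional dependency such as triton for this op; that is an assumption, since the reason does not appear in the log). Re-running with skip reporting enabled shows it. A minimal sketch using pytest's programmatic entry point:

    # Re-run just the sparse-attention module with `-rs` so the reason for each
    # skipped test is printed in the short test summary (same as adding -rs on
    # the command line).
    import pytest

    pytest.main([
        "-rs",                # report the reason for every skipped test
        "-p", "no:randomly",  # optional: disable test reordering for readability
        "unit/ops/sparse_attention/test_sparse_attention.py",
    ])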
collected 6 items unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_some_overflow PASSED [ 16%] unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_all_overflow PASSED [ 33%] unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_no_overflow PASSED [ 50%] unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_no_overflow PASSED [ 66%] unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_all_overflow PASSED [ 83%] unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_some_overflow PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_some_overflow /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_some_overflow /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.14s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_some_overflow 4.91s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_all_overflow 4.91s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_no_overflow 4.91s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_some_overflow 4.91s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_all_overflow 4.91s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_no_overflow (12 durations < 1s hidden. Use -vv to show these durations.) ======================== 6 passed, 3 warnings in 30.62s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2286419646 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
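The TestFused/TestUnfused dynamic-loss-scale cases above all pass; they exercise the standard fp16 dynamic loss scaling scheme, where an overflowing step is skipped and the scale reduced, and the scale grows again after a window of overflow-free steps. A generic sketch of that policy, assuming the usual halve-on-overflow / grow-after-N-clean-steps rule (not DeepSpeed's actual LossScaler code):

    # Generic dynamic loss scaling policy: skip the step and halve the scale on
    # overflow; double the scale after `scale_window` consecutive clean steps.
    class DynamicLossScaler:
        def __init__(self, init_scale=2.0**16, scale_factor=2.0,
                     scale_window=1000, min_scale=1.0):
            self.scale = init_scale
            self.scale_factor = scale_factor
            self.scale_window = scale_window
            self.min_scale = min_scale
            self._good_steps = 0

        def update(self, overflow: bool) -> bool:
            """Return True if the optimizer step should be applied."""
            if overflow:
                self.scale = max(self.scale / self.scale_factor, self.min_scale)
                self._good_steps = 0
                return False  # skip the step whose gradients overflowed
            self._good_steps += 1
            if self._good_steps % self.scale_window == 0:
                self.scale *= self.scale_factor
            return True

    # Mirrors the "some overflow" scenario named in the tests above.
    scaler = DynamicLossScaler(init_scale=256.0, scale_window=2)
    for overflow in (False, True, False, False, False):
        applied = scaler.update(overflow)
        print(f"overflow={overflow!s:5}  applied={applied!s:5}  scale={scaler.scale}")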
collected 16 items unit/runtime/half_precision/test_bf16.py::TestZeroAllowUntestedOptimizer::test SKIPPED [ 6%] unit/runtime/half_precision/test_bf16.py::TestZeroEmptyPartition::test SKIPPED [ 12%] unit/runtime/half_precision/test_bf16.py::TestZero2ReduceScatterOff::test SKIPPED [ 18%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[default-fp32] PASSED [ 25%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[bfp16-fp16] SKIPPED [ 31%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[bfp16-fp32] SKIPPED [ 37%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[fp16-fp32] PASSED [ 43%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[default-fp16] PASSED [ 50%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[fp16-fp16] PASSED [ 56%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[default-bfp16] SKIPPED [ 62%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[fp16-bfp16] SKIPPED [ 68%] unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[bfp16-bfp16] SKIPPED [ 75%] unit/runtime/half_precision/test_bf16.py::TestAdamBF16ZeroOneCycleCompatibility::test SKIPPED [ 81%] unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[FusedAdam] SKIPPED [ 87%] unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[Adam] SKIPPED [ 93%] unit/runtime/half_precision/test_bf16.py::TestZeroEmptyGrad::test SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/runtime/half_precision/test_bf16.py::TestZeroAllowUntestedOptimizer::test /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/half_precision/test_bf16.py::TestZeroAllowUntestedOptimizer::test /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 8.42s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[fp16-fp16] 8.02s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[default-fp16] 7.92s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[default-fp32] 7.72s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[fp16-fp32] 4.81s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[fp16-bfp16] 4.71s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[default-bfp16] 4.61s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[bfp16-fp16] 4.07s call unit/runtime/half_precision/test_bf16.py::TestZeroAllowUntestedOptimizer::test 3.91s call unit/runtime/half_precision/test_bf16.py::TestZeroEmptyPartition::test 3.81s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[bfp16-fp32] 3.81s call unit/runtime/half_precision/test_bf16.py::TestZeroDtypeCocktail::test[bfp16-bfp16] 3.81s call unit/runtime/half_precision/test_bf16.py::TestZero2ReduceScatterOff::test 3.61s call unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[Adam] 3.61s call unit/runtime/half_precision/test_bf16.py::TestAdamBF16ZeroOneCycleCompatibility::test 3.61s call unit/runtime/half_precision/test_bf16.py::TestZeroEmptyGrad::test 3.51s call unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[FusedAdam] (32 durations < 1s hidden. Use -vv to show these durations.) ============= 4 passed, 12 skipped, 3 warnings in 81.00s (0:01:21) ============= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3358828303 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
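In the bf16 session above every bfp16 parametrization is skipped while the fp16 and fp32 ones run, which usually means the GPU or the installed torch build does not support bfloat16. A quick capability probe (assuming a torch release new enough to provide is_bf16_supported, roughly 1.10+):

    import torch

    # Probe whether this environment can run bfloat16 CUDA kernels.
    print("CUDA available:    ", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:            ", torch.cuda.get_device_name(0))
        print("Compute capability:", torch.cuda.get_device_capability(0))
        print("bf16 supported:    ", torch.cuda.is_bf16_supported())

bf16 CUDA kernels generally require an Ampere-class device (compute capability 8.0 or newer) and a CUDA 11 build of torch, so on older hardware these skips are expected.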
collected 59 items unit/runtime/half_precision/test_fp16.py::TestLambFP32GradClip::test PASSED [ 1%] unit/runtime/half_precision/test_fp16.py::TestLambFP16::test__basic PASSED [ 3%] unit/runtime/half_precision/test_fp16.py::TestLambFP16::test_empty_grad PASSED [ 5%] unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-2] PASSED [ 6%] unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-3] FAILED [ 8%] unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-3] PASSED [ 10%] unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-2] FAILED [ 11%] unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-1] FAILED [ 13%] unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-1] PASSED [ 15%] unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-Adam] PASSED [ 16%] unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-AdamW] PASSED [ 18%] unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-Adam] PASSED [ 20%] unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-AdamW] PASSED [ 22%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-False-2] PASSED [ 23%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-2] FAILED [ 25%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-3] FAILED [ 27%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-False-1] PASSED [ 28%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-2] FAILED [ 30%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-False-2] PASSED [ 32%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-False-3] PASSED [ 33%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-3] FAILED [ 35%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-False-3] PASSED [ 37%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-False-1] PASSED [ 38%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-1] FAILED [ 40%] unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-1] FAILED [ 42%] unit/runtime/half_precision/test_fp16.py::TestAdamwFP16EmptyGrad::test PASSED [ 44%] unit/runtime/half_precision/test_fp16.py::TestZero3LazyScatter::test PASSED [ 45%] unit/runtime/half_precision/test_fp16.py::TestAdamFP32EmptyGrad::test PASSED [ 47%] unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-3] PASSED [ 49%] unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-2] PASSED [ 50%] unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-3] PASSED [ 52%] unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-1] PASSED [ 54%] unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-2] PASSED [ 55%] unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-1] PASSED [ 57%] unit/runtime/half_precision/test_fp16.py::TestZero2ReduceScatterOff::test PASSED [ 59%] unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_fused_gradnorm PASSED [ 61%] unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_lamb_gradnorm[False] PASSED [ 
62%] unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_unfused_gradnorm PASSED [ 64%] unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_lamb_gradnorm[True] PASSED [ 66%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyGrad::test[1] PASSED [ 67%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyGrad::test[3] PASSED [ 69%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyGrad::test[2] PASSED [ 71%] unit/runtime/half_precision/test_fp16.py::TestAdamwFP16Basic::test PASSED [ 72%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-2] FAILED [ 74%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[False-1] PASSED [ 76%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[False-2] PASSED [ 77%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[False-3] SKIPPED [ 79%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-3] SKIPPED [ 81%] unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-1] FAILED [ 83%] unit/runtime/half_precision/test_fp16.py::TestAmp::test_adam_basic SKIPPED [ 84%] unit/runtime/half_precision/test_fp16.py::TestAmp::test_adam_O2 SKIPPED [ 86%] unit/runtime/half_precision/test_fp16.py::TestAmp::test_adam_O2_empty_grad SKIPPED [ 88%] unit/runtime/half_precision/test_fp16.py::TestAmp::test_lamb_basic SKIPPED [ 89%] unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-2] PASSED [ 91%] unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-1] PASSED [ 93%] unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-1] PASSED [ 94%] unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-3] PASSED [ 96%] unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-3] PASSED [ 98%] unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-2] PASSED [100%] =================================== FAILURES =================================== ______________ TestAdamFP16ZeroOneCycleCompatibility.test[True-3] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:37:19,162] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:37:19,366] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:37:19,367] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:37:19,407] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:37:19.164428 518095 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:37:19.164423 517966 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:37:19.164863 517966 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:37:19.371161 517966 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:37:19.371181 518108 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-8: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 321, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______________ TestAdamFP16ZeroOneCycleCompatibility.test[True-2] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:37:32,822] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:37:32,996] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:37:32,998] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is 
deprecated use offload_optimizer instead [2023-05-27 03:37:33,029] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:37:32.825738 518411 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:37:32.825716 518282 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:37:32.826485 518282 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:37:33.002050 518282 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:37:33.002058 518424 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-10: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 321, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File 
"/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______________ TestAdamFP16ZeroOneCycleCompatibility.test[True-1] ______________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:37:37,331] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:37:37,504] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:37:37,505] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:37:37,545] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:37:37.333837 518570 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:37:37.333824 518441 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:37:37.334368 518441 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:37:37.509450 518441 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:37:37.509456 518583 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
Process Process-11: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 321, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______________________ TestZeroStaticScale.test[9-True-2] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:38:30,785] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:38:30,968] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:38:30,970] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:38:31,014] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:38:30.787746 519542 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 
TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:38:30.787755 519671 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:38:30.788110 519542 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:38:30.974079 519684 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:38:30.974077 519542 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-18: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 359, in test model, optim, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______________________ TestZeroStaticScale.test[9-True-3] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:38:35,366] [INFO] 
[comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:38:35,567] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:38:35,568] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:38:35,608] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:38:35.368477 519830 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:38:35.368474 519701 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:38:35.368927 519701 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:38:35.571813 519843 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:38:35.571812 519701 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-19: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 359, in test model, optim, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in 
_load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _____________________ TestZeroStaticScale.test[10-True-2] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:38:48,524] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:38:48,703] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:38:48,704] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:38:48,742] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:38:48.526738 520146 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:38:48.526731 520017 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:38:48.527318 520017 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:38:48.707561 520017 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:38:48.707572 520159 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
Process Process-21: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 359, in test model, optim, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _____________________ TestZeroStaticScale.test[10-True-3] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:39:10,675] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:39:10,842] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:39:10,843] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:39:10,881] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:39:10.679085 520619 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:39:10.679075 520490 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:39:10.679837 520490 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:39:10.846570 520490 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:39:10.846585 520632 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-24: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 359, in test model, optim, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _____________________ TestZeroStaticScale.test[10-True-1] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call 
----------------------------- [2023-05-27 03:39:32,731] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:39:32,908] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:39:32,909] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:39:32,949] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:39:32.733839 521092 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:39:32.733829 520963 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:39:32.734154 520963 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:39:32.913554 520963 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:39:32.913571 521105 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-27: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 359, in test model, optim, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, 
in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' ______________________ TestZeroStaticScale.test[9-True-1] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:39:37,159] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:39:37,362] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:39:37,364] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:39:37,411] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:39:37.161310 521251 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:39:37.161303 521122 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:39:37.161666 521122 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:39:37.368073 521264 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:39:37.368073 521122 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-28: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 359, in test model, optim, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _____________________ TestZeroEmptyPartition.test[True-2] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:41:57,482] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:41:58,059] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:41:58,061] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:41:58,071] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:41:58,073] [WARNING] 
[config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:41:58,149] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:41:57.693428 525431 ProcessGroupNCCL.cpp:601] [Rank 2] NCCL watchdog thread started! I0527 03:41:57.693413 525050 ProcessGroupNCCL.cpp:500] [Rank 2] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:41:57.755254 525433 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:41:57.755221 525049 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:41:57.757308 525048 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:41:57.757320 525435 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:41:57.757685 525048 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:41:57.765110 525050 ProcessGroupNCCL.cpp:1669] Rank 2 using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:41:57.766816 525049 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:41:58.063128 525470 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:41:58.063128 525049 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:41:58.067677 525471 ProcessGroupNCCL.cpp:601] [Rank 2] NCCL watchdog thread started! I0527 03:41:58.067672 525050 ProcessGroupNCCL.cpp:500] [Rank 2] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:41:58.078181 525472 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:41:58.078181 525048 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-53: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 443, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-54: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 443, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File 
"/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-55: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 443, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File 
"", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _____________________ TestZeroEmptyPartition.test[True-1] ______________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:42:29,866] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:42:31,046] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:42:31,047] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:42:31,047] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:42:31,060] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:42:31,120] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:42:30.738854 527682 ProcessGroupNCCL.cpp:601] [Rank 2] NCCL watchdog thread started! I0527 03:42:30.738853 527301 ProcessGroupNCCL.cpp:500] [Rank 2] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:42:30.748435 527300 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:42:30.748446 527684 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:42:30.757237 527299 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:42:30.757248 527686 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:42:30.757745 527299 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:42:30.759238 527300 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
I0527 03:42:30.762483 527301 ProcessGroupNCCL.cpp:1669] Rank 2 using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
I0527 03:42:31.051554 527721 ProcessGroupNCCL.cpp:601] [Rank 2] NCCL watchdog thread started!
I0527 03:42:31.051553 527301 ProcessGroupNCCL.cpp:500] [Rank 2] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
I0527 03:42:31.051676 527299 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
I0527 03:42:31.051681 527722 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started!
I0527 03:42:31.064446 527723 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started!
I0527 03:42:31.064442 527300 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
Process Process-70:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run()
File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 443, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters())
File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode)
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load()
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name())
File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1006, in _gcd_import
File "", line 983, in _find_and_load
File "", line 967, in _find_and_load_unlocked
File "", line 670, in _load_unlocked
File "", line 583, in module_from_spec
File "", line 1043, in create_module
File "", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-69:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run()
File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 443, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters())
File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode)
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load()
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name())
File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1006, in _gcd_import
File "", line 983, in _find_and_load
File "", line 967, in _find_and_load_unlocked
File "", line 670, in _load_unlocked
File "", line 583, in module_from_spec
File "", line 1043, in create_module
File "", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-68:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run()
File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs)
File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/half_precision/test_fp16.py", line 443, in test model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters())
File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters)
File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode)
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load()
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name())
File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1006, in _gcd_import
File "", line 983, in _find_and_load
File "", line 967, in _find_and_load_unlocked
File "", line 670, in _load_unlocked
File "", line 583, in module_from_spec
File "", line 1043, in create_module
File "", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/half_precision/test_fp16.py::TestLambFP32GradClip::test
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/half_precision/test_fp16.py::TestLambFP32GradClip::test
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
13.23s call unit/runtime/half_precision/test_fp16.py::TestZero2ReduceScatterOff::test
10.73s call unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_lamb_gradnorm[True]
10.03s call unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_fused_gradnorm
9.92s call unit/runtime/half_precision/test_fp16.py::TestLambFP16::test__basic
9.83s call unit/runtime/half_precision/test_fp16.py::TestAdamFP32EmptyGrad::test
9.62s call unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_lamb_gradnorm[False]
9.62s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[False-2]
9.42s call unit/runtime/half_precision/test_fp16.py::TestLambFP16::test_empty_grad
9.32s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-False-3]
9.17s call unit/runtime/half_precision/test_fp16.py::TestLambFP32GradClip::test
9.13s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[False-1]
9.12s call unit/runtime/half_precision/test_fp16.py::TestFP16OptimizerForMoE::test_unfused_gradnorm
9.12s call unit/runtime/half_precision/test_fp16.py::TestZero3LazyScatter::test
9.12s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-False-3]
9.12s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-3]
8.92s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyGrad::test[3]
8.72s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-2]
8.42s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-False-1]
8.42s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-1]
8.42s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-False-2]
8.42s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-False-1]
8.22s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-False-2]
8.11s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-Adam]
8.02s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-AdamW]
8.02s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-AdamW]
8.01s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-Adam]
7.92s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyGrad::test[2]
7.72s call unit/runtime/half_precision/test_fp16.py::TestAdamwFP16Basic::test
7.52s call unit/runtime/half_precision/test_fp16.py::TestAdamwFP16EmptyGrad::test
7.52s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyGrad::test[1]
5.82s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-1]
5.61s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-3]
5.11s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-2]
5.01s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-1]
4.97s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-2]
4.71s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-3]
4.71s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-3]
4.61s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[False-3]
4.61s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-3]
4.61s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-2]
4.51s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-3]
4.51s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-1]
4.51s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-1]
4.51s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-1]
4.51s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-2]
4.41s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-2]
4.32s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-2]
4.32s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-1]
4.11s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-3]
4.11s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-2]
4.11s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-3]
4.01s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-1]
4.01s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-1]
4.01s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-2]
3.81s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-3]
(118 durations < 1s hidden. Use -vv to show these durations.)
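All of the failures in this session trace back to the same ImportError captured above: the prebuilt cpu_adam_op extension cannot be loaded because of the undefined symbol __kmpc_for_static_fini, and the AttributeError from DeepSpeedCPUAdam.__del__ is only fallout from that failed constructor. The __kmpc_* entry points are provided by the Intel/LLVM OpenMP runtime (libiomp5/libomp), so the extension appears to have been built against that OpenMP ABI while no matching runtime gets loaded into the process. A minimal diagnostic sketch, assuming the runtime is present on the box but simply not preloaded (the library names below are assumptions and vary by install):

    import ctypes

    SO = "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so"

    # Pull an Intel/LLVM OpenMP runtime into the global symbol table first.
    # libgomp would not help: it does not export the __kmpc_* symbols.
    for name in ("libiomp5.so", "libomp.so", "libomp.so.5"):
        try:
            ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
            print("preloaded", name)
            break
        except OSError:
            continue

    try:
        ctypes.CDLL(SO)  # roughly what importlib.import_module() does for the op
        print("cpu_adam_op resolves cleanly")
    except OSError as err:
        print("still unresolved:", err)

If preloading makes the symbol resolve, exporting that runtime via LD_PRELOAD for the test run, or rebuilding the op against the local toolchain (for example reinstalling with DS_BUILD_CPU_ADAM=1), would be the next thing to try before re-running the TestZeroStaticScale, TestZeroEmptyPartition and TestAdamFP16ZeroOneCycleCompatibility cases listed below.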
=========================== short test summary info ============================
FAILED unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-3]
FAILED unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-2]
FAILED unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[True-1]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-2]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-3]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-2]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-3]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[10-True-1]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[9-True-1]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-2]
FAILED unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-1]
======= 11 failed, 42 passed, 6 skipped, 3 warnings in 383.76s (0:06:23) =======
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=65944379
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 3 items
unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config0] SKIPPED [ 33%]
unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config2] SKIPPED [ 66%]
unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config1] SKIPPED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config0]
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config0]
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
5.61s call unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config0]
4.32s call unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config2]
4.22s call unit/runtime/pipe/test_pipe.py::TestPipeCifar10::test[topo_config1]
(6 durations < 1s hidden. Use -vv to show these durations.)
======================= 3 skipped, 3 warnings in 15.11s ========================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=1874972076
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 9 items
unit/runtime/pipe/test_topology.py::test_topology_2d PASSED [ 11%]
unit/runtime/pipe/test_topology.py::test_topology_rank_repr PASSED [ 22%]
unit/runtime/pipe/test_topology.py::test_topology_comm_list PASSED [ 33%]
unit/runtime/pipe/test_topology.py::test_topology_dims PASSED [ 44%]
unit/runtime/pipe/test_topology.py::test_topology_3d PASSED [ 55%]
unit/runtime/pipe/test_topology.py::test_topology_match PASSED [ 66%]
unit/runtime/pipe/test_topology.py::test_primes PASSED [ 77%]
unit/runtime/pipe/test_topology.py::TestDistributedTopology::test_grid_pipe_data PASSED [ 88%]
unit/runtime/pipe/test_topology.py::TestDistributedTopology::test_stage_to_global PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/pipe/test_topology.py::test_topology_2d
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/pipe/test_topology.py::test_topology_2d
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
5.90s call unit/runtime/pipe/test_topology.py::TestDistributedTopology::test_grid_pipe_data
5.62s call unit/runtime/pipe/test_topology.py::TestDistributedTopology::test_stage_to_global
(25 durations < 1s hidden. Use -vv to show these durations.)
======================== 9 passed, 3 warnings in 12.52s ========================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=1818215355
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ...
collected 17 items
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_firststage[8] PASSED [ 5%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_laststage[3] PASSED [ 11%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_stagequery PASSED [ 17%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_train_schedule_singlestage PASSED [ 23%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_midstage[10] PASSED [ 29%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_singlestage PASSED [ 35%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_laststage[1] PASSED [ 41%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_schedule_laststage PASSED [ 47%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_laststage[8] PASSED [ 52%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_laststage[10] PASSED [ 58%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_firststage[1] PASSED [ 64%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_midstage[3] PASSED [ 70%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_schedule_firststage PASSED [ 76%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_firststage[10] PASSED [ 82%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_firststage[3] PASSED [ 88%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_midstage[1] PASSED [ 94%]
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_midstage[8] PASSED [100%]
=============================== warnings summary ===============================
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_firststage[8]
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/pipe/test_pipe_schedule.py::test_pipe_inference_schedule_firststage[8]
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
(51 durations < 1s hidden. Use -vv to show these durations.)
======================== 17 passed, 2 warnings in 1.07s ========================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=3109965896
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 3 items
unit/runtime/test_data_efficiency.py::TestDataEfficiency::test_curriculum_learning PASSED [ 33%]
unit/runtime/test_data_efficiency.py::TestLegacyCurriculumScheduler::test_fixed_discrete PASSED [ 66%]
unit/runtime/test_data_efficiency.py::TestLegacyCurriculumScheduler::test_fixed_linear PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/test_data_efficiency.py::TestDataEfficiency::test_curriculum_learning
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/test_data_efficiency.py::TestDataEfficiency::test_curriculum_learning
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
9.80s call unit/runtime/test_data_efficiency.py::TestDataEfficiency::test_curriculum_learning
8.63s call unit/runtime/test_data_efficiency.py::TestLegacyCurriculumScheduler::test_fixed_discrete
8.42s call unit/runtime/test_data_efficiency.py::TestLegacyCurriculumScheduler::test_fixed_linear
(6 durations < 1s hidden. Use -vv to show these durations.)
======================== 3 passed, 3 warnings in 27.77s ========================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=4157407483
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ...
collected 20 items
unit/runtime/test_ds_config_dict.py::test_gather_16bit_params_on_model_save[stage3_gather_16bit_weights_on_model_save] PASSED [ 5%]
unit/runtime/test_ds_config_dict.py::test_get_bfloat16_enabled[bf16] PASSED [ 10%]
unit/runtime/test_ds_config_dict.py::test_get_bfloat16_enabled[bfloat16] PASSED [ 15%]
unit/runtime/test_ds_config_dict.py::test_gather_16bit_params_on_model_save[stage3_gather_fp16_weights_on_model_save] PASSED [ 20%]
unit/runtime/test_ds_config_dict.py::test_temp_config_json PASSED [ 25%]
unit/runtime/test_ds_config_dict.py::TestNoModel::test PASSED [ 30%]
unit/runtime/test_ds_config_dict.py::TestArgs::test_no_args PASSED [ 35%]
unit/runtime/test_ds_config_dict.py::TestArgs::test_none_args PASSED [ 40%]
unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-33-17-2-False] PASSED [ 45%]
unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-8-2-True] PASSED [ 50%]
unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-16-1-True] PASSED [ 55%]
unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-18-1-False] PASSED [ 60%]
unit/runtime/test_ds_config_dict.py::TestInitNoOptimizer::test PASSED [ 65%]
unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_dict PASSED [ 70%]
unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_json PASSED [ 75%]
unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_hjson PASSED [ 80%]
unit/runtime/test_ds_config_dict.py::TestBasicConfig::test_accelerator PASSED [ 85%]
unit/runtime/test_ds_config_dict.py::TestBasicConfig::test_check_version PASSED [ 90%]
unit/runtime/test_ds_config_dict.py::TestDeprecatedDeepScaleConfig::test PASSED [ 95%]
unit/runtime/test_ds_config_dict.py::TestDistInit::test PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/test_ds_config_dict.py::test_gather_16bit_params_on_model_save[stage3_gather_16bit_weights_on_model_save]
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/test_ds_config_dict.py::test_gather_16bit_params_on_model_save[stage3_gather_16bit_weights_on_model_save]
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
8.42s call unit/runtime/test_ds_config_dict.py::TestDistInit::test
8.42s call unit/runtime/test_ds_config_dict.py::TestArgs::test_none_args
8.32s call unit/runtime/test_ds_config_dict.py::TestArgs::test_no_args
7.92s call unit/runtime/test_ds_config_dict.py::TestDeprecatedDeepScaleConfig::test
7.61s call unit/runtime/test_ds_config_dict.py::TestInitNoOptimizer::test
5.21s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-8-2-True]
5.21s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-18-1-False]
5.01s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_json
4.81s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_hjson
4.81s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_dict
4.49s call unit/runtime/test_ds_config_dict.py::TestNoModel::test
4.31s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-33-17-2-False]
4.31s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-16-1-True]
4.11s call unit/runtime/test_ds_config_dict.py::TestBasicConfig::test_check_version
4.01s call unit/runtime/test_ds_config_dict.py::TestBasicConfig::test_accelerator
(45 durations < 1s hidden. Use -vv to show these durations.)
================== 20 passed, 3 warnings in 88.05s (0:01:28) ===================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=3723923125
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ...
collected 64 items
unit/runtime/test_ds_initialize.py::TestNoOptim::test[3] PASSED [ 1%]
unit/runtime/test_ds_initialize.py::TestNoOptim::test[0] PASSED [ 3%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero2] PASSED [ 4%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero1] SKIPPED [ 6%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-amp] SKIPPED [ 7%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-None] SKIPPED [ 9%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-amp] SKIPPED [ 10%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero1] PASSED [ 12%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero2] PASSED [ 14%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero2] SKIPPED [ 15%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero2] PASSED [ 17%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-amp] SKIPPED [ 18%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-amp] SKIPPED [ 20%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero2] PASSED [ 21%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero1] PASSED [ 23%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-None] PASSED [ 25%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero1] SKIPPED [ 26%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero1] PASSED [ 28%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-amp] SKIPPED [ 29%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-amp] SKIPPED [ 31%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero2] PASSED [ 32%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-amp] SKIPPED [ 34%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-None] PASSED [ 35%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-amp] SKIPPED [ 37%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero2] SKIPPED [ 39%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-None] PASSED [ 40%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero1] SKIPPED [ 42%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-None] PASSED [ 43%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero1] SKIPPED [ 45%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero1] PASSED [ 46%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-amp] SKIPPED [ 48%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero1] PASSED [ 50%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero2] SKIPPED [ 51%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-None] SKIPPED [ 53%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero2] SKIPPED [ 54%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero2] PASSED [ 56%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-None] PASSED [ 57%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero1] PASSED [ 59%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-None] PASSED [ 60%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-None] SKIPPED [ 62%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-amp] SKIPPED [ 64%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-None] PASSED [ 65%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-None] SKIPPED [ 67%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-None] PASSED [ 68%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-amp] SKIPPED [ 70%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero1] PASSED [ 71%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero1] PASSED [ 73%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-amp] SKIPPED [ 75%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero2] PASSED [ 76%]
unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero2] PASSED [ 78%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-None] PASSED [ 79%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[optimizer_type2-None] PASSED [ 81%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[optimizer_type2-_LRScheduler] PASSED [ 82%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[optimizer_type2-scheduler_type2] PASSED [ 84%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-scheduler_type2] PASSED [ 85%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-None] PASSED [ 87%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-scheduler_type2] PASSED [ 89%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-_LRScheduler] PASSED [ 90%]
unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-_LRScheduler] PASSED [ 92%]
unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[False] PASSED [ 93%]
unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[True] PASSED [ 95%]
unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[optimizer_type2] PASSED [ 96%]
unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[Optimizer] PASSED [ 98%]
unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[None] PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/test_ds_initialize.py::TestNoOptim::test[3]
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/test_ds_initialize.py::TestNoOptim::test[3]
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
18.71s call unit/runtime/test_ds_initialize.py::TestNoOptim::test[3]
8.62s call unit/runtime/test_ds_initialize.py::TestNoOptim::test[0]
5.41s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero2]
5.41s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero2]
5.32s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero1]
5.11s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-None]
5.11s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero1]
5.01s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero2]
5.01s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero1]
5.01s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero1]
4.91s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-None]
4.91s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero2]
4.91s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-None]
4.91s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero1]
4.91s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero1]
4.91s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero2]
4.91s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero1]
4.81s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero2]
4.81s call unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[False]
4.81s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero2]
4.81s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero2]
4.81s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-None]
4.81s call unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[None]
4.81s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-None]
4.81s call unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[True]
4.71s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-None]
4.71s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero1]
4.71s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-scheduler_type2]
4.71s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-None]
4.71s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-None]
4.71s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-None]
4.11s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[optimizer_type2-None]
4.11s call unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[Optimizer]
4.11s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-_LRScheduler]
4.11s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[optimizer_type2-scheduler_type2]
4.11s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-_LRScheduler]
4.11s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-None]
4.01s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-scheduler_type2]
4.01s call unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[optimizer_type2]
4.01s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[optimizer_type2-_LRScheduler]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-amp]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-amp]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-amp]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-amp]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero1]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero2]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero1]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-amp]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-amp]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero2]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero1]
3.61s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-amp]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-amp]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-None]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-None]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero2]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-None]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero1]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-None]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-amp]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-amp]
3.51s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-amp]
3.41s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero2]
3.41s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-amp]
(128 durations < 1s hidden. Use -vv to show these durations.)
============ 40 passed, 24 skipped, 3 warnings in 292.96s (0:04:52) ============
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=1745052021
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 2 items
unit/runtime/test_multi_output_model.py::TestThreeOutputModel::test PASSED [ 50%]
unit/runtime/test_multi_output_model.py::TestTwoOutputModel::test PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/test_multi_output_model.py::TestThreeOutputModel::test
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/test_multi_output_model.py::TestThreeOutputModel::test
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
8.47s call unit/runtime/test_multi_output_model.py::TestThreeOutputModel::test
8.01s call unit/runtime/test_multi_output_model.py::TestTwoOutputModel::test
(4 durations < 1s hidden. Use -vv to show these durations.)
======================== 2 passed, 3 warnings in 17.40s ========================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=74885201
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 9 items
unit/runtime/test_pld.py::TestPLDModel::test_pld_model[1.0] PASSED [ 11%]
unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.1] PASSED [ 22%]
unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0] PASSED [ 33%]
unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.9] PASSED [ 44%]
unit/runtime/test_pld.py::test_pld_schedule[1.0] PASSED [ 55%]
unit/runtime/test_pld.py::test_pld_schedule[0] PASSED [ 66%]
unit/runtime/test_pld.py::test_pld_schedule[0.9] PASSED [ 77%]
unit/runtime/test_pld.py::test_pld_schedule[0.1] PASSED [ 88%]
unit/runtime/test_pld.py::TestNonPLDModel::test_non_pld_model PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque
unit/runtime/test_pld.py::TestPLDModel::test_pld_model[1.0]
/home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver
"Running test without verifying torch version, please provide an expected torch version with --torch_ver")
unit/runtime/test_pld.py::TestPLDModel::test_pld_model[1.0]
/home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver
"Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================== slowest durations ===============================
8.82s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.1]
8.80s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[1.0]
8.32s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.9]
8.12s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0]
4.91s call unit/runtime/test_pld.py::TestNonPLDModel::test_non_pld_model
(22 durations < 1s hidden. Use -vv to show these durations.)
======================== 9 passed, 3 warnings in 39.91s ========================
============================= test session starts ==============================
platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7
cachedir: .pytest_cache
Using --randomly-seed=4085701365
rootdir: /home/aishsh/ds-v0.9.2/tests
configfile: pytest.ini
plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0
collecting ... collected 13 items
unit/runtime/utils/test_partition.py::TestPartitionedTensor::test PASSED [ 7%]
unit/runtime/utils/test_partition.py::test_valid_partition PASSED [ 15%]
unit/runtime/utils/test_partition.py::test_float_balanced PASSED [ 23%]
unit/runtime/utils/test_partition.py::test_float_lastheavy SKIPPED (...) [ 30%]
unit/runtime/utils/test_partition.py::test_float_midheavy PASSED [ 38%]
unit/runtime/utils/test_partition.py::test_easy_balance_uniform PASSED [ 46%]
unit/runtime/utils/test_partition.py::test_balance_bert PASSED [ 53%]
unit/runtime/utils/test_partition.py::test_easy_balance_balanced PASSED [ 61%]
unit/runtime/utils/test_partition.py::test_short_partition PASSED [ 69%]
unit/runtime/utils/test_partition.py::test_short_partition_uniform PASSED [ 76%]
unit/runtime/utils/test_partition.py::test_int_balanced PASSED [ 84%]
unit/runtime/utils/test_partition.py::test_prefix_sum PASSED [ 92%]
unit/runtime/utils/test_partition.py::TestPartitionedTensorMeta::test PASSED [100%]
=============================== warnings summary ===============================
../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8
/usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used.
from collections import deque unit/runtime/utils/test_partition.py::TestPartitionedTensor::test /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/utils/test_partition.py::TestPartitionedTensor::test /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.77s call unit/runtime/utils/test_partition.py::TestPartitionedTensor::test 4.72s call unit/runtime/utils/test_partition.py::TestPartitionedTensorMeta::test (36 durations < 1s hidden. Use -vv to show these durations.) ================== 12 passed, 1 skipped, 3 warnings in 11.44s ================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=4134026705 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 2 items unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[False] FAILED [ 50%] unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[True] FAILED [100%] =================================== FAILURES =================================== _________________ TestStage2IgnoreUnusedParameters.test[False] _________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:51:57,369] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:51:57,532] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:51:57,534] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead [2023-05-27 03:51:57,563] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:51:57.374886 548868 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:51:57.374883 548739 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:51:57.375814 548739 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:51:57.538480 548881 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 03:51:57.538480 548739 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init
    raise e
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init
    self.run(**self._fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run
    self._current_test(**fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_ignore_unused_parameters.py", line 47, in test
    model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters())
  File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize
    config_class=config_class)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer
    basic_optimizer = self._configure_basic_optimizer(model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer
    adamw_mode=effective_adam_w_mode)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__
    self.ds_opt_adam = CPUAdamBuilder().load()
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load
    return importlib.import_module(self.absolute_name())
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 670, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in: <function DeepSpeedCPUAdam.__del__ at 0x...>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__
    self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
_________________ TestStage2IgnoreUnusedParameters.test[True] __________________
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 03:52:02,049] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 03:52:02,228] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 03:52:02,229] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter cpu_offload is deprecated use offload_optimizer instead
[2023-05-27 03:52:02,270] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0527 03:52:02.051528 549027 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started!
I0527 03:52:02.051517 548898 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
I0527 03:52:02.052002 548898 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
I0527 03:52:02.232916 548898 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
I0527 03:52:02.232924 549040 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started!
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init
    raise e
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init
    self.run(**self._fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run
    self._current_test(**fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_ignore_unused_parameters.py", line 47, in test
    model, _, _, _ = deepspeed.initialize(config=config_dict, model=model, model_parameters=model.parameters())
  File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize
    config_class=config_class)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer
    basic_optimizer = self._configure_basic_optimizer(model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer
    adamw_mode=effective_adam_w_mode)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__
    self.ds_opt_adam = CPUAdamBuilder().load()
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load
    return importlib.import_module(self.absolute_name())
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 670, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in: <function DeepSpeedCPUAdam.__del__ at 0x...>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__
    self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
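Note on the failure mode above: every worker dies at the same point, when the prebuilt cpu_adam_op extension is dlopen'ed and __kmpc_for_static_fini cannot be resolved. That symbol belongs to the Intel/LLVM OpenMP runtime (libiomp5.so / libomp.so), which suggests the op was compiled against that runtime but no such runtime is visible in these test processes. A minimal sketch to check that hypothesis follows; the runtime library names are the conventional ones and the CPUAdamBuilder import path mirrors the failing cpu_adam.py in the traceback, so treat this as a diagnostic sketch rather than something shipped with the suite.

    import ctypes

    # __kmpc_* entry points come from the Intel/LLVM OpenMP runtime; GNU's libgomp exports
    # GOMP_* instead. Preloading one of these with RTLD_GLOBAL makes the symbols visible
    # when cpu_adam_op.*.so is dlopen'ed (library names are assumptions, adjust locally).
    for lib in ("libiomp5.so", "libomp.so"):
        try:
            ctypes.CDLL(lib, mode=ctypes.RTLD_GLOBAL)
            break
        except OSError:
            continue

    # Re-trigger the exact load that fails in the tracebacks above.
    from deepspeed.ops.op_builder import CPUAdamBuilder
    CPUAdamBuilder().load()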
=============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[False] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[False] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.09s call unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[False] 4.81s call unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[True] (4 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[False] FAILED unit/runtime/zero/test_ignore_unused_parameters.py::TestStage2IgnoreUnusedParameters::test[True] ======================== 2 failed, 3 warnings in 10.97s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1358379984 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
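The --torch_ver / --cuda_ver warnings that recur in these sessions come from the suite's conftest.py (conftest.py:48 and conftest.py:55 in the paths shown). The sketch below only illustrates how such options are typically wired with pytest_addoption and torch's version attributes; the fixture name and messages are made up for illustration and this is not the repository's actual conftest.

    # conftest.py-style sketch (illustrative, not the repo's file): let the runner pin the
    # torch/CUDA versions it expects, and warn only when no expectation was given.
    import warnings
    import pytest
    import torch

    def pytest_addoption(parser):
        parser.addoption("--torch_ver", default=None, type=str)
        parser.addoption("--cuda_ver", default=None, type=str)

    @pytest.fixture(scope="session", autouse=True)
    def verify_versions(pytestconfig):
        expected_torch = pytestconfig.getoption("torch_ver")
        expected_cuda = pytestconfig.getoption("cuda_ver")
        if expected_torch is None:
            warnings.warn("Running test without verifying torch version, "
                          "please provide an expected torch version with --torch_ver")
        else:
            assert torch.__version__.startswith(expected_torch), \
                f"expected torch {expected_torch}, found {torch.__version__}"
        if expected_cuda is None:
            warnings.warn("Running test without verifying cuda version, "
                          "please provide an expected cuda version with --cuda_ver")
        else:
            assert str(torch.version.cuda).startswith(expected_cuda), \
                f"expected CUDA {expected_cuda}, found {torch.version.cuda}"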
collected 522 items unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[tuple] PASSED [ 0%] unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[dict] PASSED [ 0%] unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[list] PASSED [ 0%] unit/runtime/zero/test_zero.py::TestZero3InitForParentWeightInitialization::test PASSED [ 0%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-1000-10000] PASSED [ 0%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-100-100] PASSED [ 1%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-100-10000] PASSED [ 1%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-1000-100] PASSED [ 1%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-1000-1000] PASSED [ 1%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-1000-1000] PASSED [ 1%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-100-1000] PASSED [ 2%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-1000-10000] PASSED [ 2%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-100-1000] PASSED [ 2%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-1000-100] PASSED [ 2%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-100-100] PASSED [ 2%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-100-10000] PASSED [ 3%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-0] SKIPPED [ 3%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-False-10] SKIPPED [ 3%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-False-0] SKIPPED [ 3%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-10] SKIPPED [ 3%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-False-0] SKIPPED [ 4%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-False-0] SKIPPED [ 4%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-False-0] SKIPPED [ 4%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-False-True-0] SKIPPED [ 4%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-False-10] SKIPPED [ 4%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-False-True-0] SKIPPED [ 4%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-False-True-10] SKIPPED [ 5%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-False-False-10] SKIPPED [ 5%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-False-0] SKIPPED [ 5%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-True-False-10] SKIPPED [ 5%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-True-False-10] SKIPPED [ 5%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-True-True-10] SKIPPED [ 6%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-True-10] SKIPPED [ 6%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-0] SKIPPED [ 6%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-True-10] SKIPPED [ 6%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-False-False-0] SKIPPED [ 6%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-True-True-10] SKIPPED [ 7%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-True-0] SKIPPED [ 7%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-10] SKIPPED [ 7%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-True-0] SKIPPED [ 7%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-10] SKIPPED [ 7%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-False-0] SKIPPED [ 8%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-False-0] SKIPPED [ 8%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-False-0] SKIPPED [ 8%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-0] SKIPPED [ 8%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-True-0] SKIPPED [ 8%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-False-10] SKIPPED [ 9%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-False-0] SKIPPED [ 9%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-10] SKIPPED [ 9%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-False-0] SKIPPED [ 9%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-True-10] SKIPPED [ 9%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-True-True-0] SKIPPED [ 9%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-False-False-10] SKIPPED [ 10%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-False-0] SKIPPED [ 10%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-True-10] SKIPPED [ 10%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-10] SKIPPED [ 10%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-True-0] SKIPPED [ 10%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-False-10] SKIPPED [ 11%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-False-10] SKIPPED [ 11%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-True-10] SKIPPED [ 11%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-False-0] SKIPPED [ 11%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-10] SKIPPED [ 11%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-False-0] SKIPPED [ 12%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-True-0] SKIPPED [ 12%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-True-True-10] SKIPPED [ 12%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-True-0] SKIPPED [ 12%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-True-False-0] SKIPPED [ 12%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-False-True-10] SKIPPED [ 13%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-10] SKIPPED [ 13%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-True-False-10] SKIPPED [ 13%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-False-10] SKIPPED [ 13%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-False-0] SKIPPED [ 13%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-True-True-0] SKIPPED [ 13%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-False-10] SKIPPED [ 14%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-True-False-10] SKIPPED [ 14%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-10] SKIPPED [ 14%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-True-True-10] SKIPPED [ 14%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-0] SKIPPED [ 14%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-False-10] SKIPPED [ 15%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-False-0] SKIPPED [ 15%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-False-0] SKIPPED [ 15%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-True-False-10] SKIPPED [ 15%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-True-0] SKIPPED [ 15%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-10] SKIPPED [ 16%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-False-False-0] SKIPPED [ 16%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-False-0] SKIPPED [ 16%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-True-10] SKIPPED [ 16%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-False-10] SKIPPED [ 16%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-False-0] SKIPPED [ 17%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-True-0] SKIPPED [ 17%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-False-False-10] SKIPPED [ 17%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-False-10] SKIPPED [ 17%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-False-True-10] SKIPPED [ 17%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-True-0] SKIPPED [ 18%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-True-True-0] SKIPPED [ 18%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-False-0] SKIPPED [ 18%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-10] SKIPPED [ 18%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-0] SKIPPED [ 18%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-False-False-0] SKIPPED [ 18%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-False-10] SKIPPED [ 19%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-0] SKIPPED [ 19%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-False-10] SKIPPED [ 19%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-False-10] SKIPPED [ 19%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-False-True-10] SKIPPED [ 19%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-False-True-0] SKIPPED [ 20%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-False-True-0] SKIPPED [ 20%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-True-False-0] SKIPPED [ 20%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-True-False-0] SKIPPED [ 20%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-False-10] SKIPPED [ 20%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-False-10] SKIPPED [ 21%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-10] SKIPPED [ 21%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-True-10] SKIPPED [ 21%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-True-0] SKIPPED [ 21%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-False-True-10] SKIPPED [ 21%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-False-10] SKIPPED [ 22%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-False-10] SKIPPED [ 22%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-10] SKIPPED [ 22%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-False-10] SKIPPED [ 22%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-False-False-10] SKIPPED [ 22%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-0] 
SKIPPED [ 22%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-10] SKIPPED [ 23%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-False-False-0] SKIPPED [ 23%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-True-10] SKIPPED [ 23%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-False-False-0] SKIPPED [ 23%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-True-10] SKIPPED [ 23%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-0] SKIPPED [ 24%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-10] SKIPPED [ 24%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-True-10] SKIPPED [ 24%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-0] SKIPPED [ 24%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-0] SKIPPED [ 24%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-False-False-10] SKIPPED [ 25%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-0] SKIPPED [ 25%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-False-False-10] SKIPPED [ 25%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-False-0] SKIPPED [ 25%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-False-10] SKIPPED [ 25%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-0] SKIPPED [ 26%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-False-False-0] SKIPPED [ 26%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-False-0] SKIPPED [ 26%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-0] SKIPPED [ 26%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-False-0] SKIPPED [ 26%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-True-10] SKIPPED [ 27%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-True-False-0] SKIPPED [ 27%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-True-0] SKIPPED [ 27%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-False-0] SKIPPED [ 27%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-False-False-10] SKIPPED [ 27%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-True-0] SKIPPED [ 27%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-True-10] SKIPPED [ 28%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-True-0] SKIPPED [ 28%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-True-10] SKIPPED [ 28%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-False-False-10] SKIPPED [ 28%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-False-10] SKIPPED [ 28%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-0] SKIPPED [ 29%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-False-False-False-0] SKIPPED [ 29%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-False-0] SKIPPED [ 29%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-False-10] SKIPPED [ 29%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-False-10] SKIPPED [ 29%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-10] SKIPPED [ 30%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-True-False-True-10] SKIPPED [ 30%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-True-10] SKIPPED [ 30%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-True-0] SKIPPED [ 30%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-True-0] SKIPPED [ 30%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-10] SKIPPED [ 31%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-0] SKIPPED [ 31%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-False-10] SKIPPED [ 31%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-10] SKIPPED [ 31%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-False-0] SKIPPED [ 31%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-False-True-0] SKIPPED [ 31%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-True-True-0] SKIPPED [ 32%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-False-10] SKIPPED [ 32%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Dict-True-True-False-False-0] SKIPPED [ 32%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-False-10] SKIPPED [ 32%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-False-True-0] SKIPPED [ 32%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-0] SKIPPED [ 33%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-True-10] SKIPPED [ 33%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-0] SKIPPED [ 33%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBaseBF16::test[EltwiseMultiplicationTestNetwork_List-False-True-True-False-0] SKIPPED [ 33%] unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[2] PASSED [ 33%] unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[1] PASSED [ 34%] unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[3] PASSED [ 34%] unit/runtime/zero/test_zero.py::TestZeroOffloadOptim::test[True] PASSED [ 34%] unit/runtime/zero/test_zero.py::TestZeroOffloadOptim::test[False] PASSED [ 34%] unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test PASSED [ 34%] unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[2] PASSED [ 35%] unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[1] PASSED [ 35%] unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[3] PASSED [ 35%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[True-3] PASSED [ 35%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[True-3] PASSED [ 35%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[False-2] PASSED [ 36%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[True-2] PASSED [ 36%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[False-3] PASSED [ 36%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[False-2] PASSED [ 36%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[False-3] PASSED [ 36%] unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[True-2] PASSED [ 36%] unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[True] PASSED [ 37%] unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[False] PASSED [ 37%] unit/runtime/zero/test_zero.py::TestZeroFrozenWeights::test PASSED [ 37%] unit/runtime/zero/test_zero.py::TestPartitionNcclAlignment::test PASSED [ 37%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningLargeParam::test[False] PASSED [ 37%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningLargeParam::test[True] 
PASSED [ 38%] unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1001] PASSED [ 38%] unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1000] PASSED [ 38%] unit/runtime/zero/test_zero.py::TestZeroOffloadStage1::test FAILED [ 38%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-True-0] PASSED [ 38%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-False-False-True-10] PASSED [ 39%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-False-True-0] PASSED [ 39%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-False-True-10] PASSED [ 39%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-True-False-0] PASSED [ 39%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-True-True-False-0] FAILED [ 39%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-False-False-10] PASSED [ 40%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-False-False-10] PASSED [ 40%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-False-True-10] PASSED [ 40%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-False-True-0] PASSED [ 40%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-False-True-10] PASSED [ 40%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-False-10] FAILED [ 40%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-False-False-0] PASSED [ 41%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-False-False-0] PASSED [ 41%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-True-False-10] FAILED [ 41%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-False-False-False-0] PASSED [ 41%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-False-False-10] PASSED [ 41%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-True-True-0] PASSED [ 42%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-True-True-10] PASSED [ 42%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-True-True-0] FAILED [ 42%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-False-True-10] PASSED [ 42%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-10] FAILED [ 42%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-0] FAILED [ 43%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-True-True-10] FAILED [ 43%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-False-True-0] PASSED [ 43%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-True-0] PASSED [ 43%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-True-True-True-0] FAILED [ 43%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-False-10] PASSED [ 44%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-True-0] PASSED [ 44%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-True-0] FAILED [ 44%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-False-False-True-0] PASSED [ 44%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-False-False-0] PASSED [ 44%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-True-True-10] PASSED [ 45%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-True-False-0] PASSED [ 45%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-True-False-False-0] PASSED [ 45%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-True-False-10] FAILED [ 45%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-False-True-0] PASSED [ 45%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-False-True-10] PASSED [ 45%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-False-True-True-0] PASSED [ 46%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-False-10] PASSED [ 46%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-True-0] FAILED [ 46%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-True-10] PASSED [ 46%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-0] FAILED [ 46%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-False-False-True-0] PASSED [ 47%] 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-True-False-0] FAILED [ 47%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-False-True-0] PASSED [ 47%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-True-0] FAILED [ 47%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-True-10] FAILED [ 47%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-True-False-0] PASSED [ 48%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-True-True-0] PASSED [ 48%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-False-False-0] PASSED [ 48%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-False-10] FAILED [ 48%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-False-0] FAILED [ 48%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-False-False-10] PASSED [ 49%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-False-True-10] PASSED [ 49%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-False-True-0] PASSED [ 49%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-True-False-10] PASSED [ 49%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-False-10] FAILED [ 49%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-False-10] PASSED [ 50%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-False-0] FAILED [ 50%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-False-True-0] PASSED [ 50%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-False-False-10] PASSED [ 50%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-True-10] PASSED [ 50%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-True-10] PASSED [ 50%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-False-False-0] PASSED [ 51%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-False-True-False-10] PASSED [ 51%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-10] FAILED [ 
51%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-True-0] FAILED [ 51%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-True-False-0] PASSED [ 51%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-False-False-0] PASSED [ 52%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-False-False-10] PASSED [ 52%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-True-False-10] PASSED [ 52%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-10] FAILED [ 52%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-True-True-0] PASSED [ 52%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-10] FAILED [ 53%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-True-False-False-10] PASSED [ 53%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-True-0] FAILED [ 53%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-False-False-0] PASSED [ 53%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-True-False-False-False-10] PASSED [ 53%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-True-0] PASSED [ 54%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-0] FAILED [ 54%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-True-10] PASSED [ 54%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-True-True-False-True-10] PASSED [ 54%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-True-False-10] PASSED [ 54%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-False-0] FAILED [ 54%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-False-0] PASSED [ 55%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-False-True-10] PASSED [ 55%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-False-0] PASSED [ 55%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-False-False-0] PASSED [ 55%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-True-10] 
FAILED [ 55%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-False-0] PASSED [ 56%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-False-True-0] PASSED [ 56%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-True-False-10] FAILED [ 56%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-True-10] PASSED [ 56%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-False-False-0] PASSED [ 56%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-False-0] FAILED [ 57%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-False-False-0] PASSED [ 57%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-0] FAILED [ 57%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-False-True-10] PASSED [ 57%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-False-False-0] PASSED [ 57%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-False-True-10] PASSED [ 58%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-False-10] PASSED [ 58%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-False-True-10] PASSED [ 58%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-True-0] PASSED [ 58%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-False-True-10] PASSED [ 58%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-True-False-False-0] PASSED [ 59%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-False-0] PASSED [ 59%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-False-False-10] PASSED [ 59%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-False-10] PASSED [ 59%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-True-10] FAILED [ 59%] unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-False-False-0] =================================== FAILURES =================================== __________________________ TestZeroOffloadStage1.test __________________________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 
03:58:45,270] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:58:45,570] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:58:45,619] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:58:45.287357 564294 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:58:45.287345 564039 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:58:45.293184 564296 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:58:45.293169 564038 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:58:45.293731 564038 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:58:45.298118 564039 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:58:45.564121 564320 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:58:45.564119 564039 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:58:45.576352 564321 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
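The ProcessGroupNCCL.cpp:1669 message repeated in each captured stderr block is PyTorch warning that barrier() had to guess the rank-to-GPU mapping. Below is a sketch of the pattern the warning asks for: pin each rank to its device before any collective and pass device_ids explicitly. The LOCAL_RANK handling and the bare init_process_group call are assumptions for illustration; the suite's own process setup lives in tests/unit/common.py.

    import os
    import torch
    import torch.distributed as dist

    # Assumes the launcher sets the usual env:// variables (MASTER_ADDR, MASTER_PORT,
    # RANK, WORLD_SIZE) plus LOCAL_RANK.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)          # fix the rank -> GPU mapping up front
    dist.init_process_group(backend="nccl")
    dist.barrier(device_ids=[local_rank])      # explicit device, no heuristic needed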
I0527 03:58:45.576352 564038 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-98: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 1104, in test model, _, _, _ = deepspeed.initialize(model=model, model_parameters=model.parameters(), config=config_dict) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-99: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 1104, in test model, _, _, _ = deepspeed.initialize(model=model, model_parameters=model.parameters(), config=config_dict) File 
"/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_List-True-True-True-True-False-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:59:23,340] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:59:24,299] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:59:24,354] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:59:24.026719 566154 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:59:24.026686 565899 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:59:24.029151 566156 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:59:24.029062 565898 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:59:24.030623 565898 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. 
I0527 03:59:24.039491 565899 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:59:24.303315 565899 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:59:24.303329 566180 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:59:24.305114 565898 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:59:24.305117 566181 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-111: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ 
self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-110: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-False-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 03:59:56,713] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 03:59:57,967] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 03:59:58,017] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR 
I0527 03:59:57.701318 567933 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:59:57.701282 567678 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 03:59:57.709371 567935 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:59:57.709367 567677 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:59:57.711491 567677 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:59:57.714293 567678 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 03:59:57.971778 567677 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 03:59:57.971791 567959 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 03:59:57.973836 567960 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 03:59:57.973836 567678 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-122: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File 
"/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-123: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: 
__kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-True-False-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:00:15,622] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:00:16,137] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:00:16,199] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:00:15.793398 568839 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:00:15.793390 568584 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:00:15.800655 568841 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:00:15.800657 568583 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:00:15.801921 568583 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:00:15.804073 568584 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:00:16.136315 568584 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:00:16.136334 568865 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:00:16.156829 568583 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:00:16.156841 568866 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
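Every traceback in this block bottoms out in the same ImportError: the prebuilt cpu_adam_op extension references __kmpc_for_static_fini, an Intel/LLVM OpenMP runtime symbol (libiomp5/libomp), so the shared object was built against an OpenMP runtime that is not visible when Python dlopens it. One way to confirm this from the same environment is to inspect the extension directly; the sketch below assumes the ldd and nm command-line tools are available and uses the install path reported in the tracebacks:

    import pathlib
    import subprocess

    import deepspeed

    # Path of the failing extension, as shown in the ImportError above.
    so_path = (pathlib.Path(deepspeed.__file__).parent
               / "ops" / "adam" / "cpu_adam_op.cpython-37m-x86_64-linux-gnu.so")

    # Which OpenMP runtime (if any) does the extension declare a dependency on?
    print(subprocess.run(["ldd", str(so_path)],
                         capture_output=True, text=True).stdout)

    # Which OpenMP entry points does it leave unresolved?
    nm = subprocess.run(["nm", "-D", "--undefined-only", str(so_path)],
                        capture_output=True, text=True).stdout
    print("\n".join(line for line in nm.splitlines() if "kmpc" in line))

If ldd shows the OpenMP runtime as "not found" (or no OpenMP entry at all), the failure is environmental rather than a bug in the tests themselves.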
Process Process-128: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-129: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing 
ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-True-True-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:00:47,658] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:00:48,022] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:00:48,082] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:00:47.726930 570135 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:00:47.726940 570390 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:00:47.733971 570392 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:00:47.733963 570134 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:00:47.734810 570134 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:00:47.738204 570135 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:00:48.032078 570416 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:00:48.032078 570135 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:00:48.032897 570134 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:00:48.032914 570417 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-138: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: 
Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-139: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:00:59,895] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:01:00,975] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:01:01,035] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False 
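Two usual ways out of this class of failure are to make an OpenMP runtime that exports the __kmpc_* entry points visible before the op is loaded, or to rebuild the op against the local toolchain (for DeepSpeed typically a reinstall with DS_BUILD_CPU_ADAM=1, though whether that succeeds depends on the compilers on the machine). The sketch below tries the first option; it assumes libiomp5 or LLVM's libomp is actually installed, and it uses the CPUAdamBuilder class the tracebacks show cpu_adam.py calling (the import path may differ in other DeepSpeed versions):

    import ctypes
    import ctypes.util

    # __kmpc_* symbols come from the Intel/LLVM OpenMP runtimes. Loading one of
    # them with RTLD_GLOBAL before the extension is dlopen'ed makes the missing
    # symbol resolvable. Assumption: one of these libraries exists on the box.
    for name in ("iomp5", "omp"):
        lib = ctypes.util.find_library(name)
        if lib:
            ctypes.CDLL(lib, mode=ctypes.RTLD_GLOBAL)
            print("pre-loaded OpenMP runtime:", lib)
            break
    else:
        print("no Intel/LLVM OpenMP runtime found; rebuilding the op is the alternative")

    # Retry the exact load that fails in the tracebacks above.
    from deepspeed.ops.op_builder import CPUAdamBuilder

    try:
        CPUAdamBuilder().load()
        print("cpu_adam op loaded successfully")
    except ImportError as err:
        print("cpu_adam op still failing:", err)

If the preload works, the same effect can be applied to the whole suite by putting the two ctypes lines at the top of the tests' conftest.py, or by exporting LD_PRELOAD with the same library before launching pytest.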
----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:01:00.687906 571015 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:01:00.687906 570759 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:01:00.688467 571016 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:01:00.688444 570758 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:00.689587 570758 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:01:00.699472 570759 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:01:00.984941 571040 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:01:00.984938 570758 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:00.994814 570759 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:00.994829 571041 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
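The trailing "AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'" lines are secondary noise rather than a separate bug: __init__ raised before ds_opt_adam was ever assigned, and the destructor then touches the missing attribute during cleanup. A generic way to keep a destructor quiet on partially constructed objects looks like the toy sketch below (an illustration of the pattern, not DeepSpeed's actual code):

    class NativeOpWrapper:
        """Toy stand-in for a wrapper that loads a native extension in __init__."""

        def __init__(self, builder):
            # If builder.load() raises (as CPUAdamBuilder().load() does in the
            # tracebacks above), __init__ exits before this attribute exists.
            self.native_op = builder.load()
            self.opt_id = 0

        def __del__(self):
            # Guard against a partially-initialized instance so teardown does
            # not stack a second, misleading traceback on the real ImportError.
            native_op = getattr(self, "native_op", None)
            if native_op is not None:
                native_op.destroy_adam(self.opt_id)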
Process Process-143: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-142: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing 
ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:01:05,830] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:01:06,197] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:01:06,258] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:01:05.896028 571329 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:01:05.896021 571074 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:01:05.903767 571331 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:01:05.903767 571073 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:05.904378 571073 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:01:05.908000 571074 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:01:06.203398 571073 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:06.203429 571355 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:01:06.215256 571074 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:06.215286 571356 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! Process Process-145: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: 
Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-144: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-True-True-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:01:11,328] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:01:12,649] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:01:12,725] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False 
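Each failure header embeds the full pytest parameter ID, so once the op loads a single case can be re-run in isolation instead of the whole module. The sketch below re-runs the case named in the header just above; it assumes it is executed from the same tests/ rootdir, and the -p no:randomly flag only matters if the random-ordering plugin is active:

    import pytest

    # Node ID assembled from the failure header above (file::class::test[param-id]).
    node_id = (
        "unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::"
        "test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-True-True-10]"
    )

    # -x stops on the first failure, -q keeps the output short.
    raise SystemExit(pytest.main(["-x", "-q", "-p", "no:randomly", node_id]))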
----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:01:12.329177 571389 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:12.329206 571644 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:01:12.338941 571388 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:12.338948 571646 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:01:12.340009 571388 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:01:12.340631 571389 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:01:12.662580 571670 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:01:12.662580 571388 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:12.675622 571389 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:01:12.675660 571671 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
Process Process-147:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init
    raise e
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init
    self.run(**self._fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run
    self._current_test(**fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test
    ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing
    ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters())
  File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize
    config_class=config_class)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer
    basic_optimizer = self._configure_basic_optimizer(model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer
    adamw_mode=effective_adam_w_mode)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__
    self.ds_opt_adam = CPUAdamBuilder().load()
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load
    return importlib.import_module(self.absolute_name())
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1006, in _gcd_import
  File "", line 983, in _find_and_load
  File "", line 967, in _find_and_load_unlocked
  File "", line 670, in _load_unlocked
  File "", line 583, in module_from_spec
  File "", line 1043, in create_module
  File "", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__
    self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-146:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_List-True-False-True-True-True-0] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:01:32,059] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 04:01:32,376] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 04:01:32,433] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
Process Process-152:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-153:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
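Every failure in this group has the same root cause: the prebuilt cpu_adam extension in this install references __kmpc_for_static_fini, an OpenMP runtime entry point provided by the Intel/LLVM OpenMP runtimes (libiomp5/libomp), and the symbol is unresolved when Python loads the .so, so DeepSpeedCPUAdam never finishes constructing. The pytest and multiprocessing layers are incidental. A minimal repro sketch, assuming the same deepspeed 0.9.2 install as in the tracebacks; CPUAdamBuilder and load() are the calls shown above, while the top-level import path is an assumption to verify against this install:

    # Repro sketch: trigger the cpu_adam extension load directly, outside pytest.
    from deepspeed.ops.op_builder import CPUAdamBuilder

    try:
        # Imports deepspeed.ops.adam.cpu_adam_op, the .so named in the ImportError above.
        CPUAdamBuilder().load()
        print("cpu_adam extension loaded OK")
    except ImportError as err:
        # Expected on this machine: ... undefined symbol: __kmpc_for_static_fini
        print(err)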
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-True-0] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:01:50,054] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 04:01:50,422] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 04:01:50,483] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
Process Process-159:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-158:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-True-False-10] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:02:24,670] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 04:02:24,899] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 04:02:24,944] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
Process Process-170:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-171:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
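The trailing "Exception ignored in ... AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'" is fallout from the first error, not a second bug: __init__ raises on CPUAdamBuilder().load() before self.ds_opt_adam is ever assigned, and the half-constructed object's __del__ still runs at garbage collection and touches the missing attribute. A small sketch of the same pattern using a hypothetical class, not DeepSpeed's actual code:

    # Hypothetical illustration: why a failed __init__ yields an "Exception ignored in __del__".
    class NativeOptimizer:
        def __init__(self):
            # Stands in for: self.ds_opt_adam = CPUAdamBuilder().load()
            self.handle = self._load_extension()  # raises, so self.handle is never set

        def _load_extension(self):
            raise ImportError("undefined symbol: __kmpc_for_static_fini")

        def __del__(self):
            # Mirrors DeepSpeedCPUAdam.__del__, which assumes the attribute exists.
            self.handle.destroy()

    try:
        NativeOptimizer()
    except ImportError:
        pass  # cleanup of the partial object then prints the ignored AttributeError to stderr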
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-True-0] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:02:50,787] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 04:02:51,233] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 04:02:51,284] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
Process Process-180:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-181:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-0] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:03:00,844] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 04:03:01,157] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 04:03:01,201] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
Process Process-185:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-184:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-True-False-0] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:03:11,629] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 04:03:12,833] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 04:03:12,885] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
Process Process-188:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-189:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-True-0] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:03:23,241] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2023-05-27 04:03:24,312] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD
[2023-05-27 04:03:24,377] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
----------------------------- Captured stderr call -----------------------------
Process Process-192:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-193:
Traceback (most recent call last):
  ...
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
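Since every remaining parametrization dies at the same import, the quickest confirmation is to load the reported .so on its own and see whether an OpenMP runtime satisfies the symbol. A diagnostic sketch; the path is the one printed in the tracebacks, while the libiomp5 name and the value of preloading it are assumptions that depend on how this particular build of the op was compiled:

    # Diagnostic sketch: dlopen the extension directly, bypassing deepspeed and pytest.
    import ctypes

    SO = "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so"
    try:
        # Optionally preload an OpenMP runtime globally first to test whether that resolves it, e.g.:
        # ctypes.CDLL("libiomp5.so", mode=ctypes.RTLD_GLOBAL)
        ctypes.CDLL(SO)
        print("loaded cleanly")
    except OSError as err:
        print(err)  # expected to name __kmpc_for_static_fini, an Intel/LLVM OpenMP runtime symbol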
----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:03:28.857883 578973 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:03:28.857875 578718 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:03:28.860054 578975 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:03:28.860046 578717 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:28.860366 578717 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:03:28.868528 578718 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:03:29.129491 578999 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:03:29.129490 578718 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:29.140151 579000 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 04:03:29.140151 578717 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-194: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-195: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File 
"/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-False-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:03:51,175] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:03:52,198] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:03:52,243] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:03:51.978152 579933 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:51.978296 580188 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:03:51.986043 580190 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 04:03:51.986043 579932 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:51.986488 579932 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:03:51.989377 579933 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:03:52.203353 579932 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:52.203373 580214 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:03:52.204619 580215 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:03:52.204619 579933 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-203: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", 
line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-202: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-False-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- 
[2023-05-27 04:03:56,433] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:03:56,917] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:03:56,972] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:03:56.658223 580503 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:03:56.658218 580248 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:03:56.666792 580505 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:03:56.666782 580247 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:56.667100 580247 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:03:56.669063 580248 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:03:56.920917 580529 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:03:56.920917 580247 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:56.923404 580248 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:03:56.923425 580530 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
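Note on the ProcessGroupNCCL warnings: the repeated "Rank N using GPU N to perform barrier as devices used by this process are currently unknown" lines are unrelated to the ImportError and are harmless in a 2-rank, 2-GPU run, but they can be silenced by making the rank-to-GPU mapping explicit before the first collective, as the message itself suggests. A short sketch, assuming the launcher exposes the local rank via a LOCAL_RANK environment variable and has already set the usual rendezvous variables:

# Sketch: pin each rank to its GPU so NCCL does not have to guess the mapping.
# Assumes MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE and LOCAL_RANK come from the launcher.
import os
import torch
import torch.distributed as dist

local_rank = int(os.environ.get("LOCAL_RANK", "0"))

torch.cuda.set_device(local_rank)        # fix the rank -> GPU mapping up front
dist.init_process_group(backend="nccl")

# Either of these avoids the "devices ... currently unknown" fallback:
dist.barrier(device_ids=[local_rank])    # explicit device for the barrier
# ...or simply dist.barrier(), now that set_device() has fixed the mapping.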
Process Process-205: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-204: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing 
ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-False-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:04:25,273] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:04:26,531] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:04:26,574] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:04:26.270983 582054 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:04:26.270938 581799 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:04:26.276625 581798 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:04:26.276629 582056 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:04:26.277273 581798 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:04:26.283555 581799 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:04:26.534607 581799 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:04:26.534621 582080 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:04:26.538635 581798 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:04:26.538657 582081 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-215: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: 
Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-214: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-False-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:04:37,605] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:04:37,923] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:04:37,969] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False 
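Note on the trailing AttributeError: "'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'" is a secondary symptom of the same ImportError, not an independent bug. __init__ raises at self.ds_opt_adam = CPUAdamBuilder().load() before the attribute is assigned, and Python still runs __del__ on the half-constructed object, where self.ds_opt_adam.destroy_adam(...) then fails. A hypothetical sketch of that ordering, and of a guarded destructor that keeps teardown quiet (invented names, not DeepSpeed's implementation):

# Hypothetical illustration of the failure ordering seen in these tracebacks.
class NativeOptimizerWrapper:
    def __init__(self, builder):
        # If builder.load() raises (here: ImportError on an undefined symbol),
        # __init__ exits before this attribute ever exists.
        self.native_op = builder.load()

    def __del__(self):
        # A guarded lookup tolerates the half-built object, so only the
        # original ImportError is reported during interpreter cleanup.
        native_op = getattr(self, "native_op", None)
        if native_op is not None:
            native_op.destroy()


class _FailingBuilder:
    def load(self):
        raise ImportError("undefined symbol: __kmpc_for_static_fini")


try:
    NativeOptimizerWrapper(_FailingBuilder())
except ImportError as err:
    print("construction failed as in the log:", err)
# No follow-on AttributeError, because __del__ tolerates the missing attribute.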
----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:04:37.654460 582423 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:04:37.654479 582678 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:04:37.659372 582680 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:04:37.659369 582422 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:04:37.659924 582422 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:04:37.665274 582423 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:04:37.926903 582704 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:04:37.926903 582423 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:04:37.929973 582705 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 04:04:37.929973 582422 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-219: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-218: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File 
"/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:05:16,298] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:05:16,746] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:05:16,799] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:16.488674 584793 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:05:16.488662 584538 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:16.494441 584795 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 04:05:16.494441 584537 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:16.494972 584537 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:05:16.499387 584538 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:05:16.753330 584819 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:05:16.753327 584537 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:16.753901 584820 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:05:16.753899 584538 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-232: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", 
line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-233: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-True-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- 
[2023-05-27 04:05:21,003] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:05:21,499] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:05:21,546] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:21.216372 584853 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:21.216390 585108 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:21.217545 585110 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:05:21.217537 584852 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:21.217859 584852 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:05:21.227105 584853 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:05:21.506140 585134 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:05:21.506140 584852 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:21.506168 585135 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
I0527 04:05:21.506168 584853 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-235: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-234: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File 
"/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:05:48,651] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:05:48,952] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:05:48,997] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:48.691361 586350 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:48.691515 586605 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:48.695075 586607 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 04:05:48.695070 586349 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
I0527 04:05:48.695796 586349 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
I0527 04:05:48.702179 586350 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
I0527 04:05:48.956157 586631 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started!
I0527 04:05:48.956152 586349 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
I0527 04:05:48.957105 586632 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started!
I0527 04:05:48.957101 586350 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET
Process Process-244:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init
    raise e
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init
    self.run(**self._fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run
    self._current_test(**fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test
    ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing
    ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters())
  File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize
    config_class=config_class)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer
    basic_optimizer = self._configure_basic_optimizer(model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer
    adamw_mode=effective_adam_w_mode)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__
    self.ds_opt_adam = CPUAdamBuilder().load()
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load
    return importlib.import_module(self.absolute_name())
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1006, in _gcd_import
  File "", line 983, in _find_and_load
  File "", line 967, in _find_and_load_unlocked
  File "", line 670, in _load_unlocked
  File "", line 583, in module_from_spec
  File "", line 1043, in create_module
  File "", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__
    self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Process Process-245:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init
    raise e
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init
    self.run(**self._fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run
    self._current_test(**fixture_kwargs)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test
    ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg)
  File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing
    ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters())
  File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize
    config_class=config_class)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer
    basic_optimizer = self._configure_basic_optimizer(model_parameters)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer
    adamw_mode=effective_adam_w_mode)
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__
    self.ds_opt_adam = CPUAdamBuilder().load()
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load
    return importlib.import_module(self.absolute_name())
  File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1006, in _gcd_import
  File "", line 983, in _find_and_load
  File "", line 967, in _find_and_load_unlocked
  File "", line 670, in _load_unlocked
  File "", line 583, in module_from_spec
  File "", line 1043, in create_module
  File "", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini
Exception ignored in:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__
    self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
_ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-10] _
Worker 0 exited with code 1
----------------------------- Captured stdout call -----------------------------
[2023-05-27 04:05:59,139] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:05:59,392] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:05:59,444] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:59.161015 587229 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:05:59.161010 586974 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:05:59.169659 587231 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:05:59.169631 586973 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:59.172036 586973 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:05:59.181660 586974 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:05:59.394248 587255 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:05:59.394245 586974 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:59.410346 586973 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:05:59.410360 587256 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
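Note: every worker crash in this run bottoms out in the same primary error. The prebuilt cpu_adam_op extension cannot be loaded because it references __kmpc_for_static_fini, an entry point exported by the Intel/LLVM OpenMP runtimes (libiomp5/libomp) but not by GNU libgomp, which suggests the op was compiled against an OpenMP runtime that is not present when the tests import it. A minimal check outside pytest, assuming nothing beyond the .so path shown in the tracebacks and standard binutils, might look like this sketch:

    import ctypes
    import subprocess

    SO = ("/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/"
          "cpu_adam_op.cpython-37m-x86_64-linux-gnu.so")

    # List the symbols the op expects an already-loaded library to provide;
    # __kmpc_* names belong to the Intel/LLVM OpenMP runtime.
    print(subprocess.run(["nm", "-D", "--undefined-only", SO],
                         capture_output=True, text=True).stdout)

    # Loading the shared object directly reproduces the tests' ImportError.
    try:
        ctypes.CDLL(SO)
    except OSError as err:
        print("load failed:", err)

If that confirms the missing runtime, preloading it (for example LD_PRELOAD=/path/to/libiomp5.so before launching pytest) or rebuilding the op with the same toolchain as the rest of the environment are the usual remedies; both are suggestions, not something this log verifies.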
Process Process-249: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-248: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing 
ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-True-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:06:09,194] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:06:10,245] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:06:10,303] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:06:09.964979 587571 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:09.965137 587826 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:06:09.974315 587828 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:06:09.974303 587570 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:09.975071 587570 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:06:09.975939 587571 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:06:10.252300 587852 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:06:10.252297 587570 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:10.254694 587571 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:10.254703 587853 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! Process Process-252: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: 
Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-253: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:06:32,250] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:06:32,711] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:06:32,758] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False 
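Note: the "Exception ignored in" AttributeError that trails every ImportError above is secondary damage rather than a separate bug. DeepSpeedCPUAdam.__del__ still runs on the half-constructed object even though __init__ aborted before self.ds_opt_adam was assigned. A self-contained sketch of the pattern (hypothetical class, not the DeepSpeed source):

    class HalfConstructed:
        def __init__(self):
            # Stands in for CPUAdamBuilder().load() raising the
            # undefined-symbol ImportError before the attribute is set.
            self.ds_opt_adam = self._load_native_op()

        def _load_native_op(self):
            raise ImportError("undefined symbol: __kmpc_for_static_fini")

        def __del__(self):
            # Unguarded attribute access here is what produces the
            # "object has no attribute 'ds_opt_adam'" noise in the log;
            # getattr(self, "ds_opt_adam", None) would silence it.
            self.ds_opt_adam.destroy_adam()

    try:
        HalfConstructed()
    except ImportError as err:
        print("primary failure:", err)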
----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:06:32.450919 589068 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:06:32.450871 588813 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:06:32.455724 589070 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:06:32.455713 588812 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:32.456444 588812 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:06:32.462751 588813 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:06:32.720665 589094 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:06:32.720660 588813 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:32.730110 589095 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 04:06:32.730110 588812 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-260: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-261: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File 
"/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-False-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:06:54,845] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:06:55,197] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:06:55,249] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:06:54.953467 590283 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:06:54.953462 590028 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:06:54.960146 590285 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 04:06:54.960129 590027 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:54.960745 590027 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:06:54.964215 590028 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:06:55.201853 590309 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:06:55.201854 590027 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:55.203788 590028 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:06:55.203801 590310 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! Process Process-268: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", 
line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-269: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-True-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- 
[2023-05-27 04:07:21,343] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:07:21,698] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:07:21,759] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:07:21.438169 591780 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:07:21.438163 591525 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:07:21.446509 591782 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:07:21.446501 591524 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:21.446830 591524 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:07:21.448906 591525 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:07:21.710763 591525 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:21.710779 591806 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:07:21.718072 591524 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:21.718088 591807 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
Process Process-278: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-279: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing 
ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-True-False-10] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:07:38,799] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:07:39,947] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:07:40,000] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:07:39.666981 592458 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:39.667003 592713 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:07:39.672473 592715 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:07:39.672473 592457 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:39.673085 592457 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:07:39.678056 592458 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:07:39.952809 592458 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:39.952826 592739 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:07:39.953127 592457 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:39.953148 592740 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! Process Process-285: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: 
Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-284: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-False-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:07:56,075] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:07:56,533] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:07:56,584] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False 
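Note: separately from the failures, the captured stderr of each test repeats the ProcessGroupNCCL warning that the rank-to-GPU mapping is unknown during barrier(). It does not affect these results, and the fix the message asks for is just the device_ids argument; a sketch, assuming the nccl process group is already initialized and local_rank (a placeholder here) identifies this process's GPU:

    import torch
    import torch.distributed as dist

    local_rank = 0  # placeholder; normally taken from the launcher's LOCAL_RANK
    torch.cuda.set_device(local_rank)
    # Pinning the device removes the "using GPU N to perform barrier" guess.
    dist.barrier(device_ids=[local_rank])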
----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:07:56.272729 593646 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:07:56.272724 593391 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:07:56.280730 593390 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:56.280741 593648 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:07:56.281252 593390 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:07:56.283658 593391 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:07:56.537039 593672 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:07:56.537039 593390 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:56.539269 593391 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:07:56.539284 593673 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
Process Process-290: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-291: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing 
ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-0] _ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:08:08,041] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:08:09,262] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:08:09,325] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:08:09.013109 594014 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:08:09.013123 594271 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:08:09.015075 594272 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:08:09.015017 594015 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:08:09.023852 594014 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. 
This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:08:09.024744 594015 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:08:09.281173 594015 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:08:09.281184 594296 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:08:09.282166 594297 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:08:09.282166 594014 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET Process Process-294: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: 
Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-295: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 644, in test ds_engine = _ds_initialize_for_param_partitioning_testing(model, cfg) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero.py", line 416, in _ds_initialize_for_param_partitioning_testing ds_engine, _, _, _ = deepspeed.initialize(config=cfg, model=model, model_parameters=model.parameters()) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' _ TestZero3ParamPartitioningBase.test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-True-10] _ Worker 0 killed by signal 11 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 04:09:13,842] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 04:09:14,140] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 04:09:14,207] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False [2023-05-27 
04:09:14,894] [INFO] [logging.py:96:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adam as basic optimizer [2023-05-27 04:09:14,894] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam [2023-05-27 04:09:14,894] [INFO] [utils.py:54:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= [2023-05-27 04:09:14,894] [INFO] [logging.py:96:log_dist] [Rank 0] Creating fp16 ZeRO stage 3 optimizer, MiCS is enabled False, Hierarchical params gather False [2023-05-27 04:09:14,894] [INFO] [logging.py:96:log_dist] [Rank 0] Creating torch.float16 ZeRO stage 3 optimizer [2023-05-27 04:09:14,961] [INFO] [utils.py:785:see_memory_usage] Stage 3 initialize beginning [2023-05-27 04:09:14,962] [INFO] [utils.py:789:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB [2023-05-27 04:09:14,963] [INFO] [utils.py:793:see_memory_usage] CPU Virtual Memory: used = 63.65 GB, percent = 12.6% [2023-05-27 04:09:14,963] [INFO] [stage3.py:113:__init__] Reduce bucket size 500000000 [2023-05-27 04:09:14,963] [INFO] [stage3.py:114:__init__] Prefetch bucket size 45 [2023-05-27 04:09:15,025] [INFO] [utils.py:785:see_memory_usage] DeepSpeedZeRoOffload initialize [begin] [2023-05-27 04:09:15,026] [INFO] [utils.py:789:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB [2023-05-27 04:09:15,026] [INFO] [utils.py:793:see_memory_usage] CPU Virtual Memory: used = 63.65 GB, percent = 12.6% ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:09:13.890677 597822 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 04:09:13.890674 597567 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 04:09:13.894685 597824 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 04:09:13.894678 597566 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:09:13.895030 597566 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:09:13.903522 597567 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 04:09:14.149610 597567 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:09:14.149629 597848 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! 
I0527 04:09:14.162525 597566 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 04:09:14.162544 597849 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[tuple] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[tuple] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 124.51s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-True-10] 18.55s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-1000-100] 18.46s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-1000-100] 18.45s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-1000-10000] 18.35s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-1000-1000] 17.55s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-1000-10000] 17.34s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-1000-1000] 11.33s call unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[2] 10.82s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[True-3] 10.52s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[False-3] 10.12s call unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[3] 9.56s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[tuple] 9.53s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[False-2] 9.52s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[False-3] 9.02s call unit/runtime/zero/test_zero.py::TestZeroFrozenWeights::test 9.02s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[list] 8.92s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[False-2] 8.92s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[True-2] 8.92s call unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[True] 8.82s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[True-3] 8.82s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[dict] 8.63s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-100-10000] 8.62s call 
unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test 8.53s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-100-100] 8.53s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-100-100] 8.52s call unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[3] 8.42s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_2_param_groups[True-2] 8.41s call unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[1] 8.32s call unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[2] 8.12s call unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[1] 7.93s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-100-1000] 7.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[True-100-10000] 7.72s call unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[False] 7.53s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningManyParams::test[False-100-1000] 7.52s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-True-False-0] 7.42s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-True-0] 7.23s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-False-True-0] 7.13s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-False-False-0] 7.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-True-0] 7.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningLargeParam::test[False] 7.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-False-False-False-0] 7.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-False-False-10] 7.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-False-False-0] 7.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-False-True-True-0] 7.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-True-10] 6.93s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-False-True-10] 6.92s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningLargeParam::test[True] 6.92s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-False-False-10] 6.83s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-False-True-10] 6.82s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-False-False-True-10] 6.82s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-False-False-10] 6.82s call 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-False-True-False-0] 6.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-True-0] 6.62s call unit/runtime/zero/test_zero.py::TestPartitionNcclAlignment::test 6.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-False-10] 6.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-False-0] 6.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-True-False-10] 6.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-True-False-10] 6.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-True-10] 6.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-False-True-False-0] 6.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-False-0] 6.52s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-False-True-10] 6.42s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-True-False-0] 6.42s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-True-True-10] 6.42s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-False-True-10] 6.42s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-True-True-0] 6.33s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-False-True-0] 6.22s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-False-False-0] 6.14s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-10] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-False-True-True-0] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-False-True-10] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-False-False-False-0] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-False-True-10] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-True-0] 6.12s call 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-False-10] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-False-True-False-10] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-True-False-10] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-True-False-0] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-False-10] 6.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-False-True-False-10] 6.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-False-True-True-10] 6.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-False-10] 6.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-False-False-True-0] 5.92s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-False-False-True-0] 5.92s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-False-False-False-10] 5.92s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-False-True-0] 5.92s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-True-True-False-0] 5.90s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-True-True-10] 5.82s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-False-False-True-0] 5.82s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-False-True-True-0] 5.82s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-False-False-False-10] 5.81s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-True-False-False-False-10] 5.81s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-False-False-True-0] 5.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-False-False-0] 5.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-True-True-10] 5.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-False-False-False-0] 5.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-False-True-True-10] 5.62s call 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-True-False-0] 5.62s call unit/runtime/zero/test_zero.py::TestZero3InitForParentWeightInitialization::test 5.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-True-0] 5.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-True-False-10] 5.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-False-10] 5.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-True-0] 5.61s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-0] 5.52s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-False-10] 5.32s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-True-True-0] 5.31s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-True-False-10] 5.31s call unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1000] 5.29s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-0] 5.22s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-True-False-False-0] 5.22s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-False-False-0] 5.21s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-False-True-10] 5.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-False-False-0] 5.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-False-True-10] 5.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-True-True-False-False-10] 5.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-True-10] 5.12s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-False-0] 5.11s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-False-False-10] 5.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-False-False-0] 5.02s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-False-True-0] 4.92s call unit/runtime/zero/test_zero.py::TestZeroOffloadStage1::test 4.92s call 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-0] 4.91s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-True-10] 4.91s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-True-True-True-0] 4.91s call unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1001] 4.87s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-True-0] 4.82s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-False-True-10] 4.82s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-False-0] 4.81s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-False-0] 4.81s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-True-0] 4.77s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-10] 4.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-10] 4.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-False-0] 4.72s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-True-0] 4.71s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-10] 4.71s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-False-0] 4.71s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-0] 4.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-False-False-10] 4.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-True-False-10] 4.62s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-True-10] 4.41s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-False-True-0] 4.31s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-False-True-10] 4.22s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-False-True-0] 4.21s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-False-True-10] 4.21s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-False-False-0] 4.21s call 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-True-False-False-0] 4.21s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-True-True-False-True-10] 4.21s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-False-True-0] 4.21s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-False-False-10] 4.21s call unit/runtime/zero/test_zero.py::TestZeroOffloadOptim::test[False] 4.11s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-False-False-0] 4.01s call unit/runtime/zero/test_zero.py::TestZeroOffloadOptim::test[True] (625 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED unit/runtime/zero/test_zero.py::TestZeroOffloadStage1::test FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-True-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-False-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-False-True-True-False-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-False-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-False-False-True-True-True-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-False-False-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-True-True-True-False-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-False-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-True-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-False-True-True-False-10] FAILED 
unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-False-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-False-True-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_NamedTuple-True-True-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-False-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-False-True-True-True-False-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-True-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-True-False-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Dict-False-True-True-True-True-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_Tuple-True-False-True-True-False-10] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-True-True-True-False-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_namedtuple-True-False-True-True-True-0] FAILED unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test[EltwiseMultiplicationTestNetwork_List-True-False-False-True-True-10] ===== 32 failed, 120 passed, 160 skipped, 3 warnings in 1749.62s (0:29:09) ===== !!!!!!!!!!!!!!!!! _pytest.outcomes.Exit: Test hanged, exiting !!!!!!!!!!!!!!!!!! test_deepspeed_v0.9.2.sh: line 68: 549057 Killed pytest ./unit/runtime/zero/test_zero.py ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1253833508 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 5 items unit/runtime/zero/test_zero_config.py::test_zero_config_aliasfields PASSED [ 20%] unit/runtime/zero/test_zero_config.py::test_zero_config_offload_configs PASSED [ 40%] unit/runtime/zero/test_zero_config.py::test_zero_config_overlapcomm PASSED [ 60%] unit/runtime/zero/test_zero_config.py::test_zero_config_deprecatedfields PASSED [ 80%] unit/runtime/zero/test_zero_config.py::test_zero_offload_optimizer_config_pipeline PASSED [100%] =============================== warnings summary =============================== unit/runtime/zero/test_zero_config.py::test_zero_config_aliasfields /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_config.py::test_zero_config_aliasfields /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (15 durations < 1s hidden. Use -vv to show these durations.) ======================== 5 passed, 2 warnings in 2.11s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1012909003 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 2 items unit/runtime/zero/test_zero_context_ancestry.py::TestSerialParamInit::test_subclass_param_init PASSED [ 50%] unit/runtime/zero/test_zero_context_ancestry.py::TestDSInitWZinit::test PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/zero/test_zero_context_ancestry.py::TestSerialParamInit::test_subclass_param_init /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_context_ancestry.py::TestSerialParamInit::test_subclass_param_init /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 147.50s call unit/runtime/zero/test_zero_context_ancestry.py::TestDSInitWZinit::test 8.21s call unit/runtime/zero/test_zero_context_ancestry.py::TestSerialParamInit::test_subclass_param_init (4 durations < 1s hidden. Use -vv to show these durations.) 
================== 2 passed, 3 warnings in 157.86s (0:02:37) =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3274603534 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 20 items unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[nvme-2-False] SKIPPED [ 5%] unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[nvme-3-False] FAILED [ 10%] unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[none-3-True] =================================== FAILURES =================================== _____________ TestTensorFragment.test_zero_fragments[nvme-3-False] _____________ Worker 0 exited with code 1 ----------------------------- Captured stdout call ----------------------------- [2023-05-27 06:12:30,640] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-05-27 06:12:39,750] [INFO] [partition_parameters.py:454:__exit__] finished initializing model with 0.00B parameters [2023-05-27 06:12:39,750] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.2+e0e8085, git-hash=e0e8085, git-branch=HEAD [2023-05-27 06:12:39,876] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False ----------------------------- Captured stderr call ----------------------------- WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 06:12:30.713459 599678 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! I0527 06:12:30.713455 599423 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET WARNING: Logging before InitGoogleLogging() is written to STDERR I0527 06:12:30.722805 599680 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! I0527 06:12:30.722796 599422 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 06:12:30.723155 599422 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 06:12:30.724145 599423 ProcessGroupNCCL.cpp:1669] Rank 1 using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device. I0527 06:12:39.753998 599422 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 06:12:39.754036 599704 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started! 
I0527 06:12:39.873292 599423 ProcessGroupNCCL.cpp:500] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 NCCL_DEBUG: UNSET I0527 06:12:39.873360 599705 ProcessGroupNCCL.cpp:601] [Rank 1] NCCL watchdog thread started! Process Process-3: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero_tensor_fragment.py", line 119, in test_zero_fragments run_fragmented_model(model, config_dict, hidden_dim, torch.float16) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero_tensor_fragment.py", line 57, in run_fragmented_model model, _, _, _ = deepspeed.initialize(model=model, model_parameters=model.parameters(), config=config_dict) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' Process Process-4: Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 206, in _dist_init raise e File "/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 201, in _dist_init self.run(**self._fixture_kwargs) File 
"/home/aishsh/ds-v0.9.2/tests/unit/common.py", line 331, in run self._current_test(**fixture_kwargs) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero_tensor_fragment.py", line 119, in test_zero_fragments run_fragmented_model(model, config_dict, hidden_dim, torch.float16) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/zero/test_zero_tensor_fragment.py", line 57, in run_fragmented_model model, _, _, _ = deepspeed.initialize(model=model, model_parameters=model.parameters(), config=config_dict) File "/usr/local/lib/python3.7/site-packages/deepspeed/__init__.py", line 175, in initialize config_class=config_class) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 308, in __init__ self._configure_optimizer(optimizer, model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer basic_optimizer = self._configure_basic_optimizer(model_parameters) File "/usr/local/lib/python3.7/site-packages/deepspeed/runtime/engine.py", line 1220, in _configure_basic_optimizer adamw_mode=effective_adam_w_mode) File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__ self.ds_opt_adam = CPUAdamBuilder().load() File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/op_builder/builder.py", line 446, in load return importlib.import_module(self.absolute_name()) File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "", line 1006, in _gcd_import File "", line 983, in _find_and_load File "", line 967, in _find_and_load_unlocked File "", line 670, in _load_unlocked File "", line 583, in module_from_spec File "", line 1043, in create_module File "", line 219, in _call_with_frames_removed ImportError: /usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam_op.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __kmpc_for_static_fini Exception ignored in: Traceback (most recent call last): File "/usr/local/lib/python3.7/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__ self.ds_opt_adam.destroy_adam(self.opt_id) AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam' =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[nvme-2-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[nvme-2-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 418.83s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[nvme-2-False] 13.73s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[nvme-3-False] (5 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragment::test_zero_fragments[nvme-3-False] ============ 1 failed, 1 skipped, 3 warnings in 1034.30s (0:17:14) ============= !!!!!!!!!!!!!!!!! _pytest.outcomes.Exit: Test hanged, exiting !!!!!!!!!!!!!!!!!! ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2885004344 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 8 items unit/runtime/zero/test_zero_context.py::TestScatterGather::test PASSED [ 12%] unit/runtime/zero/test_zero_context.py::TestGatherUpdate::test PASSED [ 25%] unit/runtime/zero/test_zero_context.py::TestZeroGatheredParametersFree::test PASSED [ 37%] unit/runtime/zero/test_zero_context.py::TestSerialContext::test_ext_param_getattr PASSED [ 50%] unit/runtime/zero/test_zero_context.py::TestSerialContext::test_throughput_calculation PASSED [ 62%] unit/runtime/zero/test_zero_context.py::TestSerialContext::test_subclass_param PASSED [ 75%] unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scatter_halftype PASSED [ 87%] unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scattered_init_dist PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/runtime/zero/test_zero_context.py::TestScatterGather::test /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_context.py::TestScatterGather::test /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 593.69s call unit/runtime/zero/test_zero_context.py::TestGatherUpdate::test 58.84s call unit/runtime/zero/test_zero_context.py::TestScatterGather::test 11.54s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_ext_param_getattr 5.92s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_throughput_calculation 5.22s call unit/runtime/zero/test_zero_context.py::TestZeroGatheredParametersFree::test 5.12s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scatter_halftype 5.02s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_subclass_param 4.92s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scattered_init_dist (16 durations < 1s hidden. Use -vv to show these durations.) ================== 8 passed, 3 warnings in 691.26s (0:11:31) =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=438239727 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 5 items unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_ext_param_returnobj SKIPPED [ 20%] unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[None] PASSED [ 40%] unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[dict] PASSED [ 60%] unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[tensor] PASSED [ 80%] unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_ext_param_return PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[None] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[None] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 10.34s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[None] 10.12s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_ext_param_return 9.53s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[tensor] 9.23s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[dict] (10 durations < 1s hidden. Use -vv to show these durations.) ================== 4 passed, 1 skipped, 3 warnings in 40.73s =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3520976708 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 2 items unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideInitFailure::test_new_class_declared_inside_init_failure PASSED [ 50%] unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideInit::test_new_class_declared_inside_init PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideInitFailure::test_new_class_declared_inside_init_failure /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideInitFailure::test_new_class_declared_inside_init_failure /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.08s call unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideInitFailure::test_new_class_declared_inside_init_failure 4.71s call unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideInit::test_new_class_declared_inside_init (4 durations < 1s hidden. Use -vv to show these durations.) ======================== 2 passed, 3 warnings in 11.21s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1845754042 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 1 item unit/runtime/zero/test_zero_nesting_init.py::TestNestingInit::test_nesting_init PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/zero/test_zero_nesting_init.py::TestNestingInit::test_nesting_init /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_nesting_init.py::TestNestingInit::test_nesting_init /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 5.17s call unit/runtime/zero/test_zero_nesting_init.py::TestNestingInit::test_nesting_init (2 durations < 1s hidden. Use -vv to show these durations.) 
======================== 1 passed, 3 warnings in 6.58s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3161257696 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 42 items unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[29-23-1-1-False] SKIPPED [ 2%] unit/runtime/zero/test_zero_tiled.py::test_tiled_baddim[33-33] PASSED [ 4%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[23-29-2-2-True] SKIPPED [ 7%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[29-23-1-1-True] SKIPPED [ 9%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[23-29-2-2-False] SKIPPED [ 11%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[32-32-2-2-False] SKIPPED [ 14%] unit/runtime/zero/test_zero_tiled.py::test_tiled_baddim[0-0] PASSED [ 16%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[23-29-2-2-True] SKIPPED [ 19%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[32-32-1-1-True] SKIPPED [ 21%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[23-29-2-2-False] SKIPPED [ 23%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[32-32-1-1-False] SKIPPED [ 26%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[32-32-1-1-False] SKIPPED [ 28%] unit/runtime/zero/test_zero_tiled.py::test_tiled_init[1-1] PASSED [ 30%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[23-29-2-2-False] SKIPPED [ 33%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[32-32-1-1-True] SKIPPED [ 35%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[23-29-2-2-True] SKIPPED [ 38%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[32-32-2-2-True] SKIPPED [ 40%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[23-29-1-1-True] SKIPPED [ 42%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[32-32-2-2-False] SKIPPED [ 45%] unit/runtime/zero/test_zero_tiled.py::test_tiled_init[2-2] PASSED [ 47%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[32-32-1-1-True] SKIPPED [ 50%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[29-23-1-1-False] SKIPPED [ 52%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[29-23-2-2-True] SKIPPED [ 54%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[32-32-1-1-False] SKIPPED [ 57%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[23-29-1-1-True] SKIPPED [ 59%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[29-23-1-1-True] SKIPPED [ 61%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[29-23-2-2-True] SKIPPED [ 64%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[23-29-1-1-False] SKIPPED [ 66%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[29-23-1-1-True] SKIPPED [ 69%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[29-23-2-2-False] SKIPPED [ 71%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[23-29-1-1-False] SKIPPED [ 73%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[23-29-1-1-True] SKIPPED [ 76%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[29-23-1-1-False] SKIPPED [ 78%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[32-32-2-2-True] SKIPPED [ 80%] 
unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[32-32-2-2-False] SKIPPED [ 83%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[23-29-1-1-False] SKIPPED [ 85%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[29-23-2-2-False] SKIPPED [ 88%] unit/runtime/zero/test_zero_tiled.py::test_tiled_backward[29-23-2-2-True] SKIPPED [ 90%] unit/runtime/zero/test_zero_tiled.py::test_tiled_init[5-5] PASSED [ 92%] unit/runtime/zero/test_zero_tiled.py::test_tiled_init[32-32] PASSED [ 95%] unit/runtime/zero/test_zero_tiled.py::test_tiled_returnbias_backward[32-32-2-2-True] SKIPPED [ 97%] unit/runtime/zero/test_zero_tiled.py::test_tiled_forward[29-23-2-2-False] SKIPPED [100%] =============================== warnings summary =============================== unit/runtime/zero/test_zero_tiled.py::test_tiled_baddim[33-33] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/zero/test_zero_tiled.py::test_tiled_baddim[33-33] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (90 durations < 1s hidden. Use -vv to show these durations.) ================== 6 passed, 36 skipped, 2 warnings in 1.74s =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1323938708 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 24 items unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output3] PASSED [ 4%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[None] PASSED [ 8%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output1] PASSED [ 12%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output2] PASSED [ 16%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor3] PASSED [ 20%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor4] PASSED [ 25%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[2] PASSED [ 29%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor4] PASSED [ 33%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor3] PASSED [ 37%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[True] PASSED [ 41%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[2] PASSED [ 45%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[None] PASSED [ 50%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[None] PASSED [ 54%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[True] PASSED [ 58%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs3[mask1] PASSED [ 62%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask0] PASSED [ 66%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs2[mask0] PASSED [ 70%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_arg_none[mask0] PASSED [ 75%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs2[mask1] PASSED [ 79%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask1] PASSED [ 83%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs1[mask1] PASSED [ 87%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs3[mask0] PASSED [ 91%] unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_arg_none[mask1] PASSED [ 95%] 
unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs1[mask0] PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output3] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output3] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 9.33s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_arg_none[mask1] 9.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor3] 8.73s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor3] 8.53s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs2[mask1] 8.43s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs1[mask1] 8.43s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs1[mask0] 8.43s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[None] 8.43s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_arg_none[mask0] 8.42s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs3[mask0] 8.42s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs2[mask0] 8.39s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output3] 8.34s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask1] 8.33s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[True] 8.32s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor4] 8.23s call 
unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output2] 8.13s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs3[mask1] 8.13s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[2] 8.13s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor4] 8.13s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[True] 8.12s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask0] 8.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[2] 7.92s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[None] 7.92s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[None] 7.82s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output1] (48 durations < 1s hidden. Use -vv to show these durations.) ================== 24 passed, 3 warnings in 201.78s (0:03:21) ================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=634829518 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 3 items unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_single_input PASSED [ 33%] unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_two_inputs FAILED [ 66%] unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalescedTensorSmallerThanWorldSize::test FAILED [100%] =================================== FAILURES =================================== __________________ TestReduceScatterCoalesced.test_two_inputs __________________ Worker 0 exited with code 1 ----------------------------- Captured stderr call ----------------------------- Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 281, in main old_handlers) File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 317, in _serve_one code = spawn._main(child_r) File "/usr/local/lib/python3.7/multiprocessing/spawn.py", line 115, in _main self = reduction.pickle.load(from_parent) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/comm/test_coalesced_collectives.py", line 9, in import torch ModuleNotFoundError: No module named 'torch' Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 281, in main old_handlers) File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 317, in _serve_one code = spawn._main(child_r) File "/usr/local/lib/python3.7/multiprocessing/spawn.py", line 115, in _main self = reduction.pickle.load(from_parent) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/comm/test_coalesced_collectives.py", line 9, in import torch ModuleNotFoundError: No module named 'torch' __________ TestReduceScatterCoalescedTensorSmallerThanWorldSize.test ___________ Worker 0 exited with code 1 ----------------------------- Captured stderr call ----------------------------- Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 281, in main old_handlers) File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 317, in _serve_one code = spawn._main(child_r) File "/usr/local/lib/python3.7/multiprocessing/spawn.py", line 115, in _main self = reduction.pickle.load(from_parent) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/comm/test_coalesced_collectives.py", line 9, in import torch ModuleNotFoundError: No module named 'torch' Traceback (most recent call last): File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 281, in main old_handlers) File "/usr/local/lib/python3.7/multiprocessing/forkserver.py", line 317, in _serve_one code = spawn._main(child_r) File "/usr/local/lib/python3.7/multiprocessing/spawn.py", line 115, in _main self = reduction.pickle.load(from_parent) File "/home/aishsh/ds-v0.9.2/tests/unit/runtime/comm/test_coalesced_collectives.py", line 9, in import torch ModuleNotFoundError: No module named 'torch' =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_single_input /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_single_input /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== 154.57s call unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_single_input (8 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_two_inputs FAILED unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalescedTensorSmallerThanWorldSize::test ============= 2 failed, 1 passed, 3 warnings in 157.02s (0:02:37) ============== ImportError while loading conftest '/home/aishsh/ds-v0.9.2/tests/conftest.py'. conftest.py:12: in import torch /usr/local/lib/python3.7/site-packages/torch/__init__.py:218: in from torch._C import * # noqa: F403 E ImportError: libshm.so: cannot open shared object file: No such file or directory ImportError while loading conftest '/home/aishsh/ds-v0.9.2/tests/conftest.py'. conftest.py:12: in import torch /usr/local/lib/python3.7/site-packages/torch/__init__.py:218: in from torch._C import * # noqa: F403 E ImportError: libshm.so: cannot open shared object file: No such file or directory ImportError while loading conftest '/home/aishsh/ds-v0.9.2/tests/conftest.py'. conftest.py:12: in import torch /usr/local/lib/python3.7/site-packages/torch/__init__.py:795: in from torch.amp import autocast E ModuleNotFoundError: No module named 'torch.amp' ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1649208435 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 8 items unit/runtime/test_autocast.py::TestAutoCastDisable::test_disable_autocast_linear[False] SKIPPED [ 12%] unit/runtime/test_autocast.py::TestAutoCastDisable::test_disable_autocast_linear[True] SKIPPED [ 25%] unit/runtime/test_autocast.py::TestAutoCastDisable::test_missing_amp_autocast[True] SKIPPED [ 37%] unit/runtime/test_autocast.py::TestAutoCastDisable::test_missing_amp_autocast[False] SKIPPED [ 50%] unit/runtime/test_autocast.py::TestAutoCastEnable::test_autocast_linear[True-True] SKIPPED [ 62%] unit/runtime/test_autocast.py::TestAutoCastEnable::test_autocast_linear[True-False] SKIPPED [ 75%] unit/runtime/test_autocast.py::TestAutoCastEnable::test_autocast_linear[False-True] SKIPPED [ 87%] unit/runtime/test_autocast.py::TestAutoCastEnable::test_autocast_linear[False-False] SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/test_autocast.py::TestAutoCastDisable::test_disable_autocast_linear[False] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/test_autocast.py::TestAutoCastDisable::test_disable_autocast_linear[False] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (24 durations < 1s hidden. Use -vv to show these durations.) ======================== 8 skipped, 3 warnings in 3.53s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2061112492 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 5 items unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-False] SKIPPED [ 20%] unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-True] SKIPPED [ 40%] unit/runtime/test_data.py::TestDataLoaderDropLast::test[1-False] SKIPPED [ 60%] unit/runtime/test_data.py::TestDataLoaderDropLast::test[1-True] SKIPPED [ 80%] unit/runtime/test_data.py::test_repeating_loader PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. 
from collections import deque unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-False] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (15 durations < 1s hidden. Use -vv to show these durations.) =================== 1 passed, 4 skipped, 3 warnings in 3.57s =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=1915273455 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 9 items unit/runtime/test_ds_config_model.py::test_config_base_deprecatedfail PASSED [ 11%] unit/runtime/test_ds_config_model.py::test_config_base_literalfail[config_dict1] PASSED [ 22%] unit/runtime/test_ds_config_model.py::test_config_base_literalfail[config_dict0] PASSED [ 33%] unit/runtime/test_ds_config_model.py::test_config_duplicate_key PASSED [ 44%] unit/runtime/test_ds_config_model.py::test_config_base_literalfail[config_dict2] PASSED [ 55%] unit/runtime/test_ds_config_model.py::test_config_base_aliasfield PASSED [ 66%] unit/runtime/test_ds_config_model.py::test_config_base PASSED [ 77%] unit/runtime/test_ds_config_model.py::test_config_base_deprecatedfield PASSED [ 88%] unit/runtime/test_ds_config_model.py::test_only_required_fields PASSED [100%] =============================== warnings summary =============================== unit/runtime/test_ds_config_model.py::test_config_base_deprecatedfail /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/test_ds_config_model.py::test_config_base_deprecatedfail /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (27 durations < 1s hidden. Use -vv to show these durations.) ======================== 9 passed, 2 warnings in 3.78s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=757372250 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... 
collected 39 items unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupDecayLR-params1] SKIPPED [ 2%] unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupLR-params0] SKIPPED [ 5%] unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[LRRangeTest-params3] SKIPPED [ 7%] unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[OneCycle-params2] SKIPPED [ 10%] unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupLR-params0] SKIPPED [ 12%] unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[OneCycle-params2] SKIPPED [ 15%] unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupDecayLR-params1] SKIPPED [ 17%] unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[LRRangeTest-params3] SKIPPED [ 20%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-15] SKIPPED [ 23%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-33] SKIPPED [ 25%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-10] SKIPPED [ 28%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-10] SKIPPED [ 30%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-10] SKIPPED [ 33%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-19] SKIPPED [ 35%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-15] SKIPPED [ 38%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-15] SKIPPED [ 41%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-33] SKIPPED [ 43%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-33] SKIPPED [ 46%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-19] SKIPPED [ 48%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-15] SKIPPED [ 51%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-19] SKIPPED [ 53%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-10] SKIPPED [ 56%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-33] SKIPPED [ 58%] unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-19] SKIPPED [ 61%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-101] SKIPPED [ 64%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-101] SKIPPED [ 66%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-210] SKIPPED [ 69%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-100] SKIPPED [ 71%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.1-0-10-0] SKIPPED [ 74%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-211] SKIPPED [ 76%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-100] SKIPPED [ 79%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[0.001-0.1-0-21-21] SKIPPED [ 82%] unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[0.001-0.1-0.1-21-21] SKIPPED [ 84%] unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-0.001-10-True] SKIPPED [ 87%] unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-True] SKIPPED [ 89%] 
unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-1e-05-1-True] SKIPPED [ 92%] unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-False] SKIPPED [ 94%] unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.001-0.001-10-False] SKIPPED [ 97%] unit/runtime/test_lr_schedulers.py::TestLrRange::test[1e-05-1e-05-1-False] SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupDecayLR-params1] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupDecayLR-params1] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (117 durations < 1s hidden. Use -vv to show these durations.) ======================= 39 skipped, 3 warnings in 3.64s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3416726566 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 4 items unit/runtime/test_runtime_utils.py::TestClibGradNorm::test SKIPPED (...) [ 25%] unit/runtime/test_runtime_utils.py::TestCheckOverflow::test[True] SKIPPED [ 50%] unit/runtime/test_runtime_utils.py::TestCheckOverflow::test[False] SKIPPED [ 75%] unit/runtime/test_runtime_utils.py::test_call_to_str PASSED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/runtime/test_runtime_utils.py::TestClibGradNorm::test /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/runtime/test_runtime_utils.py::TestClibGradNorm::test /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (12 durations < 1s hidden. Use -vv to show these durations.) 
=================== 1 passed, 3 skipped, 3 warnings in 3.59s =================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=3816794595 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 2 items unit/utils/test_init_on_device.py::TestOnDevice::test_on_device[cuda:0] SKIPPED [ 50%] unit/utils/test_init_on_device.py::TestOnDevice::test_on_device[meta] SKIPPED [100%] =============================== warnings summary =============================== ../../../../usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8 /usr/local/lib/python3.7/site-packages/_pytest/fixtures.py:8: PytestDeprecationWarning: A private pytest class or function was used. from collections import deque unit/utils/test_init_on_device.py::TestOnDevice::test_on_device[cuda:0] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/utils/test_init_on_device.py::TestOnDevice::test_on_device[cuda:0] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (6 durations < 1s hidden. Use -vv to show these durations.) ======================== 2 skipped, 3 warnings in 3.50s ======================== ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=2531022313 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 4 items unit/utils/test_get_optim_files.py::test_get_optim_files[24] PASSED [ 25%] unit/utils/test_get_optim_files.py::test_get_optim_files[2] PASSED [ 50%] unit/utils/test_get_optim_files.py::test_get_optim_files[1] PASSED [ 75%] unit/utils/test_get_optim_files.py::test_get_optim_files[12] PASSED [100%] =============================== warnings summary =============================== unit/utils/test_get_optim_files.py::test_get_optim_files[24] /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/utils/test_get_optim_files.py::test_get_optim_files[24] /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (12 durations < 1s hidden. Use -vv to show these durations.) 
======================== 4 passed, 2 warnings in 3.60s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=591940970 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 1 item unit/utils/test_groups.py::test_get_expert_parallel_ranks PASSED [100%] =============================== warnings summary =============================== unit/utils/test_groups.py::test_get_expert_parallel_ranks /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") unit/utils/test_groups.py::test_get_expert_parallel_ranks /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (3 durations < 1s hidden. Use -vv to show these durations.) ======================== 1 passed, 2 warnings in 3.64s ========================= ============================= test session starts ============================== platform linux -- Python 3.7.12, pytest-7.3.1, pluggy-1.0.0 -- /usr/local/bin/python3.7 cachedir: .pytest_cache Using --randomly-seed=308506467 rootdir: /home/aishsh/ds-v0.9.2/tests configfile: pytest.ini plugins: xdist-3.2.1, randomly-3.12.0, forked-1.6.0 collecting ... collected 1 item accelerator/test_ds_init.py::test_literal_device FAILED [100%] =================================== FAILURES =================================== _____________________________ test_literal_device ______________________________ def test_literal_device(): model = OneLayerNet(128, 128)  os.environ['RANK'] = '0' os.environ['WORLD_SIZE'] = '1' os.environ['MASTER_ADDR'] = '127.0.0.1' os.environ['MASTER_PORT'] = '8088' os.environ['LOCAL_RANK'] = '0' > deepspeed.init_distributed(get_accelerator().communication_backend_name()) accelerator/test_ds_init.py:41: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/local/lib/python3.7/site-packages/deepspeed/comm/comm.py:624: in init_distributed cdb = TorchBackend(dist_backend, timeout, init_method, rank, world_size) /usr/local/lib/python3.7/site-packages/deepspeed/comm/torch.py:60: in __init__ self.init_process_group(backend, timeout, init_method, rank, world_size) /usr/local/lib/python3.7/site-packages/deepspeed/comm/torch.py:90: in init_process_group world_size=world_size) /usr/local/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py:769: in init_process_group timeout=timeout, _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ group_size = 1, group_rank = 0, global_ranks_in_group = [], backend = 'nccl' store = pg_options = group_name = '0', timeout = datetime.timedelta(seconds=1800) def _new_process_group_helper( group_size, group_rank, global_ranks_in_group, backend, store, pg_options=None, group_name=None, timeout=default_pg_timeout, ):  """  Create a new distributed process group.   
This function must be called by ALL processes in the global group, even if  the calling process is not part of the newly created group. In that case,  this function returns GroupMember.NON_GROUP_MEMBER.   This function is called with ``group_ranks == []`` for the default group.  """ global _pg_map global _group_count global _pg_names  if not group_name: group_name = str(_group_count) _group_count += 1  if group_name in _pg_names.values(): raise RuntimeError( "The specified group name has already been " "created, please use a different group name" )  if not isinstance(timeout, timedelta): raise RuntimeError( "Expected timeout argument to be of type" "datetime.timedelta" )  # The list of group ranks is empty if we're creating the default group. is_default_group = len(global_ranks_in_group) == 0  backend = Backend(backend) pg: Union[ProcessGroupGloo, ProcessGroupMPI, ProcessGroupNCCL, ProcessGroupUCC] if backend == Backend.MPI: if not is_mpi_available(): raise RuntimeError( "Distributed package doesn't have MPI built in." " MPI is only included if you build PyTorch from" " source on a host that has MPI installed." ) pg = ProcessGroupMPI.create(global_ranks_in_group) if not pg: return GroupMember.NON_GROUP_MEMBER _pg_map[pg] = (Backend.MPI, None) _pg_names[pg] = group_name else: # If this is a subgroup (which means group_ranks is specified), # we check if the current process is a member of the new group. if not is_default_group: global_rank = _get_default_group().rank() if global_rank not in global_ranks_in_group: return GroupMember.NON_GROUP_MEMBER  # Use the group name as prefix in the default store, such that # a single store can be reused by multiple groups. prefix_store = PrefixStore(group_name, store)  if backend == Backend.GLOO: if pg_options is not None: raise RuntimeError("GLOO options not supported") pg = ProcessGroupGloo(prefix_store, group_rank, group_size, timeout=timeout) # In debug mode and if GLOO is available, wrap in a wrapper PG that # enables enhanced collective checking for debugability. if get_debug_level() == DebugLevel.DETAIL: if not _GLOO_AVAILABLE: logger.info(  """TORCH_DISTRIBUTED_DEBUG was set to DETAIL, but  GLOO is not available. Build with Gloo to  create a wrapper process group in debug mode  to aid collective desynchronization debugging.""" ) else: pg = _create_process_group_wrapper( wrapped_pg=pg, store_prefix=group_name, store=store, rank=group_rank, world_size=group_size, timeout=timeout, ) _pg_map[pg] = (Backend.GLOO, store) _pg_names[pg] = group_name elif backend == Backend.NCCL: if not is_nccl_available(): raise RuntimeError("Distributed package doesn't have NCCL " "built in") if pg_options is not None: assert isinstance( pg_options, ProcessGroupNCCL.Options ), "Expected pg_options argument to be of type ProcessGroupNCCL.Options" else: # default pg_options for NCCL pg_options = ProcessGroupNCCL.Options() pg_options.is_high_priority_stream = False pg_options._timeout = timeout  > pg = ProcessGroupNCCL(prefix_store, group_rank, group_size, pg_options) E RuntimeError: ProcessGroupNCCL is only supported with GPUs, no GPUs found! 
/usr/local/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py:897: RuntimeError ----------------------------- Captured stdout call ----------------------------- [2023-05-27 06:42:29,727] [INFO] [comm.py:622:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl =============================== warnings summary =============================== accelerator/test_ds_init.py::test_literal_device /home/aishsh/ds-v0.9.2/tests/conftest.py:48: UserWarning: Running test without verifying torch version, please provide an expected torch version with --torch_ver "Running test without verifying torch version, please provide an expected torch version with --torch_ver") accelerator/test_ds_init.py::test_literal_device /home/aishsh/ds-v0.9.2/tests/conftest.py:55: UserWarning: Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver "Running test without verifying cuda version, please provide an expected cuda version with --cuda_ver") -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ============================== slowest durations =============================== (3 durations < 1s hidden. Use -vv to show these durations.) =========================== short test summary info ============================ FAILED accelerator/test_ds_init.py::test_literal_device - RuntimeError: ProcessGroupNCCL is only supported with GPUs, no GPUs found! ======================== 1 failed, 2 warnings in 4.18s =========================
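Note: the failure directly above comes from requesting the NCCL process-group backend on a host where torch sees no GPUs. Below is a minimal sketch, not part of the test suite, of how the same deepspeed.init_distributed path can be exercised on a CPU-only machine by falling back to the gloo backend; the environment values mirror the ones set in test_literal_device, and the gloo fallback is an assumption for illustration, not what the test itself does.

# Hypothetical standalone repro/workaround for the ProcessGroupNCCL failure above:
# pick a communication backend that matches the available hardware. NCCL needs
# visible CUDA devices; gloo works on CPU-only hosts.
import os

import torch
import deepspeed

os.environ.setdefault("RANK", "0")
os.environ.setdefault("WORLD_SIZE", "1")
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "8088")
os.environ.setdefault("LOCAL_RANK", "0")

backend = "nccl" if torch.cuda.is_available() else "gloo"
deepspeed.init_distributed(dist_backend=backend)
print("initialized deepspeed.comm with backend:", backend)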
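Note: earlier in this log, TestTensorFragment failed because the JIT-built cpu_adam extension could not be loaded (undefined symbol: __kmpc_for_static_fini, an OpenMP runtime symbol). A small sketch to reproduce that load outside pytest and surface the linker error directly, under the assumption that the op builder named in the traceback (CPUAdamBuilder) is importable from deepspeed.ops.op_builder:

# Hypothetical isolation of the cpu_adam load failure seen in the
# TestTensorFragment traceback: CPUAdamBuilder().load() JIT-builds/loads
# cpu_adam_op*.so, which is where the undefined-symbol ImportError is raised.
from deepspeed.ops.op_builder import CPUAdamBuilder  # assumption: same builder as in the traceback

builder = CPUAdamBuilder()
try:
    op_module = builder.load()
    print("cpu_adam loaded:", op_module)
except ImportError as err:
    # Typically means the extension was linked against a different OpenMP
    # runtime than the one available at load time.
    print("cpu_adam failed to load:", err)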
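Note: the TestReduceScatterCoalesced failures earlier in the log are different in kind: the parent pytest process clearly has torch, but the multiprocessing forkserver workers die with ModuleNotFoundError: No module named 'torch'. A stdlib-only sketch, assuming nothing about the test harness, to check whether forkserver children see the same interpreter and sys.path as the parent:

# Compare interpreter and import behaviour between the parent process and a
# forkserver child, matching the start method shown in the worker tracebacks.
import multiprocessing as mp
import sys


def probe():
    print("child executable:", sys.executable)
    print("child sys.path[:3]:", sys.path[:3])
    try:
        import torch
        print("child torch:", torch.__version__)
    except ModuleNotFoundError as err:
        print("child import failed:", err)


if __name__ == "__main__":
    print("parent executable:", sys.executable)
    ctx = mp.get_context("forkserver")
    child = ctx.Process(target=probe)
    child.start()
    child.join()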
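Note: the three conftest import failures in the middle of the log (libshm.so: cannot open shared object file, and No module named 'torch.amp') point at a broken or mismatched torch installation rather than at the tests themselves; the torch.amp error in particular suggests an older torch build is being picked up in that run. A small sketch, using the site-packages path printed in those tracebacks as an assumption, to check whether libshm.so is actually present in torch's bundled lib directory before attempting to import torch:

# Sanity check for the "libshm.so: cannot open shared object file" conftest
# errors: in Linux wheels, libshm.so normally ships inside torch/lib.
import os

TORCH_LIB = "/usr/local/lib/python3.7/site-packages/torch/lib"  # path taken from the tracebacks above

if os.path.isdir(TORCH_LIB):
    libs = sorted(os.listdir(TORCH_LIB))
    print("libshm.so present:", "libshm.so" in libs)
    print("first few bundled libs:", libs[:8])
else:
    print("torch lib directory not found:", TORCH_LIB)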