WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.694067 13004 ProcessGroupNCCL.cpp:835] [Rank 13] NCCL watchdog thread started! I1027 11:40:59.694059 12119 ProcessGroupNCCL.cpp:669] [Rank 13] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.689687 14514 ProcessGroupNCCL.cpp:835] [Rank 66] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.701745 16391 ProcessGroupNCCL.cpp:835] [Rank 38] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.695235 16070 ProcessGroupNCCL.cpp:835] [Rank 78] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.699540 17423 ProcessGroupNCCL.cpp:835] [Rank 50] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.701830 15702 ProcessGroupNCCL.cpp:835] [Rank 46] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.695412 6383 ProcessGroupNCCL.cpp:835] [Rank 82] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.701686 17105 ProcessGroupNCCL.cpp:835] [Rank 58] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.701151 2628 ProcessGroupNCCL.cpp:835] [Rank 70] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.696023 21980 ProcessGroupNCCL.cpp:835] [Rank 6] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.689921 15651 ProcessGroupNCCL.cpp:835] [Rank 86] NCCL watchdog thread started! 
WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.678339 14673 ProcessGroupNCCL.cpp:835] [Rank 26] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.690449 16988 ProcessGroupNCCL.cpp:835] [Rank 42] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.695075 6529 ProcessGroupNCCL.cpp:835] [Rank 30] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.695425 1875 ProcessGroupNCCL.cpp:669] [Rank 22] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.677758 6910 ProcessGroupNCCL.cpp:835] [Rank 10] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.696250 19750 ProcessGroupNCCL.cpp:669] [Rank 74] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.702153 18814 ProcessGroupNCCL.cpp:835] [Rank 54] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.699782 16112 ProcessGroupNCCL.cpp:669] [Rank 90] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.696225 20964 ProcessGroupNCCL.cpp:835] [Rank 18] NCCL watchdog thread started! 
WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.702178 19957 ProcessGroupNCCL.cpp:669] [Rank 34] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.689674 13567 ProcessGroupNCCL.cpp:669] [Rank 66] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.701737 15593 ProcessGroupNCCL.cpp:669] [Rank 38] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.695227 15274 ProcessGroupNCCL.cpp:669] [Rank 78] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.699527 16693 ProcessGroupNCCL.cpp:669] [Rank 50] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.701823 14882 ProcessGroupNCCL.cpp:669] [Rank 46] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.695387 5793 ProcessGroupNCCL.cpp:669] [Rank 82] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.701678 16266 ProcessGroupNCCL.cpp:669] [Rank 58] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:40:59.701143 2012 ProcessGroupNCCL.cpp:669] [Rank 70] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.696029 21054 ProcessGroupNCCL.cpp:669] [Rank 6] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.689911 14827 ProcessGroupNCCL.cpp:669] [Rank 86] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.678334 13847 ProcessGroupNCCL.cpp:669] [Rank 26] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.690456 16148 ProcessGroupNCCL.cpp:669] [Rank 42] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.695065 5835 ProcessGroupNCCL.cpp:669] [Rank 30] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.695485 2521 ProcessGroupNCCL.cpp:835] [Rank 22] NCCL watchdog thread started! I1027 11:40:59.677734 6029 ProcessGroupNCCL.cpp:669] [Rank 10] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.696282 20639 ProcessGroupNCCL.cpp:835] [Rank 74] NCCL watchdog thread started!
I1027 11:40:59.702147 18005 ProcessGroupNCCL.cpp:669] [Rank 54] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.699805 16959 ProcessGroupNCCL.cpp:835] [Rank 90] NCCL watchdog thread started! I1027 11:40:59.696215 20256 ProcessGroupNCCL.cpp:669] [Rank 18] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.702189 20824 ProcessGroupNCCL.cpp:835] [Rank 34] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.690634 16989 ProcessGroupNCCL.cpp:835] [Rank 43] NCCL watchdog thread started! I1027 11:40:59.690613 16146 ProcessGroupNCCL.cpp:669] [Rank 43] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.699772 23262 ProcessGroupNCCL.cpp:835] [Rank 62] NCCL watchdog thread started! I1027 11:40:59.699754 22359 ProcessGroupNCCL.cpp:669] [Rank 62] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.699798 23263 ProcessGroupNCCL.cpp:835] [Rank 61] NCCL watchdog thread started! 
I1027 11:40:59.699793 22361 ProcessGroupNCCL.cpp:669] [Rank 61] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.704620 6273 ProcessGroupNCCL.cpp:835] [Rank 94] NCCL watchdog thread started! I1027 11:40:59.704631 5555 ProcessGroupNCCL.cpp:669] [Rank 94] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.706770 13006 ProcessGroupNCCL.cpp:835] [Rank 14] NCCL watchdog thread started! I1027 11:40:59.706761 12117 ProcessGroupNCCL.cpp:669] [Rank 14] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.901784 18816 ProcessGroupNCCL.cpp:835] [Rank 55] NCCL watchdog thread started! I1027 11:40:59.901772 18006 ProcessGroupNCCL.cpp:669] [Rank 55] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896070 20257 ProcessGroupNCCL.cpp:669] [Rank 19] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.896072 20966 ProcessGroupNCCL.cpp:835] [Rank 19] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.878460 14675 ProcessGroupNCCL.cpp:835] [Rank 27] NCCL watchdog thread started! 
I1027 11:40:59.878463 13849 ProcessGroupNCCL.cpp:669] [Rank 27] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.890164 14516 ProcessGroupNCCL.cpp:835] [Rank 67] NCCL watchdog thread started! I1027 11:40:59.890167 13569 ProcessGroupNCCL.cpp:669] [Rank 67] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.901363 2011 ProcessGroupNCCL.cpp:669] [Rank 71] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.901387 2630 ProcessGroupNCCL.cpp:835] [Rank 71] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.900033 16692 ProcessGroupNCCL.cpp:669] [Rank 51] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.900041 17425 ProcessGroupNCCL.cpp:835] [Rank 51] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.890388 15653 ProcessGroupNCCL.cpp:835] [Rank 87] NCCL watchdog thread started! 
I1027 11:40:59.890359 14824 ProcessGroupNCCL.cpp:669] [Rank 87] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896361 21052 ProcessGroupNCCL.cpp:669] [Rank 7] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.902295 19958 ProcessGroupNCCL.cpp:669] [Rank 35] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.902237 17107 ProcessGroupNCCL.cpp:835] [Rank 59] NCCL watchdog thread started! I1027 11:40:59.896369 21982 ProcessGroupNCCL.cpp:835] [Rank 7] NCCL watchdog thread started! I1027 11:40:59.902320 20826 ProcessGroupNCCL.cpp:835] [Rank 35] NCCL watchdog thread started! I1027 11:40:59.902222 16263 ProcessGroupNCCL.cpp:669] [Rank 59] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896574 19751 ProcessGroupNCCL.cpp:669] [Rank 75] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.895948 16072 ProcessGroupNCCL.cpp:835] [Rank 79] NCCL watchdog thread started! I1027 11:40:59.896596 20641 ProcessGroupNCCL.cpp:835] [Rank 75] NCCL watchdog thread started! 
I1027 11:40:59.895939 15275 ProcessGroupNCCL.cpp:669] [Rank 79] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896593 6275 ProcessGroupNCCL.cpp:835] [Rank 95] NCCL watchdog thread started! I1027 11:40:59.896600 5556 ProcessGroupNCCL.cpp:669] [Rank 95] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896472 13023 ProcessGroupNCCL.cpp:835] [Rank 15] NCCL watchdog thread started! I1027 11:40:59.896476 12118 ProcessGroupNCCL.cpp:669] [Rank 15] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896719 30864 ProcessGroupNCCL.cpp:835] [Rank 2] NCCL watchdog thread started! I1027 11:40:59.896713 29772 ProcessGroupNCCL.cpp:669] [Rank 2] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896756 1873 ProcessGroupNCCL.cpp:669] [Rank 23] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.879051 6912 ProcessGroupNCCL.cpp:835] [Rank 11] NCCL watchdog thread started! I1027 11:40:59.896766 2524 ProcessGroupNCCL.cpp:835] [Rank 23] NCCL watchdog thread started! 
I1027 11:40:59.879048 6030 ProcessGroupNCCL.cpp:669] [Rank 11] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.903501 15704 ProcessGroupNCCL.cpp:835] [Rank 47] NCCL watchdog thread started! I1027 11:40:59.903491 14883 ProcessGroupNCCL.cpp:669] [Rank 47] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.903488 16393 ProcessGroupNCCL.cpp:835] [Rank 39] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.897190 22358 ProcessGroupNCCL.cpp:669] [Rank 63] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.903478 15590 ProcessGroupNCCL.cpp:669] [Rank 39] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.897241 23265 ProcessGroupNCCL.cpp:835] [Rank 63] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.897130 6385 ProcessGroupNCCL.cpp:835] [Rank 83] NCCL watchdog thread started! I1027 11:40:59.897117 5796 ProcessGroupNCCL.cpp:669] [Rank 83] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.901348 16961 ProcessGroupNCCL.cpp:835] [Rank 91] NCCL watchdog thread started! 
WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:40:59.896620 6532 ProcessGroupNCCL.cpp:835] [Rank 31] NCCL watchdog thread started! I1027 11:40:59.901336 16114 ProcessGroupNCCL.cpp:669] [Rank 91] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:40:59.896611 5836 ProcessGroupNCCL.cpp:669] [Rank 31] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.688014 15656 ProcessGroupNCCL.cpp:835] [Rank 84] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.688114 14535 ProcessGroupNCCL.cpp:835] [Rank 64] NCCL watchdog thread started! I1027 11:41:00.688097 13566 ProcessGroupNCCL.cpp:669] [Rank 64] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.687999 14825 ProcessGroupNCCL.cpp:669] [Rank 84] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.688239 14536 ProcessGroupNCCL.cpp:835] [Rank 65] NCCL watchdog thread started! I1027 11:41:00.688216 13568 ProcessGroupNCCL.cpp:669] [Rank 65] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.697958 17004 ProcessGroupNCCL.cpp:835] [Rank 88] NCCL watchdog thread started! 
I1027 11:41:00.697953 16113 ProcessGroupNCCL.cpp:669] [Rank 88] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.698066 17005 ProcessGroupNCCL.cpp:835] [Rank 89] NCCL watchdog thread started! I1027 11:41:00.698074 16115 ProcessGroupNCCL.cpp:669] [Rank 89] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.688750 15658 ProcessGroupNCCL.cpp:835] [Rank 85] NCCL watchdog thread started! I1027 11:41:00.688750 14826 ProcessGroupNCCL.cpp:669] [Rank 85] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.694737 21985 ProcessGroupNCCL.cpp:835] [Rank 5] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.694449 23268 ProcessGroupNCCL.cpp:835] [Rank 60] NCCL watchdog thread started! 
I1027 11:41:00.694729 21051 ProcessGroupNCCL.cpp:669] [Rank 5] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.694425 22360 ProcessGroupNCCL.cpp:669] [Rank 60] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.700575 19959 ProcessGroupNCCL.cpp:669] [Rank 32] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.700656 20844 ProcessGroupNCCL.cpp:835] [Rank 32] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.700824 15707 ProcessGroupNCCL.cpp:835] [Rank 44] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.677176 14677 ProcessGroupNCCL.cpp:835] [Rank 24] NCCL watchdog thread started! I1027 11:41:00.677165 13850 ProcessGroupNCCL.cpp:669] [Rank 24] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.694236 2526 ProcessGroupNCCL.cpp:835] [Rank 20] NCCL watchdog thread started! 
I1027 11:41:00.700820 14885 ProcessGroupNCCL.cpp:669] [Rank 44] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.694229 1872 ProcessGroupNCCL.cpp:669] [Rank 20] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.676532 6032 ProcessGroupNCCL.cpp:669] [Rank 8] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.698663 16695 ProcessGroupNCCL.cpp:669] [Rank 48] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.698670 17428 ProcessGroupNCCL.cpp:835] [Rank 48] NCCL watchdog thread started! I1027 11:41:00.676607 6915 ProcessGroupNCCL.cpp:835] [Rank 8] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.695044 20968 ProcessGroupNCCL.cpp:835] [Rank 16] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.695091 21987 ProcessGroupNCCL.cpp:835] [Rank 4] NCCL watchdog thread started! 
I1027 11:41:00.695036 20254 ProcessGroupNCCL.cpp:669] [Rank 16] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.695093 21053 ProcessGroupNCCL.cpp:669] [Rank 4] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.705013 16075 ProcessGroupNCCL.cpp:835] [Rank 76] NCCL watchdog thread started! I1027 11:41:00.705024 15273 ProcessGroupNCCL.cpp:669] [Rank 76] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.900769 18819 ProcessGroupNCCL.cpp:835] [Rank 52] NCCL watchdog thread started! I1027 11:41:00.900777 18003 ProcessGroupNCCL.cpp:669] [Rank 52] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.900789 2633 ProcessGroupNCCL.cpp:835] [Rank 68] NCCL watchdog thread started! I1027 11:41:00.900777 2010 ProcessGroupNCCL.cpp:669] [Rank 68] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.895587 6278 ProcessGroupNCCL.cpp:835] [Rank 92] NCCL watchdog thread started! 
I1027 11:41:00.895591 5554 ProcessGroupNCCL.cpp:669] [Rank 92] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.895395 13026 ProcessGroupNCCL.cpp:835] [Rank 12] NCCL watchdog thread started! I1027 11:41:00.895373 12120 ProcessGroupNCCL.cpp:669] [Rank 12] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.901515 17121 ProcessGroupNCCL.cpp:835] [Rank 56] NCCL watchdog thread started! I1027 11:41:00.901512 16264 ProcessGroupNCCL.cpp:669] [Rank 56] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.894690 6570 ProcessGroupNCCL.cpp:835] [Rank 28] NCCL watchdog thread started! I1027 11:41:00.894701 5834 ProcessGroupNCCL.cpp:669] [Rank 28] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.895599 29771 ProcessGroupNCCL.cpp:669] [Rank 1] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.895607 30867 ProcessGroupNCCL.cpp:835] [Rank 1] NCCL watchdog thread started! 
WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.901806 15591 ProcessGroupNCCL.cpp:669] [Rank 36] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.901808 16396 ProcessGroupNCCL.cpp:835] [Rank 36] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.899602 17430 ProcessGroupNCCL.cpp:835] [Rank 49] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.901721 17122 ProcessGroupNCCL.cpp:835] [Rank 57] NCCL watchdog thread started! I1027 11:41:00.899591 16694 ProcessGroupNCCL.cpp:669] [Rank 49] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.901712 16265 ProcessGroupNCCL.cpp:669] [Rank 57] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.896236 20650 ProcessGroupNCCL.cpp:835] [Rank 72] NCCL watchdog thread started! I1027 11:41:00.896224 19752 ProcessGroupNCCL.cpp:669] [Rank 72] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.895573 16077 ProcessGroupNCCL.cpp:835] [Rank 77] NCCL watchdog thread started! 
I1027 11:41:00.895565 15272 ProcessGroupNCCL.cpp:669] [Rank 77] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.896273 20651 ProcessGroupNCCL.cpp:835] [Rank 73] NCCL watchdog thread started! I1027 11:41:00.896270 19749 ProcessGroupNCCL.cpp:669] [Rank 73] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.901351 2636 ProcessGroupNCCL.cpp:835] [Rank 69] NCCL watchdog thread started! I1027 11:41:00.901343 2009 ProcessGroupNCCL.cpp:669] [Rank 69] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.877744 6922 ProcessGroupNCCL.cpp:835] [Rank 9] NCCL watchdog thread started! I1027 11:41:00.877739 6031 ProcessGroupNCCL.cpp:669] [Rank 9] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.895627 1874 ProcessGroupNCCL.cpp:669] [Rank 21] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0 I1027 11:41:00.895638 2528 ProcessGroupNCCL.cpp:835] [Rank 21] NCCL watchdog thread started! WARNING: Logging before InitGoogleLogging() is written to STDERR I1027 11:41:00.895241 6572 ProcessGroupNCCL.cpp:835] [Rank 29] NCCL watchdog thread started! 
I1027 11:41:00.895221 5833 ProcessGroupNCCL.cpp:669] [Rank 29] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.902349 14884 ProcessGroupNCCL.cpp:669] [Rank 45] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.902395 15709 ProcessGroupNCCL.cpp:835] [Rank 45] NCCL watchdog thread started!
I1027 11:41:00.902419 18821 ProcessGroupNCCL.cpp:835] [Rank 53] NCCL watchdog thread started!
I1027 11:41:00.902426 18004 ProcessGroupNCCL.cpp:669] [Rank 53] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.902416 20846 ProcessGroupNCCL.cpp:835] [Rank 33] NCCL watchdog thread started!
I1027 11:41:00.902411 19960 ProcessGroupNCCL.cpp:669] [Rank 33] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.890990 16999 ProcessGroupNCCL.cpp:835] [Rank 41] NCCL watchdog thread started!
I1027 11:41:00.890982 16145 ProcessGroupNCCL.cpp:669] [Rank 41] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.896215 5794 ProcessGroupNCCL.cpp:669] [Rank 80] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.896296 6388 ProcessGroupNCCL.cpp:835] [Rank 80] NCCL watchdog thread started!
I1027 11:41:00.897186 29773 ProcessGroupNCCL.cpp:669] [Rank 3] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.897205 30869 ProcessGroupNCCL.cpp:835] [Rank 3] NCCL watchdog thread started!
I1027 11:41:00.897157 6390 ProcessGroupNCCL.cpp:835] [Rank 81] NCCL watchdog thread started!
I1027 11:41:00.897150 5795 ProcessGroupNCCL.cpp:669] [Rank 81] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.892146 17001 ProcessGroupNCCL.cpp:835] [Rank 40] NCCL watchdog thread started!
I1027 11:41:00.897753 6280 ProcessGroupNCCL.cpp:835] [Rank 93] NCCL watchdog thread started!
I1027 11:41:00.892140 16147 ProcessGroupNCCL.cpp:669] [Rank 40] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.897747 5553 ProcessGroupNCCL.cpp:669] [Rank 93] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:00.897836 20971 ProcessGroupNCCL.cpp:835] [Rank 17] NCCL watchdog thread started!
I1027 11:41:00.897830 20255 ProcessGroupNCCL.cpp:669] [Rank 17] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:01.677418 13848 ProcessGroupNCCL.cpp:669] [Rank 25] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:01.677438 14968 ProcessGroupNCCL.cpp:835] [Rank 25] NCCL watchdog thread started!
I1027 11:41:01.702608 16478 ProcessGroupNCCL.cpp:835] [Rank 37] NCCL watchdog thread started!
I1027 11:41:01.702615 15592 ProcessGroupNCCL.cpp:669] [Rank 37] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
I1027 11:41:01.701503 31064 ProcessGroupNCCL.cpp:835] [Rank 0] NCCL watchdog thread started!
I1027 11:41:01.701501 29770 ProcessGroupNCCL.cpp:669] [Rank 0] ProcessGroupNCCL initialized with following options: NCCL_ASYNC_ERROR_HANDLING: 0 NCCL_DESYNC_DEBUG: 0 NCCL_BLOCKING_WAIT: 0 TIMEOUT(ms): 1800000 USE_HIGH_PRIORITY_STREAM: 0
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
  File "../fine-tune.py", line 159, in <module>
    train()
  File "../fine-tune.py", line 120, in train
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 685, in from_pretrained
    return super(BaichuanForCausalLM, cls).from_pretrained(pretrained_model_name_or_path, *model_args,
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2629, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 555, in __init__
    self.model = BaichuanModel(config)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 356, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 141, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs))
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 229, in wrapped_fn
    tensor: Tensor = fn(*args, **kwargs)
RuntimeError: HIP error: initialization error
HIP kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "../fine-tune.py", line 159, in <module>
    train()
  File "../fine-tune.py", line 120, in train
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 685, in from_pretrained
    return super(BaichuanForCausalLM, cls).from_pretrained(pretrained_model_name_or_path, *model_args,
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2629, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 555, in __init__
    self.model = BaichuanModel(config)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 356, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 141, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs))
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 229, in wrapped_fn
    tensor: Tensor = fn(*args, **kwargs)
RuntimeError: HIP error: initialization error
HIP kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
I1027 11:41:27.398515 14536 ProcessGroupNCCL.cpp:837] [Rank 65] NCCL watchdog thread terminated normally
I1027 11:41:27.399645 14535 ProcessGroupNCCL.cpp:837] [Rank 64] NCCL watchdog thread terminated normally
Traceback (most recent call last):
  File "../fine-tune.py", line 159, in <module>
    train()
  File "../fine-tune.py", line 120, in train
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 685, in from_pretrained
    return super(BaichuanForCausalLM, cls).from_pretrained(pretrained_model_name_or_path, *model_args,
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2629, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 555, in __init__
    self.model = BaichuanModel(config)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 356, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 141, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs))
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 229, in wrapped_fn
    tensor: Tensor = fn(*args, **kwargs)
RuntimeError: HIP error: initialization error
HIP kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
I1027 11:41:28.484403 14514 ProcessGroupNCCL.cpp:837] [Rank 66] NCCL watchdog thread terminated normally
Traceback (most recent call last):
  File "../fine-tune.py", line 159, in <module>
    train()
  File "../fine-tune.py", line 120, in train
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 685, in from_pretrained
    return super(BaichuanForCausalLM, cls).from_pretrained(pretrained_model_name_or_path, *model_args,
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2629, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 555, in __init__
    self.model = BaichuanModel(config)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 356, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 141, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs))
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 229, in wrapped_fn
    tensor: Tensor = fn(*args, **kwargs)
RuntimeError: HIP error: initialization error
HIP kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
I1027 11:41:29.061339 14968 ProcessGroupNCCL.cpp:837] [Rank 25] NCCL watchdog thread terminated normally
Traceback (most recent call last):
  File "../fine-tune.py", line 159, in <module>
    train()
  File "../fine-tune.py", line 120, in train
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 685, in from_pretrained
    return super(BaichuanForCausalLM, cls).from_pretrained(pretrained_model_name_or_path, *model_args,
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2629, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 555, in __init__
    self.model = BaichuanModel(config)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 356, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 141, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs))
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 229, in wrapped_fn
    tensor: Tensor = fn(*args, **kwargs)
RuntimeError: HIP error: initialization error
HIP kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
I1027 11:41:29.699555 14673 ProcessGroupNCCL.cpp:837] [Rank 26] NCCL watchdog thread terminated normally
Traceback (most recent call last):
  File "../fine-tune.py", line 159, in <module>
    train()
  File "../fine-tune.py", line 120, in train
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 685, in from_pretrained
    return super(BaichuanForCausalLM, cls).from_pretrained(pretrained_model_name_or_path, *model_args,
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2629, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 555, in __init__
    self.model = BaichuanModel(config)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/.cache/huggingface/modules/transformers_modules/baichuan2-7b-base/modeling_baichuan.py", line 356, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 382, in wrapper
    f(module, *args, **kwargs)
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 141, in __init__
    self.weight = Parameter(torch.empty((num_embeddings, embedding_dim), **factory_kwargs))
  File "/public/home/zhaoying1/anaconda3/envs/baichuan/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py", line 229, in wrapped_fn
    tensor: Tensor = fn(*args, **kwargs)
RuntimeError: HIP error: initialization error
HIP kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
I1027 11:41:31.941821 14677 ProcessGroupNCCL.cpp:837] [Rank 24] NCCL watchdog thread terminated normally
slurmstepd: error: *** JOB 45668680 ON b17r3n15 CANCELLED AT 2023-10-31T16:57:25 ***