logs-z100.txt 200 KB
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1017 06:19:56.048682 13361 ProcessGroupNCCL.cpp:500] [Rank 0] ProcessGroupNCCL initialized with following options:
NCCL_ASYNC_ERROR_HANDLING: 1
NCCL_BLOCKING_WAIT: 0
TIMEOUT(ms): 3600000
USE_HIGH_PRIORITY_STREAM: 0
NCCL_DEBUG: UNSET
I1017 06:19:56.048684 13430 ProcessGroupNCCL.cpp:601] [Rank 0] NCCL watchdog thread started!
10/17/2022 06:19:56 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: True, 16-bits training: False
10/17/2022 06:19:57 - INFO - __main__ - classifier: token
hidden_size: 768
patches:
  size: !!python/tuple
  - 16
  - 16
representation_size: null
transformer:
  attention_dropout_rate: 0.0
  dropout_rate: 0.1
  mlp_dim: 3072
  num_heads: 12
  num_layers: 12

10/17/2022 06:19:57 - INFO - __main__ - Training parameters Namespace(dataset='cifar10', decay_type='cosine', device=device(type='cuda', index=0), eval_batch_size=64, eval_every=100, fp16=False, fp16_opt_level='O2', gradient_accumulation_steps=1, img_size=224, learning_rate=0.03, local_rank=0, loss_scale=0, max_grad_norm=1.0, model_type='ViT-B_16', n_gpu=1, name='cifar10-100_500', num_steps=500, output_dir='output', pretrained_dir='checkpoint/ViT-B_16.npz', seed=42, train_batch_size=64, warmup_steps=500, weight_decay=0)
10/17/2022 06:19:57 - INFO - __main__ - Total Parameter: 	85.8M
85.806346
Files already downloaded and verified
Files already downloaded and verified
I1017 06:19:59.279080 13361 ProcessGroupNCCL.cpp:1669] Rank 0 using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
10/17/2022 06:19:59 - INFO - __main__ - ***** Running training *****
10/17/2022 06:19:59 - INFO - __main__ -   Total optimization steps = 500
10/17/2022 06:19:59 - INFO - __main__ -   Instantaneous batch size per GPU = 64
10/17/2022 06:19:59 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 64
10/17/2022 06:19:59 - INFO - __main__ -   Gradient Accumulation steps = 1

Training (X / X Steps) (loss=X.X):   0%|| 0/782 [00:00<?, ?it/s]
/usr/local/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)

Training (1 / 500 Steps) (loss=2.47201):   0%|| 0/782 [00:11<?, ?it/s]
/usr/local/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:247: UserWarning: To get the last learning rate computed by the scheduler, please use `get_last_lr()`.
  warnings.warn("To get the last learning rate computed by the scheduler, "

Training (1 / 500 Steps) (loss=2.47201):   0%|| 1/782 [00:11<2:25:48, 11.20s/it]
Training (2 / 500 Steps) (loss=2.35605):   0%|| 1/782 [00:12<2:25:48, 11.20s/it]
Training (2 / 500 Steps) (loss=2.35605):   0%|| 2/782 [00:12<1:08:18,  5.25s/it]
Training (3 / 500 Steps) (loss=2.32266):   0%|| 2/782 [00:13<1:08:18,  5.25s/it]
Training (3 / 500 Steps) (loss=2.32266):   0%|| 3/782 [00:13<43:16,  3.33s/it]  
Training (4 / 500 Steps) (loss=2.42154):   0%|| 3/782 [00:14<43:16,  3.33s/it]
Training (4 / 500 Steps) (loss=2.42154):   1%|| 4/782 [00:14<31:31,  2.43s/it]
Training (5 / 500 Steps) (loss=2.33781):   1%|| 4/782 [00:15<31:31,  2.43s/it]
Training (5 / 500 Steps) (loss=2.33781):   1%|| 5/782 [00:15<25:02,  1.93s/it]
Training (6 / 500 Steps) (loss=2.35648):   1%|| 5/782 [00:16<25:02,  1.93s/it]
Training (6 / 500 Steps) (loss=2.35648):   1%|| 6/782 [00:16<21:09,  1.64s/it]
Training (7 / 500 Steps) (loss=2.44519):   1%|| 6/782 [00:17<21:09,  1.64s/it]
Training (7 / 500 Steps) (loss=2.44519):   1%|| 7/782 [00:17<18:40,  1.45s/it]
Training (8 / 500 Steps) (loss=2.40249):   1%|| 7/782 [00:18<18:40,  1.45s/it]
Training (8 / 500 Steps) (loss=2.40249):   1%|| 8/782 [00:18<17:03,  1.32s/it]
Training (9 / 500 Steps) (loss=2.36042):   1%|| 8/782 [00:19<17:03,  1.32s/it]
Training (9 / 500 Steps) (loss=2.36042):   1%|| 9/782 [00:19<15:57,  1.24s/it]
Training (10 / 500 Steps) (loss=2.39406):   1%|| 9/782 [00:20<15:57,  1.24s/it]
Training (10 / 500 Steps) (loss=2.39406):   1%|| 10/782 [00:20<15:17,  1.19s/it]
Training (11 / 500 Steps) (loss=2.47035):   1%|| 10/782 [00:21<15:17,  1.19s/it]
Training (11 / 500 Steps) (loss=2.47035):   1%|| 11/782 [00:21<14:44,  1.15s/it]
Training (12 / 500 Steps) (loss=2.31589):   1%|| 11/782 [00:22<14:44,  1.15s/it]
Training (12 / 500 Steps) (loss=2.31589):   2%|| 12/782 [00:22<14:23,  1.12s/it]
Training (13 / 500 Steps) (loss=2.34474):   2%|| 12/782 [00:23<14:23,  1.12s/it]
Training (13 / 500 Steps) (loss=2.34474):   2%|| 13/782 [00:23<14:05,  1.10s/it]
Training (14 / 500 Steps) (loss=2.40814):   2%|| 13/782 [00:24<14:05,  1.10s/it]
Training (14 / 500 Steps) (loss=2.40814):   2%|| 14/782 [00:24<13:55,  1.09s/it]
Training (15 / 500 Steps) (loss=2.34197):   2%|| 14/782 [00:26<13:55,  1.09s/it]
Training (15 / 500 Steps) (loss=2.34197):   2%|| 15/782 [00:26<13:45,  1.08s/it]
Training (16 / 500 Steps) (loss=2.39720):   2%|| 15/782 [00:27<13:45,  1.08s/it]
Training (16 / 500 Steps) (loss=2.39720):   2%|| 16/782 [00:27<13:38,  1.07s/it]
Training (17 / 500 Steps) (loss=2.38188):   2%|| 16/782 [00:28<13:38,  1.07s/it]
Training (17 / 500 Steps) (loss=2.38188):   2%|| 17/782 [00:28<13:32,  1.06s/it]
Training (18 / 500 Steps) (loss=2.32166):   2%|| 17/782 [00:29<13:32,  1.06s/it]
Training (18 / 500 Steps) (loss=2.32166):   2%|| 18/782 [00:29<13:31,  1.06s/it]
Training (19 / 500 Steps) (loss=2.35304):   2%|| 18/782 [00:30<13:31,  1.06s/it]
Training (19 / 500 Steps) (loss=2.35304):   2%|| 19/782 [00:30<13:28,  1.06s/it]
Training (20 / 500 Steps) (loss=2.17871):   2%|| 19/782 [00:31<13:28,  1.06s/it]
Training (20 / 500 Steps) (loss=2.17871):   3%|| 20/782 [00:31<13:26,  1.06s/it]
Training (21 / 500 Steps) (loss=2.28664):   3%|| 20/782 [00:32<13:26,  1.06s/it]
Training (21 / 500 Steps) (loss=2.28664):   3%|| 21/782 [00:32<13:25,  1.06s/it]
Training (22 / 500 Steps) (loss=2.30553):   3%|| 21/782 [00:33<13:25,  1.06s/it]
Training (22 / 500 Steps) (loss=2.30553):   3%|| 22/782 [00:33<13:23,  1.06s/it]
Training (23 / 500 Steps) (loss=2.28080):   3%|| 22/782 [00:34<13:23,  1.06s/it]
Training (23 / 500 Steps) (loss=2.28080):   3%|| 23/782 [00:34<13:21,  1.06s/it]
Training (24 / 500 Steps) (loss=2.28925):   3%|| 23/782 [00:35<13:21,  1.06s/it]
Training (24 / 500 Steps) (loss=2.28925):   3%|| 24/782 [00:35<13:18,  1.05s/it]
Training (25 / 500 Steps) (loss=2.27858):   3%|| 24/782 [00:36<13:18,  1.05s/it]
Training (25 / 500 Steps) (loss=2.27858):   3%|| 25/782 [00:36<13:18,  1.05s/it]
Training (26 / 500 Steps) (loss=2.26885):   3%|| 25/782 [00:37<13:18,  1.05s/it]
Training (26 / 500 Steps) (loss=2.26885):   3%|| 26/782 [00:37<13:17,  1.05s/it]
Training (27 / 500 Steps) (loss=2.10145):   3%|| 26/782 [00:38<13:17,  1.05s/it]
Training (27 / 500 Steps) (loss=2.10145):   3%|| 27/782 [00:38<13:15,  1.05s/it]
Training (28 / 500 Steps) (loss=2.27403):   3%|| 27/782 [00:39<13:15,  1.05s/it]
Training (28 / 500 Steps) (loss=2.27403):   4%|| 28/782 [00:39<13:13,  1.05s/it]
Training (29 / 500 Steps) (loss=2.16507):   4%|| 28/782 [00:40<13:13,  1.05s/it]
Training (29 / 500 Steps) (loss=2.16507):   4%|| 29/782 [00:40<13:11,  1.05s/it]
Training (30 / 500 Steps) (loss=2.12302):   4%|| 29/782 [00:41<13:11,  1.05s/it]
Training (30 / 500 Steps) (loss=2.12302):   4%|| 30/782 [00:41<13:10,  1.05s/it]
Training (31 / 500 Steps) (loss=2.33195):   4%|| 30/782 [00:42<13:10,  1.05s/it]
Training (31 / 500 Steps) (loss=2.33195):   4%|| 31/782 [00:42<13:10,  1.05s/it]
Training (32 / 500 Steps) (loss=2.18868):   4%|| 31/782 [00:43<13:10,  1.05s/it]
Training (32 / 500 Steps) (loss=2.18868):   4%|| 32/782 [00:43<13:08,  1.05s/it]
Training (33 / 500 Steps) (loss=2.12410):   4%|| 32/782 [00:44<13:08,  1.05s/it]
Training (33 / 500 Steps) (loss=2.12410):   4%|| 33/782 [00:44<13:06,  1.05s/it]
Training (34 / 500 Steps) (loss=2.16164):   4%|| 33/782 [00:46<13:06,  1.05s/it]
Training (34 / 500 Steps) (loss=2.16164):   4%|| 34/782 [00:46<13:06,  1.05s/it]
Training (35 / 500 Steps) (loss=2.15291):   4%|| 34/782 [00:47<13:06,  1.05s/it]
Training (35 / 500 Steps) (loss=2.15291):   4%|| 35/782 [00:47<13:05,  1.05s/it]
Training (36 / 500 Steps) (loss=2.15971):   4%|| 35/782 [00:48<13:05,  1.05s/it]
Training (36 / 500 Steps) (loss=2.15971):   5%|| 36/782 [00:48<13:05,  1.05s/it]
Training (37 / 500 Steps) (loss=2.21324):   5%|| 36/782 [00:49<13:05,  1.05s/it]
Training (37 / 500 Steps) (loss=2.21324):   5%|| 37/782 [00:49<13:03,  1.05s/it]
Training (38 / 500 Steps) (loss=2.06910):   5%|| 37/782 [00:50<13:03,  1.05s/it]
Training (38 / 500 Steps) (loss=2.06910):   5%|| 38/782 [00:50<13:02,  1.05s/it]
Training (39 / 500 Steps) (loss=2.17029):   5%|| 38/782 [00:51<13:02,  1.05s/it]
Training (39 / 500 Steps) (loss=2.17029):   5%|| 39/782 [00:51<13:01,  1.05s/it]
Training (40 / 500 Steps) (loss=2.05514):   5%|| 39/782 [00:52<13:01,  1.05s/it]
Training (40 / 500 Steps) (loss=2.05514):   5%|| 40/782 [00:52<13:00,  1.05s/it]
Training (41 / 500 Steps) (loss=2.09406):   5%|| 40/782 [00:53<13:00,  1.05s/it]
Training (41 / 500 Steps) (loss=2.09406):   5%|| 41/782 [00:53<13:01,  1.05s/it]
Training (42 / 500 Steps) (loss=2.09108):   5%|| 41/782 [00:54<13:01,  1.05s/it]
Training (42 / 500 Steps) (loss=2.09108):   5%|| 42/782 [00:54<13:00,  1.05s/it]
Training (43 / 500 Steps) (loss=2.06835):   5%|| 42/782 [00:55<13:00,  1.05s/it]
Training (43 / 500 Steps) (loss=2.06835):   5%|| 43/782 [00:55<12:59,  1.06s/it]
Training (44 / 500 Steps) (loss=2.20062):   5%|| 43/782 [00:56<12:59,  1.06s/it]
Training (44 / 500 Steps) (loss=2.20062):   6%|| 44/782 [00:56<12:58,  1.05s/it]
Training (45 / 500 Steps) (loss=2.04650):   6%|| 44/782 [00:57<12:58,  1.05s/it]
Training (45 / 500 Steps) (loss=2.04650):   6%|| 45/782 [00:57<12:58,  1.06s/it]
Training (46 / 500 Steps) (loss=2.09203):   6%|| 45/782 [00:58<12:58,  1.06s/it]
Training (46 / 500 Steps) (loss=2.09203):   6%|| 46/782 [00:58<12:57,  1.06s/it]
Training (47 / 500 Steps) (loss=2.09163):   6%|| 46/782 [00:59<12:57,  1.06s/it]
Training (47 / 500 Steps) (loss=2.09163):   6%|| 47/782 [00:59<12:54,  1.05s/it]
Training (48 / 500 Steps) (loss=2.19403):   6%|| 47/782 [01:00<12:54,  1.05s/it]
Training (48 / 500 Steps) (loss=2.19403):   6%|| 48/782 [01:00<12:53,  1.05s/it]
Training (49 / 500 Steps) (loss=2.22126):   6%|| 48/782 [01:01<12:53,  1.05s/it]
Training (49 / 500 Steps) (loss=2.22126):   6%|| 49/782 [01:01<12:53,  1.05s/it]
Training (50 / 500 Steps) (loss=2.23522):   6%|| 49/782 [01:02<12:53,  1.05s/it]
Training (50 / 500 Steps) (loss=2.23522):   6%|| 50/782 [01:02<12:51,  1.05s/it]
Training (51 / 500 Steps) (loss=2.20099):   6%|| 50/782 [01:03<12:51,  1.05s/it]
Training (51 / 500 Steps) (loss=2.20099):   7%|| 51/782 [01:03<12:49,  1.05s/it]
Training (52 / 500 Steps) (loss=2.14321):   7%|| 51/782 [01:04<12:49,  1.05s/it]
Training (52 / 500 Steps) (loss=2.14321):   7%|| 52/782 [01:04<12:50,  1.05s/it]
Training (53 / 500 Steps) (loss=1.98711):   7%|| 52/782 [01:06<12:50,  1.05s/it]
Training (53 / 500 Steps) (loss=1.98711):   7%|| 53/782 [01:06<12:49,  1.06s/it]
Training (54 / 500 Steps) (loss=2.13499):   7%|| 53/782 [01:07<12:49,  1.06s/it]
Training (54 / 500 Steps) (loss=2.13499):   7%|| 54/782 [01:07<12:48,  1.06s/it]
Training (55 / 500 Steps) (loss=2.09108):   7%|| 54/782 [01:08<12:48,  1.06s/it]
Training (55 / 500 Steps) (loss=2.09108):   7%|| 55/782 [01:08<12:47,  1.06s/it]
Training (56 / 500 Steps) (loss=2.20803):   7%|| 55/782 [01:09<12:47,  1.06s/it]
Training (56 / 500 Steps) (loss=2.20803):   7%|| 56/782 [01:09<12:47,  1.06s/it]
Training (57 / 500 Steps) (loss=2.26612):   7%|| 56/782 [01:10<12:47,  1.06s/it]
Training (57 / 500 Steps) (loss=2.26612):   7%|| 57/782 [01:10<12:46,  1.06s/it]
Training (58 / 500 Steps) (loss=2.17287):   7%|| 57/782 [01:11<12:46,  1.06s/it]
Training (58 / 500 Steps) (loss=2.17287):   7%|| 58/782 [01:11<12:44,  1.06s/it]
Training (59 / 500 Steps) (loss=2.12674):   7%|| 58/782 [01:12<12:44,  1.06s/it]
Training (59 / 500 Steps) (loss=2.12674):   8%|| 59/782 [01:12<12:43,  1.06s/it]
Training (60 / 500 Steps) (loss=2.15639):   8%|| 59/782 [01:13<12:43,  1.06s/it]
Training (60 / 500 Steps) (loss=2.15639):   8%|| 60/782 [01:13<12:42,  1.06s/it]
Training (61 / 500 Steps) (loss=2.04263):   8%|| 60/782 [01:14<12:42,  1.06s/it]
Training (61 / 500 Steps) (loss=2.04263):   8%|| 61/782 [01:14<12:41,  1.06s/it]
Training (62 / 500 Steps) (loss=2.40009):   8%|| 61/782 [01:15<12:41,  1.06s/it]
Training (62 / 500 Steps) (loss=2.40009):   8%|| 62/782 [01:15<12:39,  1.05s/it]
Training (63 / 500 Steps) (loss=2.01202):   8%|| 62/782 [01:16<12:39,  1.05s/it]
Training (63 / 500 Steps) (loss=2.01202):   8%|| 63/782 [01:16<12:38,  1.05s/it]
Training (64 / 500 Steps) (loss=2.19738):   8%|| 63/782 [01:17<12:38,  1.05s/it]
Training (64 / 500 Steps) (loss=2.19738):   8%|| 64/782 [01:17<12:36,  1.05s/it]
Training (65 / 500 Steps) (loss=2.16155):   8%|| 64/782 [01:18<12:36,  1.05s/it]
Training (65 / 500 Steps) (loss=2.16155):   8%|| 65/782 [01:18<12:36,  1.05s/it]
Training (66 / 500 Steps) (loss=1.98733):   8%|| 65/782 [01:19<12:36,  1.05s/it]
Training (66 / 500 Steps) (loss=1.98733):   8%|| 66/782 [01:19<12:33,  1.05s/it]
Training (67 / 500 Steps) (loss=2.04643):   8%|| 66/782 [01:20<12:33,  1.05s/it]
Training (67 / 500 Steps) (loss=2.04643):   9%|| 67/782 [01:20<12:34,  1.06s/it]
Training (68 / 500 Steps) (loss=2.09209):   9%|| 67/782 [01:21<12:34,  1.06s/it]
Training (68 / 500 Steps) (loss=2.09209):   9%|| 68/782 [01:21<12:35,  1.06s/it]
Training (69 / 500 Steps) (loss=2.13990):   9%|| 68/782 [01:22<12:35,  1.06s/it]
Training (69 / 500 Steps) (loss=2.13990):   9%|| 69/782 [01:22<12:33,  1.06s/it]
Training (70 / 500 Steps) (loss=2.01035):   9%|| 69/782 [01:23<12:33,  1.06s/it]
Training (70 / 500 Steps) (loss=2.01035):   9%|| 70/782 [01:23<12:31,  1.06s/it]
Training (71 / 500 Steps) (loss=2.28099):   9%|| 70/782 [01:25<12:31,  1.06s/it]
Training (71 / 500 Steps) (loss=2.28099):   9%|| 71/782 [01:25<12:29,  1.05s/it]
Training (72 / 500 Steps) (loss=2.18922):   9%|| 71/782 [01:26<12:29,  1.05s/it]
Training (72 / 500 Steps) (loss=2.18922):   9%|| 72/782 [01:26<12:28,  1.05s/it]
Training (73 / 500 Steps) (loss=2.16948):   9%|| 72/782 [01:27<12:28,  1.05s/it]
Training (73 / 500 Steps) (loss=2.16948):   9%|| 73/782 [01:27<12:27,  1.05s/it]
Training (74 / 500 Steps) (loss=2.04931):   9%|| 73/782 [01:28<12:27,  1.05s/it]
Training (74 / 500 Steps) (loss=2.04931):   9%|| 74/782 [01:28<12:27,  1.06s/it]
Training (75 / 500 Steps) (loss=2.00235):   9%|| 74/782 [01:29<12:27,  1.06s/it]
Training (75 / 500 Steps) (loss=2.00235):  10%|| 75/782 [01:29<12:27,  1.06s/it]
Training (76 / 500 Steps) (loss=2.10624):  10%|| 75/782 [01:30<12:27,  1.06s/it]
Training (76 / 500 Steps) (loss=2.10624):  10%|| 76/782 [01:30<12:25,  1.06s/it]
Training (77 / 500 Steps) (loss=1.97734):  10%|| 76/782 [01:31<12:25,  1.06s/it]
Training (77 / 500 Steps) (loss=1.97734):  10%|| 77/782 [01:31<12:24,  1.06s/it]
Training (78 / 500 Steps) (loss=2.28222):  10%|| 77/782 [01:32<12:24,  1.06s/it]
Training (78 / 500 Steps) (loss=2.28222):  10%|| 78/782 [01:32<12:23,  1.06s/it]
Training (79 / 500 Steps) (loss=2.08920):  10%|| 78/782 [01:33<12:23,  1.06s/it]
Training (79 / 500 Steps) (loss=2.08920):  10%|| 79/782 [01:33<12:23,  1.06s/it]
Training (80 / 500 Steps) (loss=2.09822):  10%|| 79/782 [01:34<12:23,  1.06s/it]
Training (80 / 500 Steps) (loss=2.09822):  10%|| 80/782 [01:34<12:21,  1.06s/it]
Training (81 / 500 Steps) (loss=2.22449):  10%|| 80/782 [01:35<12:21,  1.06s/it]
Training (81 / 500 Steps) (loss=2.22449):  10%|| 81/782 [01:35<12:20,  1.06s/it]
Training (82 / 500 Steps) (loss=2.18978):  10%|| 81/782 [01:36<12:20,  1.06s/it]
Training (82 / 500 Steps) (loss=2.18978):  10%|| 82/782 [01:36<12:19,  1.06s/it]
Training (83 / 500 Steps) (loss=2.01634):  10%|| 82/782 [01:37<12:19,  1.06s/it]
Training (83 / 500 Steps) (loss=2.01634):  11%|| 83/782 [01:37<12:18,  1.06s/it]
Training (84 / 500 Steps) (loss=1.96343):  11%|| 83/782 [01:38<12:18,  1.06s/it]
Training (84 / 500 Steps) (loss=1.96343):  11%|| 84/782 [01:38<12:17,  1.06s/it]
Training (85 / 500 Steps) (loss=2.29350):  11%|| 84/782 [01:39<12:17,  1.06s/it]
Training (85 / 500 Steps) (loss=2.29350):  11%|| 85/782 [01:39<12:16,  1.06s/it]
Training (86 / 500 Steps) (loss=2.13883):  11%|| 85/782 [01:40<12:16,  1.06s/it]
Training (86 / 500 Steps) (loss=2.13883):  11%|| 86/782 [01:40<12:16,  1.06s/it]
Training (87 / 500 Steps) (loss=2.12465):  11%|| 86/782 [01:41<12:16,  1.06s/it]
Training (87 / 500 Steps) (loss=2.12465):  11%|| 87/782 [01:41<12:14,  1.06s/it]
Training (88 / 500 Steps) (loss=2.01343):  11%|| 87/782 [01:42<12:14,  1.06s/it]
Training (88 / 500 Steps) (loss=2.01343):  11%|| 88/782 [01:43<12:11,  1.05s/it]
Training (89 / 500 Steps) (loss=2.16396):  11%|| 88/782 [01:44<12:11,  1.05s/it]
Training (89 / 500 Steps) (loss=2.16396):  11%|| 89/782 [01:44<12:10,  1.05s/it]
Training (90 / 500 Steps) (loss=2.15292):  11%|| 89/782 [01:45<12:10,  1.05s/it]
Training (90 / 500 Steps) (loss=2.15292):  12%|| 90/782 [01:45<12:09,  1.05s/it]
Training (91 / 500 Steps) (loss=2.18620):  12%|| 90/782 [01:46<12:09,  1.05s/it]
Training (91 / 500 Steps) (loss=2.18620):  12%|| 91/782 [01:46<12:09,  1.06s/it]
Training (92 / 500 Steps) (loss=2.01675):  12%|| 91/782 [01:47<12:09,  1.06s/it]
Training (92 / 500 Steps) (loss=2.01675):  12%|| 92/782 [01:47<12:07,  1.05s/it]
Training (93 / 500 Steps) (loss=2.11515):  12%|| 92/782 [01:48<12:07,  1.05s/it]
Training (93 / 500 Steps) (loss=2.11515):  12%|| 93/782 [01:48<12:05,  1.05s/it]
Training (94 / 500 Steps) (loss=2.02295):  12%|| 93/782 [01:49<12:05,  1.05s/it]
Training (94 / 500 Steps) (loss=2.02295):  12%|| 94/782 [01:49<12:06,  1.06s/it]
Training (95 / 500 Steps) (loss=1.96398):  12%|| 94/782 [01:50<12:06,  1.06s/it]
Training (95 / 500 Steps) (loss=1.96398):  12%|| 95/782 [01:50<12:04,  1.05s/it]
Training (96 / 500 Steps) (loss=2.19677):  12%|| 95/782 [01:51<12:04,  1.05s/it]
Training (96 / 500 Steps) (loss=2.19677):  12%|| 96/782 [01:51<12:02,  1.05s/it]
Training (97 / 500 Steps) (loss=2.02343):  12%|| 96/782 [01:52<12:02,  1.05s/it]
Training (97 / 500 Steps) (loss=2.02343):  12%|| 97/782 [01:52<12:01,  1.05s/it]
Training (98 / 500 Steps) (loss=2.05613):  12%|| 97/782 [01:53<12:01,  1.05s/it]
Training (98 / 500 Steps) (loss=2.05613):  13%|| 98/782 [01:53<11:59,  1.05s/it]
Training (99 / 500 Steps) (loss=2.11409):  13%|| 98/782 [01:54<11:59,  1.05s/it]
Training (99 / 500 Steps) (loss=2.11409):  13%|| 99/782 [01:54<11:58,  1.05s/it]
Training (100 / 500 Steps) (loss=1.89570):  13%|| 99/782 [01:55<11:58,  1.05s/it]
10/17/2022 06:21:55 - INFO - __main__ - ***** Running Validation *****
10/17/2022 06:21:55 - INFO - __main__ -   Num steps = 157
10/17/2022 06:21:55 - INFO - __main__ -   Batch size = 64


Validating... (loss=X.X):   0%|| 0/157 [00:00<?, ?it/s]

Validating... (loss=1.92346):   0%|| 0/157 [00:01<?, ?it/s]

Validating... (loss=1.92346):   1%|| 1/157 [00:01<02:39,  1.02s/it]

Validating... (loss=1.85576):   1%|| 1/157 [00:01<02:39,  1.02s/it]

Validating... (loss=1.85576):   1%|| 2/157 [00:01<01:42,  1.52it/s]

Validating... (loss=2.13923):   1%|| 2/157 [00:01<01:42,  1.52it/s]

Validating... (loss=2.13923):   2%|| 3/157 [00:01<01:17,  1.97it/s]

Validating... (loss=2.00285):   2%|| 3/157 [00:02<01:17,  1.97it/s]

Validating... (loss=2.00285):   3%|| 4/157 [00:02<01:06,  2.30it/s]

Validating... (loss=1.81807):   3%|| 4/157 [00:02<01:06,  2.30it/s]

Validating... (loss=1.81807):   3%|| 5/157 [00:02<01:00,  2.53it/s]

Validating... (loss=2.04288):   3%|| 5/157 [00:02<01:00,  2.53it/s]

Validating... (loss=2.04288):   4%|| 6/157 [00:02<00:55,  2.70it/s]

Validating... (loss=1.99619):   4%|| 6/157 [00:03<00:55,  2.70it/s]

Validating... (loss=1.99619):   4%|| 7/157 [00:03<00:53,  2.81it/s]

Validating... (loss=1.94766):   4%|| 7/157 [00:03<00:53,  2.81it/s]

Validating... (loss=1.94766):   5%|| 8/157 [00:03<00:51,  2.89it/s]

Validating... (loss=2.04718):   5%|| 8/157 [00:03<00:51,  2.89it/s]

Validating... (loss=2.04718):   6%|| 9/157 [00:03<00:50,  2.93it/s]

Validating... (loss=1.96493):   6%|| 9/157 [00:04<00:50,  2.93it/s]

Validating... (loss=1.96493):   6%|| 10/157 [00:04<00:49,  2.99it/s]

Validating... (loss=2.06218):   6%|| 10/157 [00:04<00:49,  2.99it/s]

Validating... (loss=2.06218):   7%|| 11/157 [00:04<00:49,  2.95it/s]

Validating... (loss=1.99470):   7%|| 11/157 [00:04<00:49,  2.95it/s]

Validating... (loss=1.99470):   8%|| 12/157 [00:04<00:48,  3.00it/s]

Validating... (loss=2.09224):   8%|| 12/157 [00:05<00:48,  3.00it/s]

Validating... (loss=2.09224):   8%|| 13/157 [00:05<00:47,  3.05it/s]

Validating... (loss=1.77962):   8%|| 13/157 [00:05<00:47,  3.05it/s]

Validating... (loss=1.77962):   9%|| 14/157 [00:05<00:46,  3.08it/s]

Validating... (loss=1.96329):   9%|| 14/157 [00:05<00:46,  3.08it/s]

Validating... (loss=1.96329):  10%|| 15/157 [00:05<00:45,  3.11it/s]

Validating... (loss=1.89406):  10%|| 15/157 [00:05<00:45,  3.11it/s]

Validating... (loss=1.89406):  10%|| 16/157 [00:05<00:45,  3.12it/s]

Validating... (loss=2.04001):  10%|| 16/157 [00:06<00:45,  3.12it/s]

Validating... (loss=2.04001):  11%|| 17/157 [00:06<00:44,  3.13it/s]

Validating... (loss=1.92475):  11%|| 17/157 [00:06<00:44,  3.13it/s]

Validating... (loss=1.92475):  11%|| 18/157 [00:06<00:44,  3.14it/s]

Validating... (loss=1.87518):  11%|| 18/157 [00:06<00:44,  3.14it/s]

Validating... (loss=1.87518):  12%|| 19/157 [00:06<00:43,  3.14it/s]

Validating... (loss=2.09969):  12%|| 19/157 [00:07<00:43,  3.14it/s]

Validating... (loss=2.09969):  13%|| 20/157 [00:07<00:43,  3.15it/s]

Validating... (loss=2.09977):  13%|| 20/157 [00:07<00:43,  3.15it/s]

Validating... (loss=2.09977):  13%|| 21/157 [00:07<00:43,  3.15it/s]

Validating... (loss=1.94589):  13%|| 21/157 [00:07<00:43,  3.15it/s]

Validating... (loss=1.94589):  14%|| 22/157 [00:07<00:42,  3.15it/s]

Validating... (loss=1.79975):  14%|| 22/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.79975):  15%|| 23/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.84900):  15%|| 23/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.84900):  15%|| 24/157 [00:08<00:42,  3.14it/s]

Validating... (loss=2.08427):  15%|| 24/157 [00:08<00:42,  3.14it/s]

Validating... (loss=2.08427):  16%|| 25/157 [00:08<00:41,  3.15it/s]

Validating... (loss=2.03119):  16%|| 25/157 [00:09<00:41,  3.15it/s]

Validating... (loss=2.03119):  17%|| 26/157 [00:09<00:41,  3.14it/s]

Validating... (loss=1.91178):  17%|| 26/157 [00:09<00:41,  3.14it/s]

Validating... (loss=1.91178):  17%|| 27/157 [00:09<00:41,  3.13it/s]

Validating... (loss=1.91305):  17%|| 27/157 [00:09<00:41,  3.13it/s]

Validating... (loss=1.91305):  18%|| 28/157 [00:09<00:41,  3.14it/s]

Validating... (loss=1.99107):  18%|| 28/157 [00:10<00:41,  3.14it/s]

Validating... (loss=1.99107):  18%|| 29/157 [00:10<00:40,  3.13it/s]

Validating... (loss=2.02487):  18%|| 29/157 [00:10<00:40,  3.13it/s]

Validating... (loss=2.02487):  19%|| 30/157 [00:10<00:40,  3.14it/s]

Validating... (loss=1.99627):  19%|| 30/157 [00:10<00:40,  3.14it/s]

Validating... (loss=1.99627):  20%|| 31/157 [00:10<00:40,  3.15it/s]

Validating... (loss=2.09802):  20%|| 31/157 [00:11<00:40,  3.15it/s]

Validating... (loss=2.09802):  20%|| 32/157 [00:11<00:39,  3.15it/s]

Validating... (loss=2.04834):  20%|| 32/157 [00:11<00:39,  3.15it/s]

Validating... (loss=2.04834):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.98432):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.98432):  22%|| 34/157 [00:11<00:38,  3.15it/s]

Validating... (loss=1.87262):  22%|| 34/157 [00:11<00:38,  3.15it/s]

Validating... (loss=1.87262):  22%|| 35/157 [00:11<00:38,  3.16it/s]

Validating... (loss=1.95246):  22%|| 35/157 [00:12<00:38,  3.16it/s]

Validating... (loss=1.95246):  23%|| 36/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.96603):  23%|| 36/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.96603):  24%|| 37/157 [00:12<00:38,  3.16it/s]

Validating... (loss=2.05345):  24%|| 37/157 [00:12<00:38,  3.16it/s]

Validating... (loss=2.05345):  24%|| 38/157 [00:12<00:37,  3.16it/s]

Validating... (loss=1.83681):  24%|| 38/157 [00:13<00:37,  3.16it/s]

Validating... (loss=1.83681):  25%|| 39/157 [00:13<00:37,  3.16it/s]

Validating... (loss=2.02129):  25%|| 39/157 [00:13<00:37,  3.16it/s]

Validating... (loss=2.02129):  25%|| 40/157 [00:13<00:37,  3.16it/s]

Validating... (loss=1.96609):  25%|| 40/157 [00:13<00:37,  3.16it/s]

Validating... (loss=1.96609):  26%|| 41/157 [00:13<00:36,  3.16it/s]

Validating... (loss=1.86006):  26%|| 41/157 [00:14<00:36,  3.16it/s]

Validating... (loss=1.86006):  27%|| 42/157 [00:14<00:36,  3.16it/s]

Validating... (loss=1.95873):  27%|| 42/157 [00:14<00:36,  3.16it/s]

Validating... (loss=1.95873):  27%|| 43/157 [00:14<00:36,  3.16it/s]

Validating... (loss=1.95075):  27%|| 43/157 [00:14<00:36,  3.16it/s]

Validating... (loss=1.95075):  28%|| 44/157 [00:14<00:35,  3.16it/s]

Validating... (loss=2.16269):  28%|| 44/157 [00:15<00:35,  3.16it/s]

Validating... (loss=2.16269):  29%|| 45/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.95686):  29%|| 45/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.95686):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=2.02192):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=2.02192):  30%|| 47/157 [00:15<00:34,  3.16it/s]

Validating... (loss=1.91004):  30%|| 47/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.91004):  31%|| 48/157 [00:16<00:34,  3.16it/s]

Validating... (loss=2.06099):  31%|| 48/157 [00:16<00:34,  3.16it/s]

Validating... (loss=2.06099):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.93258):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.93258):  32%|| 50/157 [00:16<00:33,  3.16it/s]

Validating... (loss=2.13066):  32%|| 50/157 [00:17<00:33,  3.16it/s]

Validating... (loss=2.13066):  32%|| 51/157 [00:17<00:33,  3.16it/s]

Validating... (loss=2.10600):  32%|| 51/157 [00:17<00:33,  3.16it/s]

Validating... (loss=2.10600):  33%|| 52/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.86150):  33%|| 52/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.86150):  34%|| 53/157 [00:17<00:32,  3.15it/s]

Validating... (loss=2.02299):  34%|| 53/157 [00:18<00:32,  3.15it/s]

Validating... (loss=2.02299):  34%|| 54/157 [00:18<00:32,  3.15it/s]

Validating... (loss=1.97613):  34%|| 54/157 [00:18<00:32,  3.15it/s]

Validating... (loss=1.97613):  35%|| 55/157 [00:18<00:32,  3.13it/s]

Validating... (loss=1.95514):  35%|| 55/157 [00:18<00:32,  3.13it/s]

Validating... (loss=1.95514):  36%|| 56/157 [00:18<00:32,  3.13it/s]

Validating... (loss=2.01864):  36%|| 56/157 [00:18<00:32,  3.13it/s]

Validating... (loss=2.01864):  36%|| 57/157 [00:18<00:31,  3.14it/s]

Validating... (loss=1.88329):  36%|| 57/157 [00:19<00:31,  3.14it/s]

Validating... (loss=1.88329):  37%|| 58/157 [00:19<00:31,  3.14it/s]

Validating... (loss=2.00975):  37%|| 58/157 [00:19<00:31,  3.14it/s]

Validating... (loss=2.00975):  38%|| 59/157 [00:19<00:31,  3.14it/s]

Validating... (loss=2.04747):  38%|| 59/157 [00:19<00:31,  3.14it/s]

Validating... (loss=2.04747):  38%|| 60/157 [00:19<00:30,  3.15it/s]

Validating... (loss=1.90604):  38%|| 60/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.90604):  39%|| 61/157 [00:20<00:30,  3.15it/s]

Validating... (loss=2.04378):  39%|| 61/157 [00:20<00:30,  3.15it/s]

Validating... (loss=2.04378):  39%|| 62/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.93567):  39%|| 62/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.93567):  40%|| 63/157 [00:20<00:29,  3.15it/s]

Validating... (loss=2.09785):  40%|| 63/157 [00:21<00:29,  3.15it/s]

Validating... (loss=2.09785):  41%|| 64/157 [00:21<00:29,  3.15it/s]

Validating... (loss=2.07291):  41%|| 64/157 [00:21<00:29,  3.15it/s]

Validating... (loss=2.07291):  41%|| 65/157 [00:21<00:29,  3.14it/s]

Validating... (loss=1.83118):  41%|| 65/157 [00:21<00:29,  3.14it/s]

Validating... (loss=1.83118):  42%|| 66/157 [00:21<00:29,  3.14it/s]

Validating... (loss=2.07153):  42%|| 66/157 [00:22<00:29,  3.14it/s]

Validating... (loss=2.07153):  43%|| 67/157 [00:22<00:28,  3.14it/s]

Validating... (loss=1.97771):  43%|| 67/157 [00:22<00:28,  3.14it/s]

Validating... (loss=1.97771):  43%|| 68/157 [00:22<00:28,  3.14it/s]

Validating... (loss=1.86945):  43%|| 68/157 [00:22<00:28,  3.14it/s]

Validating... (loss=1.86945):  44%|| 69/157 [00:22<00:27,  3.15it/s]

Validating... (loss=1.91860):  44%|| 69/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.91860):  45%|| 70/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.89369):  45%|| 70/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.89369):  45%|| 71/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.02136):  45%|| 71/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.02136):  46%|| 72/157 [00:23<00:26,  3.16it/s]

Validating... (loss=1.99442):  46%|| 72/157 [00:24<00:26,  3.16it/s]

Validating... (loss=1.99442):  46%|| 73/157 [00:24<00:26,  3.16it/s]

Validating... (loss=1.99283):  46%|| 73/157 [00:24<00:26,  3.16it/s]

Validating... (loss=1.99283):  47%|| 74/157 [00:24<00:26,  3.16it/s]

Validating... (loss=2.14320):  47%|| 74/157 [00:24<00:26,  3.16it/s]

Validating... (loss=2.14320):  48%|| 75/157 [00:24<00:25,  3.16it/s]

Validating... (loss=1.91913):  48%|| 75/157 [00:24<00:25,  3.16it/s]

Validating... (loss=1.91913):  48%|| 76/157 [00:24<00:25,  3.15it/s]

Validating... (loss=1.90266):  48%|| 76/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.90266):  49%|| 77/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.85545):  49%|| 77/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.85545):  50%|| 78/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.92508):  50%|| 78/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.92508):  50%|| 79/157 [00:25<00:24,  3.15it/s]

Validating... (loss=2.00989):  50%|| 79/157 [00:26<00:24,  3.15it/s]

Validating... (loss=2.00989):  51%|| 80/157 [00:26<00:24,  3.16it/s]

Validating... (loss=1.94975):  51%|| 80/157 [00:26<00:24,  3.16it/s]

Validating... (loss=1.94975):  52%|| 81/157 [00:26<00:24,  3.15it/s]

Validating... (loss=2.09903):  52%|| 81/157 [00:26<00:24,  3.15it/s]

Validating... (loss=2.09903):  52%|| 82/157 [00:26<00:23,  3.15it/s]

Validating... (loss=1.85245):  52%|| 82/157 [00:27<00:23,  3.15it/s]

Validating... (loss=1.85245):  53%|| 83/157 [00:27<00:23,  3.15it/s]

Validating... (loss=1.98482):  53%|| 83/157 [00:27<00:23,  3.15it/s]

Validating... (loss=1.98482):  54%|| 84/157 [00:27<00:23,  3.15it/s]

Validating... (loss=2.06334):  54%|| 84/157 [00:27<00:23,  3.15it/s]

Validating... (loss=2.06334):  54%|| 85/157 [00:27<00:22,  3.15it/s]

Validating... (loss=2.00812):  54%|| 85/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.00812):  55%|| 86/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.18557):  55%|| 86/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.18557):  55%|| 87/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.17719):  55%|| 87/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.17719):  56%|| 88/157 [00:28<00:21,  3.15it/s]

Validating... (loss=2.18422):  56%|| 88/157 [00:29<00:21,  3.15it/s]

Validating... (loss=2.18422):  57%|| 89/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.88398):  57%|| 89/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.88398):  57%|| 90/157 [00:29<00:21,  3.16it/s]

Validating... (loss=1.91310):  57%|| 90/157 [00:29<00:21,  3.16it/s]

Validating... (loss=1.91310):  58%|| 91/157 [00:29<00:20,  3.16it/s]

Validating... (loss=2.06439):  58%|| 91/157 [00:30<00:20,  3.16it/s]

Validating... (loss=2.06439):  59%|| 92/157 [00:30<00:20,  3.16it/s]

Validating... (loss=1.99535):  59%|| 92/157 [00:30<00:20,  3.16it/s]

Validating... (loss=1.99535):  59%|| 93/157 [00:30<00:20,  3.15it/s]

Validating... (loss=2.07609):  59%|| 93/157 [00:30<00:20,  3.15it/s]

Validating... (loss=2.07609):  60%|| 94/157 [00:30<00:19,  3.15it/s]

Validating... (loss=2.11147):  60%|| 94/157 [00:31<00:19,  3.15it/s]

Validating... (loss=2.11147):  61%|| 95/157 [00:31<00:19,  3.16it/s]

Validating... (loss=2.07927):  61%|| 95/157 [00:31<00:19,  3.16it/s]

Validating... (loss=2.07927):  61%|| 96/157 [00:31<00:19,  3.16it/s]

Validating... (loss=2.03061):  61%|| 96/157 [00:31<00:19,  3.16it/s]

Validating... (loss=2.03061):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=2.11180):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=2.11180):  62%|| 98/157 [00:31<00:18,  3.15it/s]

Validating... (loss=1.84654):  62%|| 98/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.84654):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.90744):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.90744):  64%|| 100/157 [00:32<00:18,  3.16it/s]

Validating... (loss=1.99561):  64%|| 100/157 [00:32<00:18,  3.16it/s]

Validating... (loss=1.99561):  64%|| 101/157 [00:32<00:17,  3.16it/s]

Validating... (loss=1.89722):  64%|| 101/157 [00:33<00:17,  3.16it/s]

Validating... (loss=1.89722):  65%|| 102/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.20840):  65%|| 102/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.20840):  66%|| 103/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.00845):  66%|| 103/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.00845):  66%|| 104/157 [00:33<00:16,  3.14it/s]

Validating... (loss=2.02808):  66%|| 104/157 [00:34<00:16,  3.14it/s]

Validating... (loss=2.02808):  67%|| 105/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.95273):  67%|| 105/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.95273):  68%|| 106/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.85539):  68%|| 106/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.85539):  68%|| 107/157 [00:34<00:15,  3.15it/s]

Validating... (loss=1.99144):  68%|| 107/157 [00:35<00:15,  3.15it/s]

Validating... (loss=1.99144):  69%|| 108/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.07735):  69%|| 108/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.07735):  69%|| 109/157 [00:35<00:15,  3.15it/s]

Validating... (loss=1.92596):  69%|| 109/157 [00:35<00:15,  3.15it/s]

Validating... (loss=1.92596):  70%|| 110/157 [00:35<00:14,  3.15it/s]

Validating... (loss=2.02804):  70%|| 110/157 [00:36<00:14,  3.15it/s]

Validating... (loss=2.02804):  71%|| 111/157 [00:36<00:14,  3.15it/s]

Validating... (loss=2.03279):  71%|| 111/157 [00:36<00:14,  3.15it/s]

Validating... (loss=2.03279):  71%|| 112/157 [00:36<00:14,  3.15it/s]

Validating... (loss=1.96027):  71%|| 112/157 [00:36<00:14,  3.15it/s]

Validating... (loss=1.96027):  72%|| 113/157 [00:36<00:13,  3.16it/s]

Validating... (loss=1.85829):  72%|| 113/157 [00:37<00:13,  3.16it/s]

Validating... (loss=1.85829):  73%|| 114/157 [00:37<00:13,  3.16it/s]

Validating... (loss=1.88288):  73%|| 114/157 [00:37<00:13,  3.16it/s]

Validating... (loss=1.88288):  73%|| 115/157 [00:37<00:13,  3.16it/s]

Validating... (loss=2.04349):  73%|| 115/157 [00:37<00:13,  3.16it/s]

Validating... (loss=2.04349):  74%|| 116/157 [00:37<00:12,  3.16it/s]

Validating... (loss=1.94756):  74%|| 116/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.94756):  75%|| 117/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.98604):  75%|| 117/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.98604):  75%|| 118/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.82725):  75%|| 118/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.82725):  76%|| 119/157 [00:38<00:12,  3.15it/s]

Validating... (loss=2.00784):  76%|| 119/157 [00:38<00:12,  3.15it/s]

Validating... (loss=2.00784):  76%|| 120/157 [00:38<00:11,  3.16it/s]

Validating... (loss=1.88801):  76%|| 120/157 [00:39<00:11,  3.16it/s]

Validating... (loss=1.88801):  77%|| 121/157 [00:39<00:11,  3.16it/s]

Validating... (loss=2.04538):  77%|| 121/157 [00:39<00:11,  3.16it/s]

Validating... (loss=2.04538):  78%|| 122/157 [00:39<00:11,  3.16it/s]

Validating... (loss=2.00132):  78%|| 122/157 [00:39<00:11,  3.16it/s]

Validating... (loss=2.00132):  78%|| 123/157 [00:39<00:10,  3.16it/s]

Validating... (loss=1.87049):  78%|| 123/157 [00:40<00:10,  3.16it/s]

Validating... (loss=1.87049):  79%|| 124/157 [00:40<00:10,  3.16it/s]

Validating... (loss=2.16902):  79%|| 124/157 [00:40<00:10,  3.16it/s]

Validating... (loss=2.16902):  80%|| 125/157 [00:40<00:10,  3.16it/s]

Validating... (loss=1.81738):  80%|| 125/157 [00:40<00:10,  3.16it/s]

Validating... (loss=1.81738):  80%|| 126/157 [00:40<00:09,  3.16it/s]

Validating... (loss=1.99091):  80%|| 126/157 [00:41<00:09,  3.16it/s]

Validating... (loss=1.99091):  81%|| 127/157 [00:41<00:09,  3.16it/s]

Validating... (loss=1.95083):  81%|| 127/157 [00:41<00:09,  3.16it/s]

Validating... (loss=1.95083):  82%|| 128/157 [00:41<00:09,  3.16it/s]

Validating... (loss=1.98130):  82%|| 128/157 [00:41<00:09,  3.16it/s]

Validating... (loss=1.98130):  82%|| 129/157 [00:41<00:08,  3.15it/s]

Validating... (loss=2.19394):  82%|| 129/157 [00:42<00:08,  3.15it/s]

Validating... (loss=2.19394):  83%|| 130/157 [00:42<00:08,  3.16it/s]

Validating... (loss=1.92842):  83%|| 130/157 [00:42<00:08,  3.16it/s]

Validating... (loss=1.92842):  83%|| 131/157 [00:42<00:08,  3.16it/s]

Validating... (loss=2.02916):  83%|| 131/157 [00:42<00:08,  3.16it/s]

Validating... (loss=2.02916):  84%|| 132/157 [00:42<00:07,  3.16it/s]

Validating... (loss=1.97989):  84%|| 132/157 [00:43<00:07,  3.16it/s]

Validating... (loss=1.97989):  85%|| 133/157 [00:43<00:07,  3.16it/s]

Validating... (loss=2.08818):  85%|| 133/157 [00:43<00:07,  3.16it/s]

Validating... (loss=2.08818):  85%|| 134/157 [00:43<00:07,  3.16it/s]

Validating... (loss=1.94929):  85%|| 134/157 [00:43<00:07,  3.16it/s]

Validating... (loss=1.94929):  86%|| 135/157 [00:43<00:07,  3.14it/s]

Validating... (loss=2.03514):  86%|| 135/157 [00:44<00:07,  3.14it/s]

Validating... (loss=2.03514):  87%|| 136/157 [00:44<00:06,  3.13it/s]

Validating... (loss=2.01864):  87%|| 136/157 [00:44<00:06,  3.13it/s]

Validating... (loss=2.01864):  87%|| 137/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.90065):  87%|| 137/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.90065):  88%|| 138/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.97201):  88%|| 138/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.97201):  89%|| 139/157 [00:44<00:05,  3.15it/s]

Validating... (loss=1.79384):  89%|| 139/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.79384):  89%|| 140/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.97570):  89%|| 140/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.97570):  90%|| 141/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.93100):  90%|| 141/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.93100):  90%|| 142/157 [00:45<00:04,  3.15it/s]

Validating... (loss=2.00831):  90%|| 142/157 [00:46<00:04,  3.15it/s]

Validating... (loss=2.00831):  91%|| 143/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.85326):  91%|| 143/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.85326):  92%|| 144/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.80779):  92%|| 144/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.80779):  92%|| 145/157 [00:46<00:03,  3.15it/s]

Validating... (loss=2.15193):  92%|| 145/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.15193):  93%|| 146/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.23500):  93%|| 146/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.23500):  94%|| 147/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.00545):  94%|| 147/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.00545):  94%|| 148/157 [00:47<00:02,  3.15it/s]

Validating... (loss=1.94231):  94%|| 148/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.94231):  95%|| 149/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.92918):  95%|| 149/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.92918):  96%|| 150/157 [00:48<00:02,  3.16it/s]

Validating... (loss=1.91902):  96%|| 150/157 [00:48<00:02,  3.16it/s]

Validating... (loss=1.91902):  96%|| 151/157 [00:48<00:01,  3.15it/s]

Validating... (loss=1.94229):  96%|| 151/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.94229):  97%|| 152/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.96101):  97%|| 152/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.96101):  97%|| 153/157 [00:49<00:01,  3.16it/s]

Validating... (loss=1.94278):  97%|| 153/157 [00:49<00:01,  3.16it/s]

Validating... (loss=1.94278):  98%|| 154/157 [00:49<00:00,  3.16it/s]

Validating... (loss=2.05432):  98%|| 154/157 [00:50<00:00,  3.16it/s]

Validating... (loss=2.05432):  99%|| 155/157 [00:50<00:00,  3.16it/s]

Validating... (loss=1.92952):  99%|| 155/157 [00:50<00:00,  3.16it/s]

Validating... (loss=1.92952):  99%|| 156/157 [00:50<00:00,  3.16it/s]

Validating... (loss=1.85638):  99%|| 156/157 [00:50<00:00,  3.16it/s]

Validating... (loss=1.85638): 100%|| 157/157 [00:50<00:00,  3.93it/s]
Validating... (loss=1.85638): 100%|| 157/157 [00:50<00:00,  3.10it/s]

10/17/2022 06:22:45 - INFO - __main__ - Validation Results
10/17/2022 06:22:45 - INFO - __main__ - Global Steps: 100
10/17/2022 06:22:45 - INFO - __main__ - Valid Loss: 1.98221
10/17/2022 06:22:45 - INFO - __main__ - Valid Accuracy: 0.28280
10/17/2022 06:22:46 - INFO - __main__ - Saved model checkpoint to [DIR: output]

Training (100 / 500 Steps) (loss=1.89570):  13%|| 100/782 [02:46<3:06:48, 16.43s/it]
Training (101 / 500 Steps) (loss=1.86026):  13%|| 100/782 [02:48<3:06:48, 16.43s/it]
Training (101 / 500 Steps) (loss=1.86026):  13%|| 101/782 [02:48<2:14:17, 11.83s/it]
Training (102 / 500 Steps) (loss=2.08292):  13%|| 101/782 [02:49<2:14:17, 11.83s/it]
Training (102 / 500 Steps) (loss=2.08292):  13%|| 102/782 [02:49<1:37:27,  8.60s/it]
Training (103 / 500 Steps) (loss=1.99103):  13%|| 102/782 [02:50<1:37:27,  8.60s/it]
Training (103 / 500 Steps) (loss=1.99103):  13%|| 103/782 [02:50<1:11:42,  6.34s/it]
Training (104 / 500 Steps) (loss=2.07994):  13%|| 103/782 [02:51<1:11:42,  6.34s/it]
Training (104 / 500 Steps) (loss=2.07994):  13%|| 104/782 [02:51<53:41,  4.75s/it]
Training (105 / 500 Steps) (loss=1.98582):  13%|| 104/782 [02:52<53:41,  4.75s/it]
Training (105 / 500 Steps) (loss=1.98582):  13%|| 105/782 [02:52<41:06,  3.64s/it]
Training (106 / 500 Steps) (loss=2.22999):  13%|| 105/782 [02:53<41:06,  3.64s/it]
Training (106 / 500 Steps) (loss=2.22999):  14%|| 106/782 [02:53<32:18,  2.87s/it]
Training (107 / 500 Steps) (loss=2.05380):  14%|| 106/782 [02:54<32:18,  2.87s/it]
Training (107 / 500 Steps) (loss=2.05380):  14%|| 107/782 [02:54<26:09,  2.32s/it]
Training (108 / 500 Steps) (loss=2.07799):  14%|| 107/782 [02:55<26:09,  2.32s/it]
Training (108 / 500 Steps) (loss=2.07799):  14%|| 108/782 [02:55<21:52,  1.95s/it]
Training (109 / 500 Steps) (loss=2.00002):  14%|| 108/782 [02:56<21:52,  1.95s/it]
Training (109 / 500 Steps) (loss=2.00002):  14%|| 109/782 [02:56<18:51,  1.68s/it]
Training (110 / 500 Steps) (loss=2.07671):  14%|| 109/782 [02:57<18:51,  1.68s/it]
Training (110 / 500 Steps) (loss=2.07671):  14%|| 110/782 [02:57<16:42,  1.49s/it]
Training (111 / 500 Steps) (loss=1.98273):  14%|| 110/782 [02:58<16:42,  1.49s/it]
Training (111 / 500 Steps) (loss=1.98273):  14%|| 111/782 [02:58<15:14,  1.36s/it]
Training (112 / 500 Steps) (loss=1.99648):  14%|| 111/782 [02:59<15:14,  1.36s/it]
Training (112 / 500 Steps) (loss=1.99648):  14%|| 112/782 [02:59<14:10,  1.27s/it]
Training (113 / 500 Steps) (loss=1.95923):  14%|| 112/782 [03:00<14:10,  1.27s/it]
Training (113 / 500 Steps) (loss=1.95923):  14%|| 113/782 [03:00<13:26,  1.20s/it]
Training (114 / 500 Steps) (loss=2.18980):  14%|| 113/782 [03:01<13:26,  1.20s/it]
Training (114 / 500 Steps) (loss=2.18980):  15%|| 114/782 [03:01<12:54,  1.16s/it]
Training (115 / 500 Steps) (loss=2.07821):  15%|| 114/782 [03:02<12:54,  1.16s/it]
Training (115 / 500 Steps) (loss=2.07821):  15%|| 115/782 [03:02<12:33,  1.13s/it]
Training (116 / 500 Steps) (loss=2.07957):  15%|| 115/782 [03:03<12:33,  1.13s/it]
Training (116 / 500 Steps) (loss=2.07957):  15%|| 116/782 [03:03<12:17,  1.11s/it]
Training (117 / 500 Steps) (loss=2.14543):  15%|| 116/782 [03:04<12:17,  1.11s/it]
Training (117 / 500 Steps) (loss=2.14543):  15%|| 117/782 [03:04<12:05,  1.09s/it]
Training (118 / 500 Steps) (loss=2.02216):  15%|| 117/782 [03:05<12:05,  1.09s/it]
Training (118 / 500 Steps) (loss=2.02216):  15%|| 118/782 [03:05<11:57,  1.08s/it]
Training (119 / 500 Steps) (loss=1.99316):  15%|| 118/782 [03:07<11:57,  1.08s/it]
Training (119 / 500 Steps) (loss=1.99316):  15%|| 119/782 [03:07<11:51,  1.07s/it]
Training (120 / 500 Steps) (loss=2.16879):  15%|| 119/782 [03:08<11:51,  1.07s/it]
Training (120 / 500 Steps) (loss=2.16879):  15%|| 120/782 [03:08<11:46,  1.07s/it]
Training (121 / 500 Steps) (loss=2.22695):  15%|| 120/782 [03:09<11:46,  1.07s/it]
Training (121 / 500 Steps) (loss=2.22695):  15%|| 121/782 [03:09<11:44,  1.07s/it]
Training (122 / 500 Steps) (loss=1.95053):  15%|| 121/782 [03:10<11:44,  1.07s/it]
Training (122 / 500 Steps) (loss=1.95053):  16%|| 122/782 [03:10<11:40,  1.06s/it]
Training (123 / 500 Steps) (loss=2.20006):  16%|| 122/782 [03:11<11:40,  1.06s/it]
Training (123 / 500 Steps) (loss=2.20006):  16%|| 123/782 [03:11<11:38,  1.06s/it]
Training (124 / 500 Steps) (loss=2.01694):  16%|| 123/782 [03:12<11:38,  1.06s/it]
Training (124 / 500 Steps) (loss=2.01694):  16%|| 124/782 [03:12<11:36,  1.06s/it]
Training (125 / 500 Steps) (loss=1.96368):  16%|| 124/782 [03:13<11:36,  1.06s/it]
Training (125 / 500 Steps) (loss=1.96368):  16%|| 125/782 [03:13<11:36,  1.06s/it]
Training (126 / 500 Steps) (loss=2.07472):  16%|| 125/782 [03:14<11:36,  1.06s/it]
Training (126 / 500 Steps) (loss=2.07472):  16%|| 126/782 [03:14<11:33,  1.06s/it]
Training (127 / 500 Steps) (loss=2.09058):  16%|| 126/782 [03:15<11:33,  1.06s/it]
Training (127 / 500 Steps) (loss=2.09058):  16%|| 127/782 [03:15<11:31,  1.06s/it]
Training (128 / 500 Steps) (loss=2.13100):  16%|| 127/782 [03:16<11:31,  1.06s/it]
Training (128 / 500 Steps) (loss=2.13100):  16%|| 128/782 [03:16<11:31,  1.06s/it]
Training (129 / 500 Steps) (loss=1.86786):  16%|| 128/782 [03:17<11:31,  1.06s/it]
Training (129 / 500 Steps) (loss=1.86786):  16%|| 129/782 [03:17<11:29,  1.06s/it]
Training (130 / 500 Steps) (loss=2.10620):  16%|| 129/782 [03:18<11:29,  1.06s/it]
Training (130 / 500 Steps) (loss=2.10620):  17%|| 130/782 [03:18<11:29,  1.06s/it]
Training (131 / 500 Steps) (loss=1.88878):  17%|| 130/782 [03:19<11:29,  1.06s/it]
Training (131 / 500 Steps) (loss=1.88878):  17%|| 131/782 [03:19<11:27,  1.06s/it]
Training (132 / 500 Steps) (loss=2.00311):  17%|| 131/782 [03:20<11:27,  1.06s/it]
Training (132 / 500 Steps) (loss=2.00311):  17%|| 132/782 [03:20<11:25,  1.05s/it]
Training (133 / 500 Steps) (loss=1.99770):  17%|| 132/782 [03:21<11:25,  1.05s/it]
Training (133 / 500 Steps) (loss=1.99770):  17%|| 133/782 [03:21<11:24,  1.05s/it]
Training (134 / 500 Steps) (loss=2.11089):  17%|| 133/782 [03:22<11:24,  1.05s/it]
Training (134 / 500 Steps) (loss=2.11089):  17%|| 134/782 [03:22<11:23,  1.05s/it]
Training (135 / 500 Steps) (loss=1.82589):  17%|| 134/782 [03:23<11:23,  1.05s/it]
Training (135 / 500 Steps) (loss=1.82589):  17%|| 135/782 [03:23<11:21,  1.05s/it]
Training (136 / 500 Steps) (loss=1.96984):  17%|| 135/782 [03:24<11:21,  1.05s/it]
Training (136 / 500 Steps) (loss=1.96984):  17%|| 136/782 [03:24<11:20,  1.05s/it]
Training (137 / 500 Steps) (loss=2.02667):  17%|| 136/782 [03:26<11:20,  1.05s/it]
Training (137 / 500 Steps) (loss=2.02667):  18%|| 137/782 [03:26<11:19,  1.05s/it]
Training (138 / 500 Steps) (loss=2.05818):  18%|| 137/782 [03:27<11:19,  1.05s/it]
Training (138 / 500 Steps) (loss=2.05818):  18%|| 138/782 [03:27<11:18,  1.05s/it]
Training (139 / 500 Steps) (loss=1.91815):  18%|| 138/782 [03:28<11:18,  1.05s/it]
Training (139 / 500 Steps) (loss=1.91815):  18%|| 139/782 [03:28<11:17,  1.05s/it]
Training (140 / 500 Steps) (loss=1.88530):  18%|| 139/782 [03:29<11:17,  1.05s/it]
Training (140 / 500 Steps) (loss=1.88530):  18%|| 140/782 [03:29<11:16,  1.05s/it]
Training (141 / 500 Steps) (loss=2.05459):  18%|| 140/782 [03:30<11:16,  1.05s/it]
Training (141 / 500 Steps) (loss=2.05459):  18%|| 141/782 [03:30<11:15,  1.05s/it]
Training (142 / 500 Steps) (loss=2.02709):  18%|| 141/782 [03:31<11:15,  1.05s/it]
Training (142 / 500 Steps) (loss=2.02709):  18%|| 142/782 [03:31<11:16,  1.06s/it]
Training (143 / 500 Steps) (loss=1.92819):  18%|| 142/782 [03:32<11:16,  1.06s/it]
Training (143 / 500 Steps) (loss=1.92819):  18%|| 143/782 [03:32<11:14,  1.06s/it]
Training (144 / 500 Steps) (loss=2.05687):  18%|| 143/782 [03:33<11:14,  1.06s/it]
Training (144 / 500 Steps) (loss=2.05687):  18%|| 144/782 [03:33<11:13,  1.06s/it]
Training (145 / 500 Steps) (loss=1.86914):  18%|| 144/782 [03:34<11:13,  1.06s/it]
Training (145 / 500 Steps) (loss=1.86914):  19%|| 145/782 [03:34<11:12,  1.06s/it]
Training (146 / 500 Steps) (loss=2.05172):  19%|| 145/782 [03:35<11:12,  1.06s/it]
Training (146 / 500 Steps) (loss=2.05172):  19%|| 146/782 [03:35<11:11,  1.06s/it]
Training (147 / 500 Steps) (loss=2.28626):  19%|| 146/782 [03:36<11:11,  1.06s/it]
Training (147 / 500 Steps) (loss=2.28626):  19%|| 147/782 [03:36<11:10,  1.06s/it]
Training (148 / 500 Steps) (loss=1.91951):  19%|| 147/782 [03:37<11:10,  1.06s/it]
Training (148 / 500 Steps) (loss=1.91951):  19%|| 148/782 [03:37<11:09,  1.06s/it]
Training (149 / 500 Steps) (loss=1.85589):  19%|| 148/782 [03:38<11:09,  1.06s/it]
Training (149 / 500 Steps) (loss=1.85589):  19%|| 149/782 [03:38<11:08,  1.06s/it]
Training (150 / 500 Steps) (loss=2.08412):  19%|| 149/782 [03:39<11:08,  1.06s/it]
Training (150 / 500 Steps) (loss=2.08412):  19%|| 150/782 [03:39<11:07,  1.06s/it]
Training (151 / 500 Steps) (loss=2.12710):  19%|| 150/782 [03:40<11:07,  1.06s/it]
Training (151 / 500 Steps) (loss=2.12710):  19%|| 151/782 [03:40<11:05,  1.05s/it]
Training (152 / 500 Steps) (loss=2.06018):  19%|| 151/782 [03:41<11:05,  1.05s/it]
Training (152 / 500 Steps) (loss=2.06018):  19%|| 152/782 [03:41<11:04,  1.06s/it]
Training (153 / 500 Steps) (loss=1.99174):  19%|| 152/782 [03:42<11:04,  1.06s/it]
Training (153 / 500 Steps) (loss=1.99174):  20%|| 153/782 [03:42<11:06,  1.06s/it]
Training (154 / 500 Steps) (loss=1.97126):  20%|| 153/782 [03:43<11:06,  1.06s/it]
Training (154 / 500 Steps) (loss=1.97126):  20%|| 154/782 [03:43<11:05,  1.06s/it]
Training (155 / 500 Steps) (loss=2.00527):  20%|| 154/782 [03:45<11:05,  1.06s/it]
Training (155 / 500 Steps) (loss=2.00527):  20%|| 155/782 [03:45<11:04,  1.06s/it]
Training (156 / 500 Steps) (loss=1.98465):  20%|| 155/782 [03:46<11:04,  1.06s/it]
Training (156 / 500 Steps) (loss=1.98465):  20%|| 156/782 [03:46<11:03,  1.06s/it]
Training (157 / 500 Steps) (loss=1.86657):  20%|| 156/782 [03:47<11:03,  1.06s/it]
Training (157 / 500 Steps) (loss=1.86657):  20%|| 157/782 [03:47<11:03,  1.06s/it]
Training (158 / 500 Steps) (loss=2.06078):  20%|| 157/782 [03:48<11:03,  1.06s/it]
Training (158 / 500 Steps) (loss=2.06078):  20%|| 158/782 [03:48<11:07,  1.07s/it]
Training (159 / 500 Steps) (loss=2.01372):  20%|| 158/782 [03:49<11:07,  1.07s/it]
Training (159 / 500 Steps) (loss=2.01372):  20%|| 159/782 [03:49<11:07,  1.07s/it]
Training (160 / 500 Steps) (loss=2.08139):  20%|| 159/782 [03:50<11:07,  1.07s/it]
Training (160 / 500 Steps) (loss=2.08139):  20%|| 160/782 [03:50<11:06,  1.07s/it]
Training (161 / 500 Steps) (loss=2.12506):  20%|| 160/782 [03:51<11:06,  1.07s/it]
Training (161 / 500 Steps) (loss=2.12506):  21%|| 161/782 [03:51<11:02,  1.07s/it]
Training (162 / 500 Steps) (loss=1.94751):  21%|| 161/782 [03:52<11:02,  1.07s/it]
Training (162 / 500 Steps) (loss=1.94751):  21%|| 162/782 [03:52<10:59,  1.06s/it]
Training (163 / 500 Steps) (loss=1.98456):  21%|| 162/782 [03:53<10:59,  1.06s/it]
Training (163 / 500 Steps) (loss=1.98456):  21%|| 163/782 [03:53<10:57,  1.06s/it]
Training (164 / 500 Steps) (loss=1.90357):  21%|| 163/782 [03:54<10:57,  1.06s/it]
Training (164 / 500 Steps) (loss=1.90357):  21%|| 164/782 [03:54<10:55,  1.06s/it]
Training (165 / 500 Steps) (loss=1.92317):  21%|| 164/782 [03:55<10:55,  1.06s/it]
Training (165 / 500 Steps) (loss=1.92317):  21%|| 165/782 [03:55<10:54,  1.06s/it]
Training (166 / 500 Steps) (loss=2.02720):  21%|| 165/782 [03:56<10:54,  1.06s/it]
Training (166 / 500 Steps) (loss=2.02720):  21%|| 166/782 [03:56<10:52,  1.06s/it]
Training (167 / 500 Steps) (loss=1.97539):  21%|| 166/782 [03:57<10:52,  1.06s/it]
Training (167 / 500 Steps) (loss=1.97539):  21%|| 167/782 [03:57<10:51,  1.06s/it]
Training (168 / 500 Steps) (loss=1.79644):  21%|| 167/782 [03:58<10:51,  1.06s/it]
Training (168 / 500 Steps) (loss=1.79644):  21%|| 168/782 [03:58<10:50,  1.06s/it]
Training (169 / 500 Steps) (loss=2.04522):  21%|| 168/782 [03:59<10:50,  1.06s/it]
Training (169 / 500 Steps) (loss=2.04522):  22%|| 169/782 [03:59<10:49,  1.06s/it]
Training (170 / 500 Steps) (loss=2.04152):  22%|| 169/782 [04:00<10:49,  1.06s/it]
Training (170 / 500 Steps) (loss=2.04152):  22%|| 170/782 [04:00<10:47,  1.06s/it]
Training (171 / 500 Steps) (loss=2.06425):  22%|| 170/782 [04:02<10:47,  1.06s/it]
Training (171 / 500 Steps) (loss=2.06425):  22%|| 171/782 [04:02<10:45,  1.06s/it]
Training (172 / 500 Steps) (loss=1.97787):  22%|| 171/782 [04:03<10:45,  1.06s/it]
Training (172 / 500 Steps) (loss=1.97787):  22%|| 172/782 [04:03<10:45,  1.06s/it]
Training (173 / 500 Steps) (loss=1.94621):  22%|| 172/782 [04:04<10:45,  1.06s/it]
Training (173 / 500 Steps) (loss=1.94621):  22%|| 173/782 [04:04<10:45,  1.06s/it]
Training (174 / 500 Steps) (loss=2.04405):  22%|| 173/782 [04:05<10:45,  1.06s/it]
Training (174 / 500 Steps) (loss=2.04405):  22%|| 174/782 [04:05<10:45,  1.06s/it]
Training (175 / 500 Steps) (loss=2.15595):  22%|| 174/782 [04:06<10:45,  1.06s/it]
Training (175 / 500 Steps) (loss=2.15595):  22%|| 175/782 [04:06<10:44,  1.06s/it]
Training (176 / 500 Steps) (loss=1.77521):  22%|| 175/782 [04:07<10:44,  1.06s/it]
Training (176 / 500 Steps) (loss=1.77521):  23%|| 176/782 [04:07<10:43,  1.06s/it]
Training (177 / 500 Steps) (loss=1.96979):  23%|| 176/782 [04:08<10:43,  1.06s/it]
Training (177 / 500 Steps) (loss=1.96979):  23%|| 177/782 [04:08<10:43,  1.06s/it]
Training (178 / 500 Steps) (loss=2.10802):  23%|| 177/782 [04:09<10:43,  1.06s/it]
Training (178 / 500 Steps) (loss=2.10802):  23%|| 178/782 [04:09<10:41,  1.06s/it]
Training (179 / 500 Steps) (loss=2.17881):  23%|| 178/782 [04:10<10:41,  1.06s/it]
Training (179 / 500 Steps) (loss=2.17881):  23%|| 179/782 [04:10<10:39,  1.06s/it]
Training (180 / 500 Steps) (loss=1.84575):  23%|| 179/782 [04:11<10:39,  1.06s/it]
Training (180 / 500 Steps) (loss=1.84575):  23%|| 180/782 [04:11<10:39,  1.06s/it]
Training (181 / 500 Steps) (loss=2.08683):  23%|| 180/782 [04:12<10:39,  1.06s/it]
Training (181 / 500 Steps) (loss=2.08683):  23%|| 181/782 [04:12<10:37,  1.06s/it]
Training (182 / 500 Steps) (loss=1.96619):  23%|| 181/782 [04:13<10:37,  1.06s/it]
Training (182 / 500 Steps) (loss=1.96619):  23%|| 182/782 [04:13<10:35,  1.06s/it]
Training (183 / 500 Steps) (loss=2.16530):  23%|| 182/782 [04:14<10:35,  1.06s/it]
Training (183 / 500 Steps) (loss=2.16530):  23%|| 183/782 [04:14<10:34,  1.06s/it]
Training (184 / 500 Steps) (loss=2.04838):  23%|| 183/782 [04:15<10:34,  1.06s/it]
Training (184 / 500 Steps) (loss=2.04838):  24%|| 184/782 [04:15<10:33,  1.06s/it]
Training (185 / 500 Steps) (loss=1.93895):  24%|| 184/782 [04:16<10:33,  1.06s/it]
Training (185 / 500 Steps) (loss=1.93895):  24%|| 185/782 [04:16<10:31,  1.06s/it]
Training (186 / 500 Steps) (loss=2.14315):  24%|| 185/782 [04:17<10:31,  1.06s/it]
Training (186 / 500 Steps) (loss=2.14315):  24%|| 186/782 [04:17<10:31,  1.06s/it]
Training (187 / 500 Steps) (loss=1.86138):  24%|| 186/782 [04:19<10:31,  1.06s/it]
Training (187 / 500 Steps) (loss=1.86138):  24%|| 187/782 [04:19<10:29,  1.06s/it]
Training (188 / 500 Steps) (loss=1.98093):  24%|| 187/782 [04:20<10:29,  1.06s/it]
Training (188 / 500 Steps) (loss=1.98093):  24%|| 188/782 [04:20<10:27,  1.06s/it]
Training (189 / 500 Steps) (loss=2.12659):  24%|| 188/782 [04:21<10:27,  1.06s/it]
Training (189 / 500 Steps) (loss=2.12659):  24%|| 189/782 [04:21<10:28,  1.06s/it]
Training (190 / 500 Steps) (loss=1.93118):  24%|| 189/782 [04:22<10:28,  1.06s/it]
Training (190 / 500 Steps) (loss=1.93118):  24%|| 190/782 [04:22<10:27,  1.06s/it]
Training (191 / 500 Steps) (loss=2.04132):  24%|| 190/782 [04:23<10:27,  1.06s/it]
Training (191 / 500 Steps) (loss=2.04132):  24%|| 191/782 [04:23<10:26,  1.06s/it]
Training (192 / 500 Steps) (loss=2.08756):  24%|| 191/782 [04:24<10:26,  1.06s/it]
Training (192 / 500 Steps) (loss=2.08756):  25%|| 192/782 [04:24<10:25,  1.06s/it]
Training (193 / 500 Steps) (loss=2.08309):  25%|| 192/782 [04:25<10:25,  1.06s/it]
Training (193 / 500 Steps) (loss=2.08309):  25%|| 193/782 [04:25<10:23,  1.06s/it]
Training (194 / 500 Steps) (loss=2.08341):  25%|| 193/782 [04:26<10:23,  1.06s/it]
Training (194 / 500 Steps) (loss=2.08341):  25%|| 194/782 [04:26<10:21,  1.06s/it]
Training (195 / 500 Steps) (loss=2.06028):  25%|| 194/782 [04:27<10:21,  1.06s/it]
Training (195 / 500 Steps) (loss=2.06028):  25%|| 195/782 [04:27<10:20,  1.06s/it]
Training (196 / 500 Steps) (loss=2.03513):  25%|| 195/782 [04:28<10:20,  1.06s/it]
Training (196 / 500 Steps) (loss=2.03513):  25%|| 196/782 [04:28<10:19,  1.06s/it]
Training (197 / 500 Steps) (loss=1.89645):  25%|| 196/782 [04:29<10:19,  1.06s/it]
Training (197 / 500 Steps) (loss=1.89645):  25%|| 197/782 [04:29<10:17,  1.06s/it]
Training (198 / 500 Steps) (loss=2.10809):  25%|| 197/782 [04:30<10:17,  1.06s/it]
Training (198 / 500 Steps) (loss=2.10809):  25%|| 198/782 [04:30<10:15,  1.05s/it]
Training (199 / 500 Steps) (loss=1.88052):  25%|| 198/782 [04:31<10:15,  1.05s/it]
Training (199 / 500 Steps) (loss=1.88052):  25%|| 199/782 [04:31<10:14,  1.05s/it]
Training (200 / 500 Steps) (loss=2.05008):  25%|| 199/782 [04:32<10:14,  1.05s/it]
10/17/2022 06:24:32 - INFO - __main__ - ***** Running Validation *****
10/17/2022 06:24:32 - INFO - __main__ -   Num steps = 157
10/17/2022 06:24:32 - INFO - __main__ -   Batch size = 64


Validating... (loss=X.X):   0%|| 0/157 [00:00<?, ?it/s]

Validating... (loss=1.86944):   0%|| 0/157 [00:01<?, ?it/s]

Validating... (loss=1.86944):   1%|| 1/157 [00:01<02:59,  1.15s/it]

Validating... (loss=2.03485):   1%|| 1/157 [00:01<02:59,  1.15s/it]

Validating... (loss=2.03485):   1%|| 2/157 [00:01<01:42,  1.51it/s]

Validating... (loss=2.07585):   1%|| 2/157 [00:01<01:42,  1.51it/s]

Validating... (loss=2.07585):   2%|| 3/157 [00:01<01:17,  1.99it/s]

Validating... (loss=1.98396):   2%|| 3/157 [00:02<01:17,  1.99it/s]

Validating... (loss=1.98396):   3%|| 4/157 [00:02<01:05,  2.33it/s]

Validating... (loss=1.94871):   3%|| 4/157 [00:02<01:05,  2.33it/s]

Validating... (loss=1.94871):   3%|| 5/157 [00:02<00:59,  2.57it/s]

Validating... (loss=2.13938):   3%|| 5/157 [00:02<00:59,  2.57it/s]

Validating... (loss=2.13938):   4%|| 6/157 [00:02<00:55,  2.74it/s]

Validating... (loss=2.00701):   4%|| 6/157 [00:03<00:55,  2.74it/s]

Validating... (loss=2.00701):   4%|| 7/157 [00:03<00:52,  2.87it/s]

Validating... (loss=1.81772):   4%|| 7/157 [00:03<00:52,  2.87it/s]

Validating... (loss=1.81772):   5%|| 8/157 [00:03<00:50,  2.95it/s]

Validating... (loss=1.97038):   5%|| 8/157 [00:03<00:50,  2.95it/s]

Validating... (loss=1.97038):   6%|| 9/157 [00:03<00:49,  3.01it/s]

Validating... (loss=1.96803):   6%|| 9/157 [00:04<00:49,  3.01it/s]

Validating... (loss=1.96803):   6%|| 10/157 [00:04<00:48,  3.05it/s]

Validating... (loss=2.19350):   6%|| 10/157 [00:04<00:48,  3.05it/s]

Validating... (loss=2.19350):   7%|| 11/157 [00:04<00:47,  3.08it/s]

Validating... (loss=2.03261):   7%|| 11/157 [00:04<00:47,  3.08it/s]

Validating... (loss=2.03261):   8%|| 12/157 [00:04<00:46,  3.10it/s]

Validating... (loss=2.05421):   8%|| 12/157 [00:04<00:46,  3.10it/s]

Validating... (loss=2.05421):   8%|| 13/157 [00:04<00:46,  3.12it/s]

Validating... (loss=1.88471):   8%|| 13/157 [00:05<00:46,  3.12it/s]

Validating... (loss=1.88471):   9%|| 14/157 [00:05<00:45,  3.13it/s]

Validating... (loss=1.99959):   9%|| 14/157 [00:05<00:45,  3.13it/s]

Validating... (loss=1.99959):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.94571):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.94571):  10%|| 16/157 [00:05<00:44,  3.14it/s]

Validating... (loss=2.03189):  10%|| 16/157 [00:06<00:44,  3.14it/s]

Validating... (loss=2.03189):  11%|| 17/157 [00:06<00:44,  3.14it/s]

Validating... (loss=1.97223):  11%|| 17/157 [00:06<00:44,  3.14it/s]

Validating... (loss=1.97223):  11%|| 18/157 [00:06<00:44,  3.14it/s]

Validating... (loss=1.94048):  11%|| 18/157 [00:06<00:44,  3.14it/s]

Validating... (loss=1.94048):  12%|| 19/157 [00:06<00:44,  3.12it/s]

Validating... (loss=2.11086):  12%|| 19/157 [00:07<00:44,  3.12it/s]

Validating... (loss=2.11086):  13%|| 20/157 [00:07<00:43,  3.12it/s]

Validating... (loss=1.97585):  13%|| 20/157 [00:07<00:43,  3.12it/s]

Validating... (loss=1.97585):  13%|| 21/157 [00:07<00:43,  3.13it/s]

Validating... (loss=1.98398):  13%|| 21/157 [00:07<00:43,  3.13it/s]

Validating... (loss=1.98398):  14%|| 22/157 [00:07<00:42,  3.14it/s]

Validating... (loss=1.79474):  14%|| 22/157 [00:08<00:42,  3.14it/s]

Validating... (loss=1.79474):  15%|| 23/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.87251):  15%|| 23/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.87251):  15%|| 24/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.91775):  15%|| 24/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.91775):  16%|| 25/157 [00:08<00:42,  3.14it/s]

Validating... (loss=1.96551):  16%|| 25/157 [00:09<00:42,  3.14it/s]

Validating... (loss=1.96551):  17%|| 26/157 [00:09<00:41,  3.14it/s]

Validating... (loss=1.90805):  17%|| 26/157 [00:09<00:41,  3.14it/s]

Validating... (loss=1.90805):  17%|| 27/157 [00:09<00:41,  3.14it/s]

Validating... (loss=1.78223):  17%|| 27/157 [00:09<00:41,  3.14it/s]

Validating... (loss=1.78223):  18%|| 28/157 [00:09<00:41,  3.15it/s]

Validating... (loss=2.07735):  18%|| 28/157 [00:10<00:41,  3.15it/s]

Validating... (loss=2.07735):  18%|| 29/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.99747):  18%|| 29/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.99747):  19%|| 30/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.95838):  19%|| 30/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.95838):  20%|| 31/157 [00:10<00:40,  3.14it/s]

Validating... (loss=1.90873):  20%|| 31/157 [00:11<00:40,  3.14it/s]

Validating... (loss=1.90873):  20%|| 32/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.93684):  20%|| 32/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.93684):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.94780):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.94780):  22%|| 34/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.91293):  22%|| 34/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.91293):  22%|| 35/157 [00:11<00:38,  3.15it/s]

Validating... (loss=2.01004):  22%|| 35/157 [00:12<00:38,  3.15it/s]

Validating... (loss=2.01004):  23%|| 36/157 [00:12<00:38,  3.14it/s]

Validating... (loss=1.93956):  23%|| 36/157 [00:12<00:38,  3.14it/s]

Validating... (loss=1.93956):  24%|| 37/157 [00:12<00:38,  3.15it/s]

Validating... (loss=2.06394):  24%|| 37/157 [00:12<00:38,  3.15it/s]

Validating... (loss=2.06394):  24%|| 38/157 [00:12<00:37,  3.15it/s]

Validating... (loss=1.89266):  24%|| 38/157 [00:13<00:37,  3.15it/s]

Validating... (loss=1.89266):  25%|| 39/157 [00:13<00:37,  3.15it/s]

Validating... (loss=2.06991):  25%|| 39/157 [00:13<00:37,  3.15it/s]

Validating... (loss=2.06991):  25%|| 40/157 [00:13<00:37,  3.15it/s]

Validating... (loss=1.91859):  25%|| 40/157 [00:13<00:37,  3.15it/s]

Validating... (loss=1.91859):  26%|| 41/157 [00:13<00:36,  3.15it/s]

Validating... (loss=1.84972):  26%|| 41/157 [00:14<00:36,  3.15it/s]

Validating... (loss=1.84972):  27%|| 42/157 [00:14<00:36,  3.15it/s]

Validating... (loss=1.88879):  27%|| 42/157 [00:14<00:36,  3.15it/s]

Validating... (loss=1.88879):  27%|| 43/157 [00:14<00:36,  3.16it/s]

Validating... (loss=2.11289):  27%|| 43/157 [00:14<00:36,  3.16it/s]

Validating... (loss=2.11289):  28%|| 44/157 [00:14<00:35,  3.16it/s]

Validating... (loss=2.10568):  28%|| 44/157 [00:15<00:35,  3.16it/s]

Validating... (loss=2.10568):  29%|| 45/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.95147):  29%|| 45/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.95147):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.88425):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.88425):  30%|| 47/157 [00:15<00:34,  3.16it/s]

Validating... (loss=1.88215):  30%|| 47/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.88215):  31%|| 48/157 [00:16<00:34,  3.16it/s]

Validating... (loss=2.00863):  31%|| 48/157 [00:16<00:34,  3.16it/s]

Validating... (loss=2.00863):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.98821):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.98821):  32%|| 50/157 [00:16<00:33,  3.15it/s]

Validating... (loss=2.16413):  32%|| 50/157 [00:17<00:33,  3.15it/s]

Validating... (loss=2.16413):  32%|| 51/157 [00:17<00:33,  3.15it/s]

Validating... (loss=1.96137):  32%|| 51/157 [00:17<00:33,  3.15it/s]

Validating... (loss=1.96137):  33%|| 52/157 [00:17<00:33,  3.15it/s]

Validating... (loss=1.86000):  33%|| 52/157 [00:17<00:33,  3.15it/s]

Validating... (loss=1.86000):  34%|| 53/157 [00:17<00:33,  3.15it/s]

Validating... (loss=1.97848):  34%|| 53/157 [00:17<00:33,  3.15it/s]

Validating... (loss=1.97848):  34%|| 54/157 [00:17<00:32,  3.15it/s]

Validating... (loss=2.07472):  34%|| 54/157 [00:18<00:32,  3.15it/s]

Validating... (loss=2.07472):  35%|| 55/157 [00:18<00:32,  3.16it/s]

Validating... (loss=1.89011):  35%|| 55/157 [00:18<00:32,  3.16it/s]

Validating... (loss=1.89011):  36%|| 56/157 [00:18<00:32,  3.15it/s]

Validating... (loss=1.99740):  36%|| 56/157 [00:18<00:32,  3.15it/s]

Validating... (loss=1.99740):  36%|| 57/157 [00:18<00:31,  3.15it/s]

Validating... (loss=1.81359):  36%|| 57/157 [00:19<00:31,  3.15it/s]

Validating... (loss=1.81359):  37%|| 58/157 [00:19<00:31,  3.15it/s]

Validating... (loss=2.05534):  37%|| 58/157 [00:19<00:31,  3.15it/s]

Validating... (loss=2.05534):  38%|| 59/157 [00:19<00:31,  3.15it/s]

Validating... (loss=1.86828):  38%|| 59/157 [00:19<00:31,  3.15it/s]

Validating... (loss=1.86828):  38%|| 60/157 [00:19<00:30,  3.15it/s]

Validating... (loss=1.96234):  38%|| 60/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.96234):  39%|| 61/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.86146):  39%|| 61/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.86146):  39%|| 62/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.98090):  39%|| 62/157 [00:20<00:30,  3.15it/s]

Validating... (loss=1.98090):  40%|| 63/157 [00:20<00:29,  3.16it/s]

Validating... (loss=1.99005):  40%|| 63/157 [00:21<00:29,  3.16it/s]

Validating... (loss=1.99005):  41%|| 64/157 [00:21<00:29,  3.16it/s]

Validating... (loss=2.07057):  41%|| 64/157 [00:21<00:29,  3.16it/s]

Validating... (loss=2.07057):  41%|| 65/157 [00:21<00:29,  3.16it/s]

Validating... (loss=1.98762):  41%|| 65/157 [00:21<00:29,  3.16it/s]

Validating... (loss=1.98762):  42%|| 66/157 [00:21<00:28,  3.16it/s]

Validating... (loss=1.90119):  42%|| 66/157 [00:22<00:28,  3.16it/s]

Validating... (loss=1.90119):  43%|| 67/157 [00:22<00:28,  3.16it/s]

Validating... (loss=1.93364):  43%|| 67/157 [00:22<00:28,  3.16it/s]

Validating... (loss=1.93364):  43%|| 68/157 [00:22<00:28,  3.16it/s]

Validating... (loss=1.90612):  43%|| 68/157 [00:22<00:28,  3.16it/s]

Validating... (loss=1.90612):  44%|| 69/157 [00:22<00:27,  3.16it/s]

Validating... (loss=1.91130):  44%|| 69/157 [00:23<00:27,  3.16it/s]

Validating... (loss=1.91130):  45%|| 70/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.05313):  45%|| 70/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.05313):  45%|| 71/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.13242):  45%|| 71/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.13242):  46%|| 72/157 [00:23<00:26,  3.16it/s]

Validating... (loss=1.81916):  46%|| 72/157 [00:23<00:26,  3.16it/s]

Validating... (loss=1.81916):  46%|| 73/157 [00:23<00:26,  3.16it/s]

Validating... (loss=1.99917):  46%|| 73/157 [00:24<00:26,  3.16it/s]

Validating... (loss=1.99917):  47%|| 74/157 [00:24<00:26,  3.15it/s]

Validating... (loss=2.12505):  47%|| 74/157 [00:24<00:26,  3.15it/s]

Validating... (loss=2.12505):  48%|| 75/157 [00:24<00:26,  3.15it/s]

Validating... (loss=1.83083):  48%|| 75/157 [00:24<00:26,  3.15it/s]

Validating... (loss=1.83083):  48%|| 76/157 [00:24<00:25,  3.15it/s]

Validating... (loss=1.81326):  48%|| 76/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.81326):  49%|| 77/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.81749):  49%|| 77/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.81749):  50%|| 78/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.98629):  50%|| 78/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.98629):  50%|| 79/157 [00:25<00:24,  3.15it/s]

Validating... (loss=1.90219):  50%|| 79/157 [00:26<00:24,  3.15it/s]

Validating... (loss=1.90219):  51%|| 80/157 [00:26<00:24,  3.15it/s]

Validating... (loss=2.01912):  51%|| 80/157 [00:26<00:24,  3.15it/s]

Validating... (loss=2.01912):  52%|| 81/157 [00:26<00:24,  3.15it/s]

Validating... (loss=1.97943):  52%|| 81/157 [00:26<00:24,  3.15it/s]

Validating... (loss=1.97943):  52%|| 82/157 [00:26<00:23,  3.15it/s]

Validating... (loss=1.89908):  52%|| 82/157 [00:27<00:23,  3.15it/s]

Validating... (loss=1.89908):  53%|| 83/157 [00:27<00:23,  3.16it/s]

Validating... (loss=1.97435):  53%|| 83/157 [00:27<00:23,  3.16it/s]

Validating... (loss=1.97435):  54%|| 84/157 [00:27<00:23,  3.16it/s]

Validating... (loss=1.97532):  54%|| 84/157 [00:27<00:23,  3.16it/s]

Validating... (loss=1.97532):  54%|| 85/157 [00:27<00:22,  3.16it/s]

Validating... (loss=1.99493):  54%|| 85/157 [00:28<00:22,  3.16it/s]

Validating... (loss=1.99493):  55%|| 86/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.12785):  55%|| 86/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.12785):  55%|| 87/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.15505):  55%|| 87/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.15505):  56%|| 88/157 [00:28<00:21,  3.15it/s]

Validating... (loss=2.13224):  56%|| 88/157 [00:29<00:21,  3.15it/s]

Validating... (loss=2.13224):  57%|| 89/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.88718):  57%|| 89/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.88718):  57%|| 90/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.87169):  57%|| 90/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.87169):  58%|| 91/157 [00:29<00:20,  3.15it/s]

Validating... (loss=1.98527):  58%|| 91/157 [00:30<00:20,  3.15it/s]

Validating... (loss=1.98527):  59%|| 92/157 [00:30<00:20,  3.15it/s]

Validating... (loss=1.88963):  59%|| 92/157 [00:30<00:20,  3.15it/s]

Validating... (loss=1.88963):  59%|| 93/157 [00:30<00:20,  3.15it/s]

Validating... (loss=2.10224):  59%|| 93/157 [00:30<00:20,  3.15it/s]

Validating... (loss=2.10224):  60%|| 94/157 [00:30<00:19,  3.15it/s]

Validating... (loss=1.92989):  60%|| 94/157 [00:30<00:19,  3.15it/s]

Validating... (loss=1.92989):  61%|| 95/157 [00:30<00:19,  3.15it/s]

Validating... (loss=2.08764):  61%|| 95/157 [00:31<00:19,  3.15it/s]

Validating... (loss=2.08764):  61%|| 96/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.99775):  61%|| 96/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.99775):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.90177):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.90177):  62%|| 98/157 [00:31<00:18,  3.15it/s]

Validating... (loss=1.86669):  62%|| 98/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.86669):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.98455):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.98455):  64%|| 100/157 [00:32<00:18,  3.14it/s]

Validating... (loss=1.99722):  64%|| 100/157 [00:32<00:18,  3.14it/s]

Validating... (loss=1.99722):  64%|| 101/157 [00:32<00:17,  3.14it/s]

Validating... (loss=1.90499):  64%|| 101/157 [00:33<00:17,  3.14it/s]

Validating... (loss=1.90499):  65%|| 102/157 [00:33<00:17,  3.14it/s]

Validating... (loss=2.09689):  65%|| 102/157 [00:33<00:17,  3.14it/s]

Validating... (loss=2.09689):  66%|| 103/157 [00:33<00:17,  3.13it/s]

Validating... (loss=1.87516):  66%|| 103/157 [00:33<00:17,  3.13it/s]

Validating... (loss=1.87516):  66%|| 104/157 [00:33<00:16,  3.14it/s]

Validating... (loss=2.06371):  66%|| 104/157 [00:34<00:16,  3.14it/s]

Validating... (loss=2.06371):  67%|| 105/157 [00:34<00:16,  3.14it/s]

Validating... (loss=2.00616):  67%|| 105/157 [00:34<00:16,  3.14it/s]

Validating... (loss=2.00616):  68%|| 106/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.89104):  68%|| 106/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.89104):  68%|| 107/157 [00:34<00:15,  3.15it/s]

Validating... (loss=2.14815):  68%|| 107/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.14815):  69%|| 108/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.08395):  69%|| 108/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.08395):  69%|| 109/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.00865):  69%|| 109/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.00865):  70%|| 110/157 [00:35<00:14,  3.15it/s]

Validating... (loss=1.97192):  70%|| 110/157 [00:36<00:14,  3.15it/s]

Validating... (loss=1.97192):  71%|| 111/157 [00:36<00:14,  3.14it/s]

Validating... (loss=1.87703):  71%|| 111/157 [00:36<00:14,  3.14it/s]

Validating... (loss=1.87703):  71%|| 112/157 [00:36<00:14,  3.15it/s]

Validating... (loss=2.02453):  71%|| 112/157 [00:36<00:14,  3.15it/s]

Validating... (loss=2.02453):  72%|| 113/157 [00:36<00:13,  3.15it/s]

Validating... (loss=1.75588):  72%|| 113/157 [00:37<00:13,  3.15it/s]

Validating... (loss=1.75588):  73%|| 114/157 [00:37<00:13,  3.15it/s]

Validating... (loss=1.87399):  73%|| 114/157 [00:37<00:13,  3.15it/s]

Validating... (loss=1.87399):  73%|| 115/157 [00:37<00:13,  3.15it/s]

Validating... (loss=2.02692):  73%|| 115/157 [00:37<00:13,  3.15it/s]

Validating... (loss=2.02692):  74%|| 116/157 [00:37<00:12,  3.15it/s]

Validating... (loss=1.84696):  74%|| 116/157 [00:37<00:12,  3.15it/s]

Validating... (loss=1.84696):  75%|| 117/157 [00:37<00:12,  3.15it/s]

Validating... (loss=1.95146):  75%|| 117/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.95146):  75%|| 118/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.79462):  75%|| 118/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.79462):  76%|| 119/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.85544):  76%|| 119/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.85544):  76%|| 120/157 [00:38<00:11,  3.15it/s]

Validating... (loss=1.82963):  76%|| 120/157 [00:39<00:11,  3.15it/s]

Validating... (loss=1.82963):  77%|| 121/157 [00:39<00:11,  3.15it/s]

Validating... (loss=2.03755):  77%|| 121/157 [00:39<00:11,  3.15it/s]

Validating... (loss=2.03755):  78%|| 122/157 [00:39<00:11,  3.15it/s]

Validating... (loss=2.04232):  78%|| 122/157 [00:39<00:11,  3.15it/s]

Validating... (loss=2.04232):  78%|| 123/157 [00:39<00:10,  3.16it/s]

Validating... (loss=1.84090):  78%|| 123/157 [00:40<00:10,  3.16it/s]

Validating... (loss=1.84090):  79%|| 124/157 [00:40<00:10,  3.16it/s]

Validating... (loss=2.06802):  79%|| 124/157 [00:40<00:10,  3.16it/s]

Validating... (loss=2.06802):  80%|| 125/157 [00:40<00:10,  3.14it/s]

Validating... (loss=1.93155):  80%|| 125/157 [00:40<00:10,  3.14it/s]

Validating... (loss=1.93155):  80%|| 126/157 [00:40<00:09,  3.14it/s]

Validating... (loss=1.94621):  80%|| 126/157 [00:41<00:09,  3.14it/s]

Validating... (loss=1.94621):  81%|| 127/157 [00:41<00:09,  3.14it/s]

Validating... (loss=1.97257):  81%|| 127/157 [00:41<00:09,  3.14it/s]

Validating... (loss=1.97257):  82%|| 128/157 [00:41<00:09,  3.14it/s]

Validating... (loss=2.03256):  82%|| 128/157 [00:41<00:09,  3.14it/s]

Validating... (loss=2.03256):  82%|| 129/157 [00:41<00:08,  3.14it/s]

Validating... (loss=2.19086):  82%|| 129/157 [00:42<00:08,  3.14it/s]

Validating... (loss=2.19086):  83%|| 130/157 [00:42<00:08,  3.15it/s]

Validating... (loss=1.93640):  83%|| 130/157 [00:42<00:08,  3.15it/s]

Validating... (loss=1.93640):  83%|| 131/157 [00:42<00:08,  3.15it/s]

Validating... (loss=2.03343):  83%|| 131/157 [00:42<00:08,  3.15it/s]

Validating... (loss=2.03343):  84%|| 132/157 [00:42<00:07,  3.15it/s]

Validating... (loss=2.02568):  84%|| 132/157 [00:43<00:07,  3.15it/s]

Validating... (loss=2.02568):  85%|| 133/157 [00:43<00:07,  3.15it/s]

Validating... (loss=2.04730):  85%|| 133/157 [00:43<00:07,  3.15it/s]

Validating... (loss=2.04730):  85%|| 134/157 [00:43<00:07,  3.15it/s]

Validating... (loss=2.07601):  85%|| 134/157 [00:43<00:07,  3.15it/s]

Validating... (loss=2.07601):  86%|| 135/157 [00:43<00:06,  3.15it/s]

Validating... (loss=1.94973):  86%|| 135/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.94973):  87%|| 136/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.95277):  87%|| 136/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.95277):  87%|| 137/157 [00:44<00:06,  3.14it/s]

Validating... (loss=2.02534):  87%|| 137/157 [00:44<00:06,  3.14it/s]

Validating... (loss=2.02534):  88%|| 138/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.94868):  88%|| 138/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.94868):  89%|| 139/157 [00:44<00:05,  3.14it/s]

Validating... (loss=1.81014):  89%|| 139/157 [00:45<00:05,  3.14it/s]

Validating... (loss=1.81014):  89%|| 140/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.91197):  89%|| 140/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.91197):  90%|| 141/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.95140):  90%|| 141/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.95140):  90%|| 142/157 [00:45<00:04,  3.15it/s]

Validating... (loss=2.08792):  90%|| 142/157 [00:46<00:04,  3.15it/s]

Validating... (loss=2.08792):  91%|| 143/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.84247):  91%|| 143/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.84247):  92%|| 144/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.88032):  92%|| 144/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.88032):  92%|| 145/157 [00:46<00:03,  3.15it/s]

Validating... (loss=2.08074):  92%|| 145/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.08074):  93%|| 146/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.11587):  93%|| 146/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.11587):  94%|| 147/157 [00:47<00:03,  3.14it/s]

Validating... (loss=2.01737):  94%|| 147/157 [00:47<00:03,  3.14it/s]

Validating... (loss=2.01737):  94%|| 148/157 [00:47<00:02,  3.13it/s]

Validating... (loss=1.99677):  94%|| 148/157 [00:48<00:02,  3.13it/s]

Validating... (loss=1.99677):  95%|| 149/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.87059):  95%|| 149/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.87059):  96%|| 150/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.74333):  96%|| 150/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.74333):  96%|| 151/157 [00:48<00:01,  3.14it/s]

Validating... (loss=2.01439):  96%|| 151/157 [00:49<00:01,  3.14it/s]

Validating... (loss=2.01439):  97%|| 152/157 [00:49<00:01,  3.14it/s]

Validating... (loss=2.11559):  97%|| 152/157 [00:49<00:01,  3.14it/s]

Validating... (loss=2.11559):  97%|| 153/157 [00:49<00:01,  3.14it/s]

Validating... (loss=1.83151):  97%|| 153/157 [00:49<00:01,  3.14it/s]

Validating... (loss=1.83151):  98%|| 154/157 [00:49<00:00,  3.15it/s]

Validating... (loss=1.94628):  98%|| 154/157 [00:50<00:00,  3.15it/s]

Validating... (loss=1.94628):  99%|| 155/157 [00:50<00:00,  3.15it/s]

Validating... (loss=1.89522):  99%|| 155/157 [00:50<00:00,  3.15it/s]

Validating... (loss=1.89522):  99%|| 156/157 [00:50<00:00,  3.15it/s]

Validating... (loss=1.86536):  99%|| 156/157 [00:50<00:00,  3.15it/s]
Validating... (loss=1.86536): 100%|| 157/157 [00:50<00:00,  3.10it/s]
10/17/2022 06:25:22 - INFO - __main__ - Validation Results
10/17/2022 06:25:22 - INFO - __main__ - Global Steps: 200
10/17/2022 06:25:22 - INFO - __main__ - Valid Loss: 1.96698
10/17/2022 06:25:22 - INFO - __main__ - Valid Accuracy: 0.26430

Training (200 / 500 Steps) (loss=2.05008):  26%|| 200/782 [05:23<2:37:39, 16.25s/it]
Training (201 / 500 Steps) (loss=2.12632):  26%|| 200/782 [05:24<2:37:39, 16.25s/it]
Training (201 / 500 Steps) (loss=2.12632):  26%|| 201/782 [05:24<1:53:26, 11.72s/it]
Training (202 / 500 Steps) (loss=1.94596):  26%|| 201/782 [05:25<1:53:26, 11.72s/it]
Training (202 / 500 Steps) (loss=1.94596):  26%|| 202/782 [05:25<1:22:19,  8.52s/it]
Training (203 / 500 Steps) (loss=2.10949):  26%|| 202/782 [05:26<1:22:19,  8.52s/it]
Training (203 / 500 Steps) (loss=2.10949):  26%|| 203/782 [05:26<1:00:36,  6.28s/it]
Training (204 / 500 Steps) (loss=2.02859):  26%|| 203/782 [05:27<1:00:36,  6.28s/it]
Training (204 / 500 Steps) (loss=2.02859):  26%|| 204/782 [05:27<45:23,  4.71s/it]
Training (205 / 500 Steps) (loss=2.14163):  26%|| 204/782 [05:28<45:23,  4.71s/it]
Training (205 / 500 Steps) (loss=2.14163):  26%|| 205/782 [05:28<34:45,  3.62s/it]
Training (206 / 500 Steps) (loss=2.09244):  26%|| 205/782 [05:29<34:45,  3.62s/it]
Training (206 / 500 Steps) (loss=2.09244):  26%|| 206/782 [05:29<27:20,  2.85s/it]
Training (207 / 500 Steps) (loss=1.87694):  26%|| 206/782 [05:30<27:20,  2.85s/it]
Training (207 / 500 Steps) (loss=1.87694):  26%|| 207/782 [05:30<22:08,  2.31s/it]
Training (208 / 500 Steps) (loss=1.78510):  26%|| 207/782 [05:31<22:08,  2.31s/it]
Training (208 / 500 Steps) (loss=1.78510):  27%|| 208/782 [05:31<18:30,  1.93s/it]
Training (209 / 500 Steps) (loss=2.06522):  27%|| 208/782 [05:32<18:30,  1.93s/it]
Training (209 / 500 Steps) (loss=2.06522):  27%|| 209/782 [05:32<15:58,  1.67s/it]
Training (210 / 500 Steps) (loss=2.05199):  27%|| 209/782 [05:34<15:58,  1.67s/it]
Training (210 / 500 Steps) (loss=2.05199):  27%|| 210/782 [05:34<14:10,  1.49s/it]
Training (211 / 500 Steps) (loss=2.10144):  27%|| 210/782 [05:35<14:10,  1.49s/it]
Training (211 / 500 Steps) (loss=2.10144):  27%|| 211/782 [05:35<12:54,  1.36s/it]
Training (212 / 500 Steps) (loss=2.35444):  27%|| 211/782 [05:36<12:54,  1.36s/it]
Training (212 / 500 Steps) (loss=2.35444):  27%|| 212/782 [05:36<12:01,  1.27s/it]
Training (213 / 500 Steps) (loss=1.97209):  27%|| 212/782 [05:37<12:01,  1.27s/it]
Training (213 / 500 Steps) (loss=1.97209):  27%|| 213/782 [05:37<11:24,  1.20s/it]
Training (214 / 500 Steps) (loss=1.85312):  27%|| 213/782 [05:38<11:24,  1.20s/it]
Training (214 / 500 Steps) (loss=1.85312):  27%|| 214/782 [05:38<10:57,  1.16s/it]
Training (215 / 500 Steps) (loss=1.88482):  27%|| 214/782 [05:39<10:57,  1.16s/it]
Training (215 / 500 Steps) (loss=1.88482):  27%|| 215/782 [05:39<10:39,  1.13s/it]
Training (216 / 500 Steps) (loss=2.02050):  27%|| 215/782 [05:40<10:39,  1.13s/it]
Training (216 / 500 Steps) (loss=2.02050):  28%|| 216/782 [05:40<10:26,  1.11s/it]
Training (217 / 500 Steps) (loss=1.60192):  28%|| 216/782 [05:41<10:26,  1.11s/it]
Training (217 / 500 Steps) (loss=1.60192):  28%|| 217/782 [05:41<10:16,  1.09s/it]
Training (218 / 500 Steps) (loss=1.88820):  28%|| 217/782 [05:42<10:16,  1.09s/it]
Training (218 / 500 Steps) (loss=1.88820):  28%|| 218/782 [05:42<10:08,  1.08s/it]
Training (219 / 500 Steps) (loss=2.01510):  28%|| 218/782 [05:43<10:08,  1.08s/it]
Training (219 / 500 Steps) (loss=2.01510):  28%|| 219/782 [05:43<10:03,  1.07s/it]
Training (220 / 500 Steps) (loss=2.11141):  28%|| 219/782 [05:44<10:03,  1.07s/it]
Training (220 / 500 Steps) (loss=2.11141):  28%|| 220/782 [05:44<09:59,  1.07s/it]
Training (221 / 500 Steps) (loss=1.91405):  28%|| 220/782 [05:45<09:59,  1.07s/it]
Training (221 / 500 Steps) (loss=1.91405):  28%|| 221/782 [05:45<09:56,  1.06s/it]
Training (222 / 500 Steps) (loss=1.99978):  28%|| 221/782 [05:46<09:56,  1.06s/it]
Training (222 / 500 Steps) (loss=1.99978):  28%|| 222/782 [05:46<09:54,  1.06s/it]
Training (223 / 500 Steps) (loss=1.76857):  28%|| 222/782 [05:47<09:54,  1.06s/it]
Training (223 / 500 Steps) (loss=1.76857):  29%|| 223/782 [05:47<09:52,  1.06s/it]
Training (224 / 500 Steps) (loss=1.97839):  29%|| 223/782 [05:48<09:52,  1.06s/it]
Training (224 / 500 Steps) (loss=1.97839):  29%|| 224/782 [05:48<09:51,  1.06s/it]
Training (225 / 500 Steps) (loss=2.05140):  29%|| 224/782 [05:49<09:51,  1.06s/it]
Training (225 / 500 Steps) (loss=2.05140):  29%|| 225/782 [05:49<09:49,  1.06s/it]
Training (226 / 500 Steps) (loss=2.13434):  29%|| 225/782 [05:50<09:49,  1.06s/it]
Training (226 / 500 Steps) (loss=2.13434):  29%|| 226/782 [05:50<09:49,  1.06s/it]
Training (227 / 500 Steps) (loss=2.16188):  29%|| 226/782 [05:51<09:49,  1.06s/it]
Training (227 / 500 Steps) (loss=2.16188):  29%|| 227/782 [05:51<09:47,  1.06s/it]
Training (228 / 500 Steps) (loss=1.89479):  29%|| 227/782 [05:53<09:47,  1.06s/it]
Training (228 / 500 Steps) (loss=1.89479):  29%|| 228/782 [05:53<09:46,  1.06s/it]
Training (229 / 500 Steps) (loss=1.90865):  29%|| 228/782 [05:54<09:46,  1.06s/it]
Training (229 / 500 Steps) (loss=1.90865):  29%|| 229/782 [05:54<09:45,  1.06s/it]
Training (230 / 500 Steps) (loss=1.79332):  29%|| 229/782 [05:55<09:45,  1.06s/it]
Training (230 / 500 Steps) (loss=1.79332):  29%|| 230/782 [05:55<09:43,  1.06s/it]
Training (231 / 500 Steps) (loss=1.99006):  29%|| 230/782 [05:56<09:43,  1.06s/it]
Training (231 / 500 Steps) (loss=1.99006):  30%|| 231/782 [05:56<09:41,  1.06s/it]
Training (232 / 500 Steps) (loss=1.94787):  30%|| 231/782 [05:57<09:41,  1.06s/it]
Training (232 / 500 Steps) (loss=1.94787):  30%|| 232/782 [05:57<09:40,  1.06s/it]
Training (233 / 500 Steps) (loss=1.99138):  30%|| 232/782 [05:58<09:40,  1.06s/it]
Training (233 / 500 Steps) (loss=1.99138):  30%|| 233/782 [05:58<09:39,  1.06s/it]
Training (234 / 500 Steps) (loss=2.02266):  30%|| 233/782 [05:59<09:39,  1.06s/it]
Training (234 / 500 Steps) (loss=2.02266):  30%|| 234/782 [05:59<09:37,  1.05s/it]
Training (235 / 500 Steps) (loss=2.06278):  30%|| 234/782 [06:00<09:37,  1.05s/it]
Training (235 / 500 Steps) (loss=2.06278):  30%|| 235/782 [06:00<09:37,  1.06s/it]
Training (236 / 500 Steps) (loss=2.27304):  30%|| 235/782 [06:01<09:37,  1.06s/it]
Training (236 / 500 Steps) (loss=2.27304):  30%|| 236/782 [06:01<09:37,  1.06s/it]
Training (237 / 500 Steps) (loss=2.00760):  30%|| 236/782 [06:02<09:37,  1.06s/it]
Training (237 / 500 Steps) (loss=2.00760):  30%|| 237/782 [06:02<09:34,  1.05s/it]
Training (238 / 500 Steps) (loss=2.03417):  30%|| 237/782 [06:03<09:34,  1.05s/it]
Training (238 / 500 Steps) (loss=2.03417):  30%|| 238/782 [06:03<09:32,  1.05s/it]
Training (239 / 500 Steps) (loss=1.96944):  30%|| 238/782 [06:04<09:32,  1.05s/it]
Training (239 / 500 Steps) (loss=1.96944):  31%|| 239/782 [06:04<09:32,  1.05s/it]
Training (240 / 500 Steps) (loss=2.14292):  31%|| 239/782 [06:05<09:32,  1.05s/it]
Training (240 / 500 Steps) (loss=2.14292):  31%|| 240/782 [06:05<09:30,  1.05s/it]
Training (241 / 500 Steps) (loss=1.98754):  31%|| 240/782 [06:06<09:30,  1.05s/it]
Training (241 / 500 Steps) (loss=1.98754):  31%|| 241/782 [06:06<09:29,  1.05s/it]
Training (242 / 500 Steps) (loss=2.05536):  31%|| 241/782 [06:07<09:29,  1.05s/it]
Training (242 / 500 Steps) (loss=2.05536):  31%|| 242/782 [06:07<09:30,  1.06s/it]
Training (243 / 500 Steps) (loss=2.04153):  31%|| 242/782 [06:08<09:30,  1.06s/it]
Training (243 / 500 Steps) (loss=2.04153):  31%|| 243/782 [06:08<09:30,  1.06s/it]
Training (244 / 500 Steps) (loss=2.14946):  31%|| 243/782 [06:09<09:30,  1.06s/it]
Training (244 / 500 Steps) (loss=2.14946):  31%|| 244/782 [06:09<09:28,  1.06s/it]
Training (245 / 500 Steps) (loss=1.96383):  31%|| 244/782 [06:10<09:28,  1.06s/it]
Training (245 / 500 Steps) (loss=1.96383):  31%|| 245/782 [06:10<09:27,  1.06s/it]
Training (246 / 500 Steps) (loss=2.03497):  31%|| 245/782 [06:12<09:27,  1.06s/it]
Training (246 / 500 Steps) (loss=2.03497):  31%|| 246/782 [06:12<09:26,  1.06s/it]
Training (247 / 500 Steps) (loss=1.92874):  31%|| 246/782 [06:13<09:26,  1.06s/it]
Training (247 / 500 Steps) (loss=1.92874):  32%|| 247/782 [06:13<09:24,  1.06s/it]
Training (248 / 500 Steps) (loss=1.85689):  32%|| 247/782 [06:14<09:24,  1.06s/it]
Training (248 / 500 Steps) (loss=1.85689):  32%|| 248/782 [06:14<09:24,  1.06s/it]
Training (249 / 500 Steps) (loss=1.99965):  32%|| 248/782 [06:15<09:24,  1.06s/it]
Training (249 / 500 Steps) (loss=1.99965):  32%|| 249/782 [06:15<09:23,  1.06s/it]
Training (250 / 500 Steps) (loss=2.08973):  32%|| 249/782 [06:16<09:23,  1.06s/it]
Training (250 / 500 Steps) (loss=2.08973):  32%|| 250/782 [06:16<09:21,  1.06s/it]
Training (251 / 500 Steps) (loss=2.06123):  32%|| 250/782 [06:17<09:21,  1.06s/it]
Training (251 / 500 Steps) (loss=2.06123):  32%|| 251/782 [06:17<09:20,  1.06s/it]
Training (252 / 500 Steps) (loss=1.99003):  32%|| 251/782 [06:18<09:20,  1.06s/it]
Training (252 / 500 Steps) (loss=1.99003):  32%|| 252/782 [06:18<09:19,  1.06s/it]
Training (253 / 500 Steps) (loss=1.73769):  32%|| 252/782 [06:19<09:19,  1.06s/it]
Training (253 / 500 Steps) (loss=1.73769):  32%|| 253/782 [06:19<09:19,  1.06s/it]
Training (254 / 500 Steps) (loss=1.82252):  32%|| 253/782 [06:20<09:19,  1.06s/it]
Training (254 / 500 Steps) (loss=1.82252):  32%|| 254/782 [06:20<09:19,  1.06s/it]
Training (255 / 500 Steps) (loss=1.93429):  32%|| 254/782 [06:21<09:19,  1.06s/it]
Training (255 / 500 Steps) (loss=1.93429):  33%|| 255/782 [06:21<09:17,  1.06s/it]
Training (256 / 500 Steps) (loss=2.06057):  33%|| 255/782 [06:22<09:17,  1.06s/it]
Training (256 / 500 Steps) (loss=2.06057):  33%|| 256/782 [06:22<09:15,  1.06s/it]
Training (257 / 500 Steps) (loss=1.85133):  33%|| 256/782 [06:23<09:15,  1.06s/it]
Training (257 / 500 Steps) (loss=1.85133):  33%|| 257/782 [06:23<09:14,  1.06s/it]
Training (258 / 500 Steps) (loss=1.99923):  33%|| 257/782 [06:24<09:14,  1.06s/it]
Training (258 / 500 Steps) (loss=1.99923):  33%|| 258/782 [06:24<09:14,  1.06s/it]
Training (259 / 500 Steps) (loss=1.99778):  33%|| 258/782 [06:25<09:14,  1.06s/it]
Training (259 / 500 Steps) (loss=1.99778):  33%|| 259/782 [06:25<09:13,  1.06s/it]
Training (260 / 500 Steps) (loss=2.14207):  33%|| 259/782 [06:26<09:13,  1.06s/it]
Training (260 / 500 Steps) (loss=2.14207):  33%|| 260/782 [06:26<09:11,  1.06s/it]
Training (261 / 500 Steps) (loss=2.21205):  33%|| 260/782 [06:27<09:11,  1.06s/it]
Training (261 / 500 Steps) (loss=2.21205):  33%|| 261/782 [06:27<09:10,  1.06s/it]
Training (262 / 500 Steps) (loss=1.95899):  33%|| 261/782 [06:28<09:10,  1.06s/it]
Training (262 / 500 Steps) (loss=1.95899):  34%|| 262/782 [06:28<09:08,  1.05s/it]
Training (263 / 500 Steps) (loss=1.96854):  34%|| 262/782 [06:30<09:08,  1.05s/it]
Training (263 / 500 Steps) (loss=1.96854):  34%|| 263/782 [06:30<09:07,  1.06s/it]
Training (264 / 500 Steps) (loss=2.06311):  34%|| 263/782 [06:31<09:07,  1.06s/it]
Training (264 / 500 Steps) (loss=2.06311):  34%|| 264/782 [06:31<09:06,  1.05s/it]
Training (265 / 500 Steps) (loss=2.04558):  34%|| 264/782 [06:32<09:06,  1.05s/it]
Training (265 / 500 Steps) (loss=2.04558):  34%|| 265/782 [06:32<09:05,  1.06s/it]
Training (266 / 500 Steps) (loss=2.14753):  34%|| 265/782 [06:33<09:05,  1.06s/it]
Training (266 / 500 Steps) (loss=2.14753):  34%|| 266/782 [06:33<09:03,  1.05s/it]
Training (267 / 500 Steps) (loss=2.03828):  34%|| 266/782 [06:34<09:03,  1.05s/it]
Training (267 / 500 Steps) (loss=2.03828):  34%|| 267/782 [06:34<09:02,  1.05s/it]
Training (268 / 500 Steps) (loss=2.14944):  34%|| 267/782 [06:35<09:02,  1.05s/it]
Training (268 / 500 Steps) (loss=2.14944):  34%|| 268/782 [06:35<09:02,  1.06s/it]
Training (269 / 500 Steps) (loss=1.84228):  34%|| 268/782 [06:36<09:02,  1.06s/it]
Training (269 / 500 Steps) (loss=1.84228):  34%|| 269/782 [06:36<09:01,  1.06s/it]
Training (270 / 500 Steps) (loss=2.16035):  34%|| 269/782 [06:37<09:01,  1.06s/it]
Training (270 / 500 Steps) (loss=2.16035):  35%|| 270/782 [06:37<09:00,  1.06s/it]
Training (271 / 500 Steps) (loss=1.96049):  35%|| 270/782 [06:38<09:00,  1.06s/it]
Training (271 / 500 Steps) (loss=1.96049):  35%|| 271/782 [06:38<08:59,  1.06s/it]
Training (272 / 500 Steps) (loss=1.91395):  35%|| 271/782 [06:39<08:59,  1.06s/it]
Training (272 / 500 Steps) (loss=1.91395):  35%|| 272/782 [06:39<08:57,  1.05s/it]
Training (273 / 500 Steps) (loss=2.14400):  35%|| 272/782 [06:40<08:57,  1.05s/it]
Training (273 / 500 Steps) (loss=2.14400):  35%|| 273/782 [06:40<08:57,  1.06s/it]
Training (274 / 500 Steps) (loss=2.13317):  35%|| 273/782 [06:41<08:57,  1.06s/it]
Training (274 / 500 Steps) (loss=2.13317):  35%|| 274/782 [06:41<08:56,  1.06s/it]
Training (275 / 500 Steps) (loss=2.13384):  35%|| 274/782 [06:42<08:56,  1.06s/it]
Training (275 / 500 Steps) (loss=2.13384):  35%|| 275/782 [06:42<08:55,  1.06s/it]
Training (276 / 500 Steps) (loss=1.95855):  35%|| 275/782 [06:43<08:55,  1.06s/it]
Training (276 / 500 Steps) (loss=1.95855):  35%|| 276/782 [06:43<08:54,  1.06s/it]
Training (277 / 500 Steps) (loss=1.90718):  35%|| 276/782 [06:44<08:54,  1.06s/it]
Training (277 / 500 Steps) (loss=1.90718):  35%|| 277/782 [06:44<08:53,  1.06s/it]
Training (278 / 500 Steps) (loss=2.04846):  35%|| 277/782 [06:45<08:53,  1.06s/it]
Training (278 / 500 Steps) (loss=2.04846):  36%|| 278/782 [06:45<08:51,  1.06s/it]
Training (279 / 500 Steps) (loss=2.01744):  36%|| 278/782 [06:46<08:51,  1.06s/it]
Training (279 / 500 Steps) (loss=2.01744):  36%|| 279/782 [06:46<08:51,  1.06s/it]
Training (280 / 500 Steps) (loss=1.97312):  36%|| 279/782 [06:47<08:51,  1.06s/it]
Training (280 / 500 Steps) (loss=1.97312):  36%|| 280/782 [06:47<08:50,  1.06s/it]
Training (281 / 500 Steps) (loss=1.93115):  36%|| 280/782 [06:49<08:50,  1.06s/it]
Training (281 / 500 Steps) (loss=1.93115):  36%|| 281/782 [06:49<08:48,  1.06s/it]
Training (282 / 500 Steps) (loss=1.88073):  36%|| 281/782 [06:50<08:48,  1.06s/it]
Training (282 / 500 Steps) (loss=1.88073):  36%|| 282/782 [06:50<08:47,  1.05s/it]
Training (283 / 500 Steps) (loss=1.84949):  36%|| 282/782 [06:51<08:47,  1.05s/it]
Training (283 / 500 Steps) (loss=1.84949):  36%|| 283/782 [06:51<08:47,  1.06s/it]
Training (284 / 500 Steps) (loss=1.94354):  36%|| 283/782 [06:52<08:47,  1.06s/it]
Training (284 / 500 Steps) (loss=1.94354):  36%|| 284/782 [06:52<08:47,  1.06s/it]
Training (285 / 500 Steps) (loss=1.89749):  36%|| 284/782 [06:53<08:47,  1.06s/it]
Training (285 / 500 Steps) (loss=1.89749):  36%|| 285/782 [06:53<08:45,  1.06s/it]
Training (286 / 500 Steps) (loss=1.90503):  36%|| 285/782 [06:54<08:45,  1.06s/it]
Training (286 / 500 Steps) (loss=1.90503):  37%|| 286/782 [06:54<08:43,  1.06s/it]
Training (287 / 500 Steps) (loss=1.85695):  37%|| 286/782 [06:55<08:43,  1.06s/it]
Training (287 / 500 Steps) (loss=1.85695):  37%|| 287/782 [06:55<08:43,  1.06s/it]
Training (288 / 500 Steps) (loss=1.85663):  37%|| 287/782 [06:56<08:43,  1.06s/it]
Training (288 / 500 Steps) (loss=1.85663):  37%|| 288/782 [06:56<08:41,  1.06s/it]
Training (289 / 500 Steps) (loss=1.91940):  37%|| 288/782 [06:57<08:41,  1.06s/it]
Training (289 / 500 Steps) (loss=1.91940):  37%|| 289/782 [06:57<08:39,  1.05s/it]
Training (290 / 500 Steps) (loss=2.10429):  37%|| 289/782 [06:58<08:39,  1.05s/it]
Training (290 / 500 Steps) (loss=2.10429):  37%|| 290/782 [06:58<08:39,  1.05s/it]
Training (291 / 500 Steps) (loss=1.95231):  37%|| 290/782 [06:59<08:39,  1.05s/it]
Training (291 / 500 Steps) (loss=1.95231):  37%|| 291/782 [06:59<08:41,  1.06s/it]
Training (292 / 500 Steps) (loss=2.22323):  37%|| 291/782 [07:00<08:41,  1.06s/it]
Training (292 / 500 Steps) (loss=2.22323):  37%|| 292/782 [07:00<08:39,  1.06s/it]
Training (293 / 500 Steps) (loss=1.92913):  37%|| 292/782 [07:01<08:39,  1.06s/it]
Training (293 / 500 Steps) (loss=1.92913):  37%|| 293/782 [07:01<08:36,  1.06s/it]
Training (294 / 500 Steps) (loss=1.99205):  37%|| 293/782 [07:02<08:36,  1.06s/it]
Training (294 / 500 Steps) (loss=1.99205):  38%|| 294/782 [07:02<08:34,  1.05s/it]
Training (295 / 500 Steps) (loss=1.92366):  38%|| 294/782 [07:03<08:34,  1.05s/it]
Training (295 / 500 Steps) (loss=1.92366):  38%|| 295/782 [07:03<08:33,  1.05s/it]
Training (296 / 500 Steps) (loss=1.79736):  38%|| 295/782 [07:04<08:33,  1.05s/it]
Training (296 / 500 Steps) (loss=1.79736):  38%|| 296/782 [07:04<08:33,  1.06s/it]
Training (297 / 500 Steps) (loss=1.98151):  38%|| 296/782 [07:05<08:33,  1.06s/it]
Training (297 / 500 Steps) (loss=1.98151):  38%|| 297/782 [07:05<08:31,  1.05s/it]
Training (298 / 500 Steps) (loss=2.01004):  38%|| 297/782 [07:06<08:31,  1.05s/it]
Training (298 / 500 Steps) (loss=2.01004):  38%|| 298/782 [07:06<08:30,  1.05s/it]
Training (299 / 500 Steps) (loss=1.97227):  38%|| 298/782 [07:08<08:30,  1.05s/it]
Training (299 / 500 Steps) (loss=1.97227):  38%|| 299/782 [07:08<08:29,  1.06s/it]
Training (300 / 500 Steps) (loss=1.95520):  38%|| 299/782 [07:09<08:29,  1.06s/it]
10/17/2022 06:27:08 - INFO - __main__ - ***** Running Validation *****
10/17/2022 06:27:08 - INFO - __main__ -   Num steps = 157
10/17/2022 06:27:08 - INFO - __main__ -   Batch size = 64


Validating... (loss=X.X):   0%|| 0/157 [00:00<?, ?it/s]

Validating... (loss=1.74051):   0%|| 0/157 [00:00<?, ?it/s]

Validating... (loss=1.74051):   1%|| 1/157 [00:00<02:35,  1.00it/s]

Validating... (loss=1.94550):   1%|| 1/157 [00:01<02:35,  1.00it/s]

Validating... (loss=1.94550):   1%|| 2/157 [00:01<01:32,  1.67it/s]

Validating... (loss=1.91521):   1%|| 2/157 [00:01<01:32,  1.67it/s]

Validating... (loss=1.91521):   2%|| 3/157 [00:01<01:12,  2.13it/s]

Validating... (loss=1.93796):   2%|| 3/157 [00:01<01:12,  2.13it/s]

Validating... (loss=1.93796):   3%|| 4/157 [00:01<01:02,  2.44it/s]

Validating... (loss=1.80326):   3%|| 4/157 [00:02<01:02,  2.44it/s]

Validating... (loss=1.80326):   3%|| 5/157 [00:02<00:57,  2.66it/s]

Validating... (loss=2.08818):   3%|| 5/157 [00:02<00:57,  2.66it/s]

Validating... (loss=2.08818):   4%|| 6/157 [00:02<00:53,  2.81it/s]

Validating... (loss=1.93346):   4%|| 6/157 [00:02<00:53,  2.81it/s]

Validating... (loss=1.93346):   4%|| 7/157 [00:02<00:51,  2.91it/s]

Validating... (loss=1.84548):   4%|| 7/157 [00:03<00:51,  2.91it/s]

Validating... (loss=1.84548):   5%|| 8/157 [00:03<00:49,  2.98it/s]

Validating... (loss=1.76575):   5%|| 8/157 [00:03<00:49,  2.98it/s]

Validating... (loss=1.76575):   6%|| 9/157 [00:03<00:48,  3.03it/s]

Validating... (loss=1.85976):   6%|| 9/157 [00:03<00:48,  3.03it/s]

Validating... (loss=1.85976):   6%|| 10/157 [00:03<00:47,  3.07it/s]

Validating... (loss=2.12798):   6%|| 10/157 [00:04<00:47,  3.07it/s]

Validating... (loss=2.12798):   7%|| 11/157 [00:04<00:47,  3.10it/s]

Validating... (loss=2.11478):   7%|| 11/157 [00:04<00:47,  3.10it/s]

Validating... (loss=2.11478):   8%|| 12/157 [00:04<00:46,  3.12it/s]

Validating... (loss=1.96772):   8%|| 12/157 [00:04<00:46,  3.12it/s]

Validating... (loss=1.96772):   8%|| 13/157 [00:04<00:46,  3.13it/s]

Validating... (loss=1.70579):   8%|| 13/157 [00:05<00:46,  3.13it/s]

Validating... (loss=1.70579):   9%|| 14/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.85668):   9%|| 14/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.85668):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=2.01628):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=2.01628):  10%|| 16/157 [00:05<00:44,  3.15it/s]

Validating... (loss=1.95797):  10%|| 16/157 [00:06<00:44,  3.15it/s]

Validating... (loss=1.95797):  11%|| 17/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.02704):  11%|| 17/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.02704):  11%|| 18/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.06031):  11%|| 18/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.06031):  12%|| 19/157 [00:06<00:43,  3.16it/s]

Validating... (loss=2.02338):  12%|| 19/157 [00:07<00:43,  3.16it/s]

Validating... (loss=2.02338):  13%|| 20/157 [00:07<00:43,  3.16it/s]

Validating... (loss=1.89929):  13%|| 20/157 [00:07<00:43,  3.16it/s]

Validating... (loss=1.89929):  13%|| 21/157 [00:07<00:43,  3.16it/s]

Validating... (loss=1.80500):  13%|| 21/157 [00:07<00:43,  3.16it/s]

Validating... (loss=1.80500):  14%|| 22/157 [00:07<00:42,  3.16it/s]

Validating... (loss=1.89748):  14%|| 22/157 [00:07<00:42,  3.16it/s]

Validating... (loss=1.89748):  15%|| 23/157 [00:07<00:42,  3.15it/s]

Validating... (loss=1.98123):  15%|| 23/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.98123):  15%|| 24/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.85197):  15%|| 24/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.85197):  16%|| 25/157 [00:08<00:41,  3.15it/s]

Validating... (loss=1.93344):  16%|| 25/157 [00:08<00:41,  3.15it/s]

Validating... (loss=1.93344):  17%|| 26/157 [00:08<00:41,  3.15it/s]

Validating... (loss=1.80806):  17%|| 26/157 [00:09<00:41,  3.15it/s]

Validating... (loss=1.80806):  17%|| 27/157 [00:09<00:41,  3.15it/s]

Validating... (loss=1.76988):  17%|| 27/157 [00:09<00:41,  3.15it/s]

Validating... (loss=1.76988):  18%|| 28/157 [00:09<00:40,  3.15it/s]

Validating... (loss=1.91612):  18%|| 28/157 [00:09<00:40,  3.15it/s]

Validating... (loss=1.91612):  18%|| 29/157 [00:09<00:40,  3.15it/s]

Validating... (loss=1.95412):  18%|| 29/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.95412):  19%|| 30/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.92645):  19%|| 30/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.92645):  20%|| 31/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.77091):  20%|| 31/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.77091):  20%|| 32/157 [00:10<00:39,  3.14it/s]

Validating... (loss=1.85544):  20%|| 32/157 [00:11<00:39,  3.14it/s]

Validating... (loss=1.85544):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.77719):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.77719):  22%|| 34/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.84026):  22%|| 34/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.84026):  22%|| 35/157 [00:11<00:38,  3.15it/s]

Validating... (loss=1.90637):  22%|| 35/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.90637):  23%|| 36/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.88216):  23%|| 36/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.88216):  24%|| 37/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.87932):  24%|| 37/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.87932):  24%|| 38/157 [00:12<00:37,  3.15it/s]

Validating... (loss=1.70579):  24%|| 38/157 [00:13<00:37,  3.15it/s]

Validating... (loss=1.70579):  25%|| 39/157 [00:13<00:37,  3.15it/s]

Validating... (loss=2.12970):  25%|| 39/157 [00:13<00:37,  3.15it/s]

Validating... (loss=2.12970):  25%|| 40/157 [00:13<00:37,  3.15it/s]

Validating... (loss=1.93897):  25%|| 40/157 [00:13<00:37,  3.15it/s]

Validating... (loss=1.93897):  26%|| 41/157 [00:13<00:37,  3.09it/s]

Validating... (loss=1.92836):  26%|| 41/157 [00:14<00:37,  3.09it/s]

Validating... (loss=1.92836):  27%|| 42/157 [00:14<00:36,  3.11it/s]

Validating... (loss=1.82009):  27%|| 42/157 [00:14<00:36,  3.11it/s]

Validating... (loss=1.82009):  27%|| 43/157 [00:14<00:36,  3.12it/s]

Validating... (loss=1.93455):  27%|| 43/157 [00:14<00:36,  3.12it/s]

Validating... (loss=1.93455):  28%|| 44/157 [00:14<00:36,  3.13it/s]

Validating... (loss=1.92452):  28%|| 44/157 [00:14<00:36,  3.13it/s]

Validating... (loss=1.92452):  29%|| 45/157 [00:14<00:35,  3.14it/s]

Validating... (loss=1.93601):  29%|| 45/157 [00:15<00:35,  3.14it/s]

Validating... (loss=1.93601):  29%|| 46/157 [00:15<00:35,  3.13it/s]

Validating... (loss=1.69015):  29%|| 46/157 [00:15<00:35,  3.13it/s]

Validating... (loss=1.69015):  30%|| 47/157 [00:15<00:35,  3.14it/s]

Validating... (loss=1.99590):  30%|| 47/157 [00:15<00:35,  3.14it/s]

Validating... (loss=1.99590):  31%|| 48/157 [00:15<00:34,  3.14it/s]

Validating... (loss=1.96674):  31%|| 48/157 [00:16<00:34,  3.14it/s]

Validating... (loss=1.96674):  31%|| 49/157 [00:16<00:34,  3.15it/s]

Validating... (loss=1.86901):  31%|| 49/157 [00:16<00:34,  3.15it/s]

Validating... (loss=1.86901):  32%|| 50/157 [00:16<00:33,  3.15it/s]

Validating... (loss=2.01921):  32%|| 50/157 [00:16<00:33,  3.15it/s]

Validating... (loss=2.01921):  32%|| 51/157 [00:16<00:33,  3.14it/s]

Validating... (loss=1.77837):  32%|| 51/157 [00:17<00:33,  3.14it/s]

Validating... (loss=1.77837):  33%|| 52/157 [00:17<00:33,  3.14it/s]

Validating... (loss=1.89053):  33%|| 52/157 [00:17<00:33,  3.14it/s]

Validating... (loss=1.89053):  34%|| 53/157 [00:17<00:33,  3.14it/s]

Validating... (loss=1.96314):  34%|| 53/157 [00:17<00:33,  3.14it/s]

Validating... (loss=1.96314):  34%|| 54/157 [00:17<00:32,  3.15it/s]

Validating... (loss=1.89796):  34%|| 54/157 [00:18<00:32,  3.15it/s]

Validating... (loss=1.89796):  35%|| 55/157 [00:18<00:32,  3.15it/s]

Validating... (loss=1.83964):  35%|| 55/157 [00:18<00:32,  3.15it/s]

Validating... (loss=1.83964):  36%|| 56/157 [00:18<00:32,  3.14it/s]

Validating... (loss=2.00765):  36%|| 56/157 [00:18<00:32,  3.14it/s]

Validating... (loss=2.00765):  36%|| 57/157 [00:18<00:31,  3.14it/s]

Validating... (loss=1.86412):  36%|| 57/157 [00:19<00:31,  3.14it/s]

Validating... (loss=1.86412):  37%|| 58/157 [00:19<00:31,  3.14it/s]

Validating... (loss=1.82168):  37%|| 58/157 [00:19<00:31,  3.14it/s]

Validating... (loss=1.82168):  38%|| 59/157 [00:19<00:31,  3.13it/s]

Validating... (loss=1.81249):  38%|| 59/157 [00:19<00:31,  3.13it/s]

Validating... (loss=1.81249):  38%|| 60/157 [00:19<00:30,  3.13it/s]

Validating... (loss=1.90982):  38%|| 60/157 [00:20<00:30,  3.13it/s]

Validating... (loss=1.90982):  39%|| 61/157 [00:20<00:30,  3.14it/s]

Validating... (loss=1.84437):  39%|| 61/157 [00:20<00:30,  3.14it/s]

Validating... (loss=1.84437):  39%|| 62/157 [00:20<00:30,  3.14it/s]

Validating... (loss=2.04677):  39%|| 62/157 [00:20<00:30,  3.14it/s]

Validating... (loss=2.04677):  40%|| 63/157 [00:20<00:29,  3.15it/s]

Validating... (loss=1.90773):  40%|| 63/157 [00:21<00:29,  3.15it/s]

Validating... (loss=1.90773):  41%|| 64/157 [00:21<00:29,  3.15it/s]

Validating... (loss=1.99470):  41%|| 64/157 [00:21<00:29,  3.15it/s]

Validating... (loss=1.99470):  41%|| 65/157 [00:21<00:29,  3.14it/s]

Validating... (loss=1.98250):  41%|| 65/157 [00:21<00:29,  3.14it/s]

Validating... (loss=1.98250):  42%|| 66/157 [00:21<00:28,  3.14it/s]

Validating... (loss=1.85633):  42%|| 66/157 [00:21<00:28,  3.14it/s]

Validating... (loss=1.85633):  43%|| 67/157 [00:21<00:28,  3.14it/s]

Validating... (loss=1.96337):  43%|| 67/157 [00:22<00:28,  3.14it/s]

Validating... (loss=1.96337):  43%|| 68/157 [00:22<00:28,  3.15it/s]

Validating... (loss=1.73684):  43%|| 68/157 [00:22<00:28,  3.15it/s]

Validating... (loss=1.73684):  44%|| 69/157 [00:22<00:27,  3.15it/s]

Validating... (loss=1.77984):  44%|| 69/157 [00:22<00:27,  3.15it/s]

Validating... (loss=1.77984):  45%|| 70/157 [00:22<00:27,  3.15it/s]

Validating... (loss=1.92817):  45%|| 70/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.92817):  45%|| 71/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.95751):  45%|| 71/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.95751):  46%|| 72/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.89996):  46%|| 72/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.89996):  46%|| 73/157 [00:23<00:26,  3.14it/s]

Validating... (loss=1.86465):  46%|| 73/157 [00:24<00:26,  3.14it/s]

Validating... (loss=1.86465):  47%|| 74/157 [00:24<00:26,  3.13it/s]

Validating... (loss=2.09248):  47%|| 74/157 [00:24<00:26,  3.13it/s]

Validating... (loss=2.09248):  48%|| 75/157 [00:24<00:26,  3.12it/s]

Validating... (loss=1.71068):  48%|| 75/157 [00:24<00:26,  3.12it/s]

Validating... (loss=1.71068):  48%|| 76/157 [00:24<00:25,  3.13it/s]

Validating... (loss=1.85470):  48%|| 76/157 [00:25<00:25,  3.13it/s]

Validating... (loss=1.85470):  49%|| 77/157 [00:25<00:25,  3.14it/s]

Validating... (loss=1.77784):  49%|| 77/157 [00:25<00:25,  3.14it/s]

Validating... (loss=1.77784):  50%|| 78/157 [00:25<00:25,  3.14it/s]

Validating... (loss=1.88033):  50%|| 78/157 [00:25<00:25,  3.14it/s]

Validating... (loss=1.88033):  50%|| 79/157 [00:25<00:24,  3.15it/s]

Validating... (loss=1.81123):  50%|| 79/157 [00:26<00:24,  3.15it/s]

Validating... (loss=1.81123):  51%|| 80/157 [00:26<00:24,  3.15it/s]

Validating... (loss=1.84143):  51%|| 80/157 [00:26<00:24,  3.15it/s]

Validating... (loss=1.84143):  52%|| 81/157 [00:26<00:24,  3.15it/s]

Validating... (loss=2.09802):  52%|| 81/157 [00:26<00:24,  3.15it/s]

Validating... (loss=2.09802):  52%|| 82/157 [00:26<00:23,  3.15it/s]

Validating... (loss=1.78890):  52%|| 82/157 [00:27<00:23,  3.15it/s]

Validating... (loss=1.78890):  53%|| 83/157 [00:27<00:23,  3.15it/s]

Validating... (loss=2.15170):  53%|| 83/157 [00:27<00:23,  3.15it/s]

Validating... (loss=2.15170):  54%|| 84/157 [00:27<00:23,  3.15it/s]

Validating... (loss=2.07753):  54%|| 84/157 [00:27<00:23,  3.15it/s]

Validating... (loss=2.07753):  54%|| 85/157 [00:27<00:22,  3.14it/s]

Validating... (loss=1.90614):  54%|| 85/157 [00:28<00:22,  3.14it/s]

Validating... (loss=1.90614):  55%|| 86/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.05686):  55%|| 86/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.05686):  55%|| 87/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.07404):  55%|| 87/157 [00:28<00:22,  3.15it/s]

Validating... (loss=2.07404):  56%|| 88/157 [00:28<00:21,  3.15it/s]

Validating... (loss=1.88027):  56%|| 88/157 [00:28<00:21,  3.15it/s]

Validating... (loss=1.88027):  57%|| 89/157 [00:28<00:21,  3.15it/s]

Validating... (loss=1.64836):  57%|| 89/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.64836):  57%|| 90/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.82254):  57%|| 90/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.82254):  58%|| 91/157 [00:29<00:20,  3.15it/s]

Validating... (loss=1.82882):  58%|| 91/157 [00:29<00:20,  3.15it/s]

Validating... (loss=1.82882):  59%|| 92/157 [00:29<00:20,  3.15it/s]

Validating... (loss=1.89135):  59%|| 92/157 [00:30<00:20,  3.15it/s]

Validating... (loss=1.89135):  59%|| 93/157 [00:30<00:20,  3.16it/s]

Validating... (loss=1.94459):  59%|| 93/157 [00:30<00:20,  3.16it/s]

Validating... (loss=1.94459):  60%|| 94/157 [00:30<00:19,  3.15it/s]

Validating... (loss=1.83353):  60%|| 94/157 [00:30<00:19,  3.15it/s]

Validating... (loss=1.83353):  61%|| 95/157 [00:30<00:19,  3.15it/s]

Validating... (loss=1.94608):  61%|| 95/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.94608):  61%|| 96/157 [00:31<00:19,  3.14it/s]

Validating... (loss=1.86277):  61%|| 96/157 [00:31<00:19,  3.14it/s]

Validating... (loss=1.86277):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.77488):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.77488):  62%|| 98/157 [00:31<00:18,  3.15it/s]

Validating... (loss=1.74892):  62%|| 98/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.74892):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.81646):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.81646):  64%|| 100/157 [00:32<00:18,  3.15it/s]

Validating... (loss=2.03450):  64%|| 100/157 [00:32<00:18,  3.15it/s]

Validating... (loss=2.03450):  64%|| 101/157 [00:32<00:17,  3.15it/s]

Validating... (loss=1.93874):  64%|| 101/157 [00:33<00:17,  3.15it/s]

Validating... (loss=1.93874):  65%|| 102/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.02325):  65%|| 102/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.02325):  66%|| 103/157 [00:33<00:17,  3.15it/s]

Validating... (loss=1.92673):  66%|| 103/157 [00:33<00:17,  3.15it/s]

Validating... (loss=1.92673):  66%|| 104/157 [00:33<00:16,  3.15it/s]

Validating... (loss=1.90154):  66%|| 104/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.90154):  67%|| 105/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.86568):  67%|| 105/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.86568):  68%|| 106/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.94116):  68%|| 106/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.94116):  68%|| 107/157 [00:34<00:15,  3.15it/s]

Validating... (loss=1.79316):  68%|| 107/157 [00:34<00:15,  3.15it/s]

Validating... (loss=1.79316):  69%|| 108/157 [00:35<00:15,  3.15it/s]

Validating... (loss=1.99008):  69%|| 108/157 [00:35<00:15,  3.15it/s]

Validating... (loss=1.99008):  69%|| 109/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.11913):  69%|| 109/157 [00:35<00:15,  3.15it/s]

Validating... (loss=2.11913):  70%|| 110/157 [00:35<00:14,  3.15it/s]

Validating... (loss=1.90226):  70%|| 110/157 [00:35<00:14,  3.15it/s]

Validating... (loss=1.90226):  71%|| 111/157 [00:35<00:14,  3.15it/s]

Validating... (loss=1.76940):  71%|| 111/157 [00:36<00:14,  3.15it/s]

Validating... (loss=1.76940):  71%|| 112/157 [00:36<00:14,  3.16it/s]

Validating... (loss=1.95462):  71%|| 112/157 [00:36<00:14,  3.16it/s]

Validating... (loss=1.95462):  72%|| 113/157 [00:36<00:13,  3.16it/s]

Validating... (loss=1.75250):  72%|| 113/157 [00:36<00:13,  3.16it/s]

Validating... (loss=1.75250):  73%|| 114/157 [00:36<00:13,  3.16it/s]

Validating... (loss=1.87371):  73%|| 114/157 [00:37<00:13,  3.16it/s]

Validating... (loss=1.87371):  73%|| 115/157 [00:37<00:13,  3.16it/s]

Validating... (loss=1.96445):  73%|| 115/157 [00:37<00:13,  3.16it/s]

Validating... (loss=1.96445):  74%|| 116/157 [00:37<00:12,  3.16it/s]

Validating... (loss=1.84916):  74%|| 116/157 [00:37<00:12,  3.16it/s]

Validating... (loss=1.84916):  75%|| 117/157 [00:37<00:12,  3.16it/s]

Validating... (loss=1.85318):  75%|| 117/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.85318):  75%|| 118/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.88683):  75%|| 118/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.88683):  76%|| 119/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.92504):  76%|| 119/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.92504):  76%|| 120/157 [00:38<00:11,  3.15it/s]

Validating... (loss=1.68969):  76%|| 120/157 [00:39<00:11,  3.15it/s]

Validating... (loss=1.68969):  77%|| 121/157 [00:39<00:11,  3.13it/s]

Validating... (loss=1.79679):  77%|| 121/157 [00:39<00:11,  3.13it/s]

Validating... (loss=1.79679):  78%|| 122/157 [00:39<00:11,  3.13it/s]

Validating... (loss=2.00427):  78%|| 122/157 [00:39<00:11,  3.13it/s]

Validating... (loss=2.00427):  78%|| 123/157 [00:39<00:10,  3.13it/s]

Validating... (loss=1.83826):  78%|| 123/157 [00:40<00:10,  3.13it/s]

Validating... (loss=1.83826):  79%|| 124/157 [00:40<00:10,  3.14it/s]

Validating... (loss=2.14827):  79%|| 124/157 [00:40<00:10,  3.14it/s]

Validating... (loss=2.14827):  80%|| 125/157 [00:40<00:10,  3.13it/s]

Validating... (loss=1.90138):  80%|| 125/157 [00:40<00:10,  3.13it/s]

Validating... (loss=1.90138):  80%|| 126/157 [00:40<00:09,  3.13it/s]

Validating... (loss=1.97920):  80%|| 126/157 [00:41<00:09,  3.13it/s]

Validating... (loss=1.97920):  81%|| 127/157 [00:41<00:09,  3.14it/s]

Validating... (loss=2.02901):  81%|| 127/157 [00:41<00:09,  3.14it/s]

Validating... (loss=2.02901):  82%|| 128/157 [00:41<00:09,  3.14it/s]

Validating... (loss=1.99525):  82%|| 128/157 [00:41<00:09,  3.14it/s]

Validating... (loss=1.99525):  82%|| 129/157 [00:41<00:08,  3.14it/s]

Validating... (loss=2.08761):  82%|| 129/157 [00:41<00:08,  3.14it/s]

Validating... (loss=2.08761):  83%|| 130/157 [00:41<00:08,  3.15it/s]

Validating... (loss=2.02047):  83%|| 130/157 [00:42<00:08,  3.15it/s]

Validating... (loss=2.02047):  83%|| 131/157 [00:42<00:08,  3.15it/s]

Validating... (loss=1.94343):  83%|| 131/157 [00:42<00:08,  3.15it/s]

Validating... (loss=1.94343):  84%|| 132/157 [00:42<00:07,  3.15it/s]

Validating... (loss=1.87227):  84%|| 132/157 [00:42<00:07,  3.15it/s]

Validating... (loss=1.87227):  85%|| 133/157 [00:42<00:07,  3.15it/s]

Validating... (loss=2.04350):  85%|| 133/157 [00:43<00:07,  3.15it/s]

Validating... (loss=2.04350):  85%|| 134/157 [00:43<00:07,  3.13it/s]

Validating... (loss=2.01108):  85%|| 134/157 [00:43<00:07,  3.13it/s]

Validating... (loss=2.01108):  86%|| 135/157 [00:43<00:07,  3.14it/s]

Validating... (loss=2.01507):  86%|| 135/157 [00:43<00:07,  3.14it/s]

Validating... (loss=2.01507):  87%|| 136/157 [00:43<00:06,  3.14it/s]

Validating... (loss=1.90596):  87%|| 136/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.90596):  87%|| 137/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.91750):  87%|| 137/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.91750):  88%|| 138/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.95017):  88%|| 138/157 [00:44<00:06,  3.14it/s]

Validating... (loss=1.95017):  89%|| 139/157 [00:44<00:05,  3.12it/s]

Validating... (loss=1.74848):  89%|| 139/157 [00:45<00:05,  3.12it/s]

Validating... (loss=1.74848):  89%|| 140/157 [00:45<00:05,  3.12it/s]

Validating... (loss=1.90873):  89%|| 140/157 [00:45<00:05,  3.12it/s]

Validating... (loss=1.90873):  90%|| 141/157 [00:45<00:05,  3.13it/s]

Validating... (loss=1.81075):  90%|| 141/157 [00:45<00:05,  3.13it/s]

Validating... (loss=1.81075):  90%|| 142/157 [00:45<00:04,  3.13it/s]

Validating... (loss=1.89650):  90%|| 142/157 [00:46<00:04,  3.13it/s]

Validating... (loss=1.89650):  91%|| 143/157 [00:46<00:04,  3.14it/s]

Validating... (loss=1.78574):  91%|| 143/157 [00:46<00:04,  3.14it/s]

Validating... (loss=1.78574):  92%|| 144/157 [00:46<00:04,  3.14it/s]

Validating... (loss=1.80532):  92%|| 144/157 [00:46<00:04,  3.14it/s]

Validating... (loss=1.80532):  92%|| 145/157 [00:46<00:03,  3.14it/s]

Validating... (loss=2.03257):  92%|| 145/157 [00:47<00:03,  3.14it/s]

Validating... (loss=2.03257):  93%|| 146/157 [00:47<00:03,  3.14it/s]

Validating... (loss=1.93141):  93%|| 146/157 [00:47<00:03,  3.14it/s]

Validating... (loss=1.93141):  94%|| 147/157 [00:47<00:03,  3.14it/s]

Validating... (loss=1.88460):  94%|| 147/157 [00:47<00:03,  3.14it/s]

Validating... (loss=1.88460):  94%|| 148/157 [00:47<00:02,  3.13it/s]

Validating... (loss=1.91514):  94%|| 148/157 [00:48<00:02,  3.13it/s]

Validating... (loss=1.91514):  95%|| 149/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.80398):  95%|| 149/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.80398):  96%|| 150/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.82532):  96%|| 150/157 [00:48<00:02,  3.14it/s]

Validating... (loss=1.82532):  96%|| 151/157 [00:48<00:01,  3.15it/s]

Validating... (loss=1.75397):  96%|| 151/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.75397):  97%|| 152/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.91018):  97%|| 152/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.91018):  97%|| 153/157 [00:49<00:01,  3.13it/s]

Validating... (loss=1.90276):  97%|| 153/157 [00:49<00:01,  3.13it/s]

Validating... (loss=1.90276):  98%|| 154/157 [00:49<00:00,  3.13it/s]

Validating... (loss=2.03853):  98%|| 154/157 [00:49<00:00,  3.13it/s]

Validating... (loss=2.03853):  99%|| 155/157 [00:49<00:00,  3.14it/s]

Validating... (loss=1.80498):  99%|| 155/157 [00:50<00:00,  3.14it/s]

Validating... (loss=1.80498):  99%|| 156/157 [00:50<00:00,  3.14it/s]

Validating... (loss=1.94980):  99%|| 156/157 [00:50<00:00,  3.14it/s]
Validating... (loss=1.94980): 100%|| 157/157 [00:50<00:00,  3.10it/s]
10/17/2022 06:27:59 - INFO - __main__ - 

10/17/2022 06:27:59 - INFO - __main__ - Validation Results
10/17/2022 06:27:59 - INFO - __main__ - Global Steps: 300
10/17/2022 06:27:59 - INFO - __main__ - Valid Loss: 1.90445
10/17/2022 06:27:59 - INFO - __main__ - Valid Accuracy: 0.29210
10/17/2022 06:27:59 - INFO - __main__ - Saved model checkpoint to [DIR: output]

Training (300 / 500 Steps) (loss=1.95520):  38%|| 300/782 [08:00<2:11:53, 16.42s/it]
Training (301 / 500 Steps) (loss=1.97980):  38%|| 300/782 [08:01<2:11:53, 16.42s/it]
Training (301 / 500 Steps) (loss=1.97980):  38%|| 301/782 [08:01<1:34:40, 11.81s/it]
Training (302 / 500 Steps) (loss=1.96988):  38%|| 301/782 [08:02<1:34:40, 11.81s/it]
Training (302 / 500 Steps) (loss=1.96988):  39%|| 302/782 [08:02<1:08:40,  8.58s/it]
Training (303 / 500 Steps) (loss=1.97272):  39%|| 302/782 [08:03<1:08:40,  8.58s/it]
Training (303 / 500 Steps) (loss=1.97272):  39%|| 303/782 [08:03<50:29,  6.32s/it]
Training (304 / 500 Steps) (loss=2.08011):  39%|| 303/782 [08:04<50:29,  6.32s/it]
Training (304 / 500 Steps) (loss=2.08011):  39%|| 304/782 [08:04<37:48,  4.75s/it]
Training (305 / 500 Steps) (loss=2.07634):  39%|| 304/782 [08:05<37:48,  4.75s/it]
Training (305 / 500 Steps) (loss=2.07634):  39%|| 305/782 [08:05<28:56,  3.64s/it]
Training (306 / 500 Steps) (loss=2.02154):  39%|| 305/782 [08:06<28:56,  3.64s/it]
Training (306 / 500 Steps) (loss=2.02154):  39%|| 306/782 [08:06<22:43,  2.86s/it]
Training (307 / 500 Steps) (loss=1.94040):  39%|| 306/782 [08:07<22:43,  2.86s/it]
Training (307 / 500 Steps) (loss=1.94040):  39%|| 307/782 [08:07<18:23,  2.32s/it]
Training (308 / 500 Steps) (loss=2.10281):  39%|| 307/782 [08:08<18:23,  2.32s/it]
Training (308 / 500 Steps) (loss=2.10281):  39%|| 308/782 [08:08<15:20,  1.94s/it]
Training (309 / 500 Steps) (loss=1.78979):  39%|| 308/782 [08:09<15:20,  1.94s/it]
Training (309 / 500 Steps) (loss=1.78979):  40%|| 309/782 [08:09<13:13,  1.68s/it]
Training (310 / 500 Steps) (loss=2.01700):  40%|| 309/782 [08:10<13:13,  1.68s/it]
Training (310 / 500 Steps) (loss=2.01700):  40%|| 310/782 [08:10<11:44,  1.49s/it]
Training (311 / 500 Steps) (loss=1.89591):  40%|| 310/782 [08:11<11:44,  1.49s/it]
Training (311 / 500 Steps) (loss=1.89591):  40%|| 311/782 [08:11<10:41,  1.36s/it]
Training (312 / 500 Steps) (loss=2.05680):  40%|| 311/782 [08:12<10:41,  1.36s/it]
Training (312 / 500 Steps) (loss=2.05680):  40%|| 312/782 [08:12<09:57,  1.27s/it]
Training (313 / 500 Steps) (loss=2.10323):  40%|| 312/782 [08:14<09:57,  1.27s/it]
Training (313 / 500 Steps) (loss=2.10323):  40%|| 313/782 [08:14<09:25,  1.21s/it]
Training (314 / 500 Steps) (loss=1.81534):  40%|| 313/782 [08:15<09:25,  1.21s/it]
Training (314 / 500 Steps) (loss=1.81534):  40%|| 314/782 [08:15<09:02,  1.16s/it]
Training (315 / 500 Steps) (loss=1.94621):  40%|| 314/782 [08:16<09:02,  1.16s/it]
Training (315 / 500 Steps) (loss=1.94621):  40%|| 315/782 [08:16<08:48,  1.13s/it]
Training (316 / 500 Steps) (loss=1.90576):  40%|| 315/782 [08:17<08:48,  1.13s/it]
Training (316 / 500 Steps) (loss=1.90576):  40%|| 316/782 [08:17<08:37,  1.11s/it]
Training (317 / 500 Steps) (loss=1.91867):  40%|| 316/782 [08:18<08:37,  1.11s/it]
Training (317 / 500 Steps) (loss=1.91867):  41%|| 317/782 [08:18<08:28,  1.09s/it]
Training (318 / 500 Steps) (loss=2.05223):  41%|| 317/782 [08:19<08:28,  1.09s/it]
Training (318 / 500 Steps) (loss=2.05223):  41%|| 318/782 [08:19<08:21,  1.08s/it]
Training (319 / 500 Steps) (loss=1.88224):  41%|| 318/782 [08:20<08:21,  1.08s/it]
Training (319 / 500 Steps) (loss=1.88224):  41%|| 319/782 [08:20<08:16,  1.07s/it]
Training (320 / 500 Steps) (loss=2.15004):  41%|| 319/782 [08:21<08:16,  1.07s/it]
Training (320 / 500 Steps) (loss=2.15004):  41%|| 320/782 [08:21<08:13,  1.07s/it]
Training (321 / 500 Steps) (loss=1.90741):  41%|| 320/782 [08:22<08:13,  1.07s/it]
Training (321 / 500 Steps) (loss=1.90741):  41%|| 321/782 [08:22<08:10,  1.07s/it]
Training (322 / 500 Steps) (loss=1.92045):  41%|| 321/782 [08:23<08:10,  1.07s/it]
Training (322 / 500 Steps) (loss=1.92045):  41%|| 322/782 [08:23<08:08,  1.06s/it]
Training (323 / 500 Steps) (loss=2.01661):  41%|| 322/782 [08:24<08:08,  1.06s/it]
Training (323 / 500 Steps) (loss=2.01661):  41%|| 323/782 [08:24<08:06,  1.06s/it]
Training (324 / 500 Steps) (loss=1.99049):  41%|| 323/782 [08:25<08:06,  1.06s/it]
Training (324 / 500 Steps) (loss=1.99049):  41%|| 324/782 [08:25<08:05,  1.06s/it]
Training (325 / 500 Steps) (loss=1.84951):  41%|| 324/782 [08:26<08:05,  1.06s/it]
Training (325 / 500 Steps) (loss=1.84951):  42%|| 325/782 [08:26<08:04,  1.06s/it]
Training (326 / 500 Steps) (loss=1.96981):  42%|| 325/782 [08:27<08:04,  1.06s/it]
Training (326 / 500 Steps) (loss=1.96981):  42%|| 326/782 [08:27<08:02,  1.06s/it]
Training (327 / 500 Steps) (loss=2.04749):  42%|| 326/782 [08:28<08:02,  1.06s/it]
Training (327 / 500 Steps) (loss=2.04749):  42%|| 327/782 [08:28<08:01,  1.06s/it]
Training (328 / 500 Steps) (loss=1.93142):  42%|| 327/782 [08:29<08:01,  1.06s/it]
Training (328 / 500 Steps) (loss=1.93142):  42%|| 328/782 [08:29<08:00,  1.06s/it]
Training (329 / 500 Steps) (loss=1.89657):  42%|| 328/782 [08:30<08:00,  1.06s/it]
Training (329 / 500 Steps) (loss=1.89657):  42%|| 329/782 [08:30<07:58,  1.06s/it]
Training (330 / 500 Steps) (loss=2.14422):  42%|| 329/782 [08:31<07:58,  1.06s/it]
Training (330 / 500 Steps) (loss=2.14422):  42%|| 330/782 [08:32<07:57,  1.06s/it]
Training (331 / 500 Steps) (loss=2.19003):  42%|| 330/782 [08:33<07:57,  1.06s/it]
Training (331 / 500 Steps) (loss=2.19003):  42%|| 331/782 [08:33<07:56,  1.06s/it]
Training (332 / 500 Steps) (loss=1.92249):  42%|| 331/782 [08:34<07:56,  1.06s/it]
Training (332 / 500 Steps) (loss=1.92249):  42%|| 332/782 [08:34<07:54,  1.05s/it]
Training (333 / 500 Steps) (loss=1.94126):  42%|| 332/782 [08:35<07:54,  1.05s/it]
Training (333 / 500 Steps) (loss=1.94126):  43%|| 333/782 [08:35<07:53,  1.06s/it]
Training (334 / 500 Steps) (loss=1.88320):  43%|| 333/782 [08:36<07:53,  1.06s/it]
Training (334 / 500 Steps) (loss=1.88320):  43%|| 334/782 [08:36<07:52,  1.05s/it]
Training (335 / 500 Steps) (loss=2.17963):  43%|| 334/782 [08:37<07:52,  1.05s/it]
Training (335 / 500 Steps) (loss=2.17963):  43%|| 335/782 [08:37<07:51,  1.05s/it]
Training (336 / 500 Steps) (loss=2.02458):  43%|| 335/782 [08:38<07:51,  1.05s/it]
Training (336 / 500 Steps) (loss=2.02458):  43%|| 336/782 [08:38<07:50,  1.06s/it]
Training (337 / 500 Steps) (loss=1.93668):  43%|| 336/782 [08:39<07:50,  1.06s/it]
Training (337 / 500 Steps) (loss=1.93668):  43%|| 337/782 [08:39<07:49,  1.05s/it]
Training (338 / 500 Steps) (loss=1.87164):  43%|| 337/782 [08:40<07:49,  1.05s/it]
Training (338 / 500 Steps) (loss=1.87164):  43%|| 338/782 [08:40<07:47,  1.05s/it]
Training (339 / 500 Steps) (loss=2.01528):  43%|| 338/782 [08:41<07:47,  1.05s/it]
Training (339 / 500 Steps) (loss=2.01528):  43%|| 339/782 [08:41<07:47,  1.06s/it]
Training (340 / 500 Steps) (loss=1.69119):  43%|| 339/782 [08:42<07:47,  1.06s/it]
Training (340 / 500 Steps) (loss=1.69119):  43%|| 340/782 [08:42<07:46,  1.06s/it]
Training (341 / 500 Steps) (loss=1.87557):  43%|| 340/782 [08:43<07:46,  1.06s/it]
Training (341 / 500 Steps) (loss=1.87557):  44%|| 341/782 [08:43<07:45,  1.05s/it]
Training (342 / 500 Steps) (loss=1.86229):  44%|| 341/782 [08:44<07:45,  1.05s/it]
Training (342 / 500 Steps) (loss=1.86229):  44%|| 342/782 [08:44<07:44,  1.06s/it]
Training (343 / 500 Steps) (loss=2.11287):  44%|| 342/782 [08:45<07:44,  1.06s/it]
Training (343 / 500 Steps) (loss=2.11287):  44%|| 343/782 [08:45<07:44,  1.06s/it]
Training (344 / 500 Steps) (loss=1.97134):  44%|| 343/782 [08:46<07:44,  1.06s/it]
Training (344 / 500 Steps) (loss=1.97134):  44%|| 344/782 [08:46<07:42,  1.06s/it]
Training (345 / 500 Steps) (loss=2.28430):  44%|| 344/782 [08:47<07:42,  1.06s/it]
Training (345 / 500 Steps) (loss=2.28430):  44%|| 345/782 [08:47<07:41,  1.06s/it]
Training (346 / 500 Steps) (loss=1.97085):  44%|| 345/782 [08:48<07:41,  1.06s/it]
Training (346 / 500 Steps) (loss=1.97085):  44%|| 346/782 [08:48<07:39,  1.05s/it]
Training (347 / 500 Steps) (loss=1.97957):  44%|| 346/782 [08:49<07:39,  1.05s/it]
Training (347 / 500 Steps) (loss=1.97957):  44%|| 347/782 [08:49<07:38,  1.05s/it]
Training (348 / 500 Steps) (loss=1.87450):  44%|| 347/782 [08:50<07:38,  1.05s/it]
Training (348 / 500 Steps) (loss=1.87450):  45%|| 348/782 [08:50<07:37,  1.06s/it]
Training (349 / 500 Steps) (loss=1.98243):  45%|| 348/782 [08:52<07:37,  1.06s/it]
Training (349 / 500 Steps) (loss=1.98243):  45%|| 349/782 [08:52<07:36,  1.06s/it]
Training (350 / 500 Steps) (loss=2.10160):  45%|| 349/782 [08:53<07:36,  1.06s/it]
Training (350 / 500 Steps) (loss=2.10160):  45%|| 350/782 [08:53<07:35,  1.05s/it]
Training (351 / 500 Steps) (loss=1.91279):  45%|| 350/782 [08:54<07:35,  1.05s/it]
Training (351 / 500 Steps) (loss=1.91279):  45%|| 351/782 [08:54<07:34,  1.06s/it]
Training (352 / 500 Steps) (loss=2.11991):  45%|| 351/782 [08:55<07:34,  1.06s/it]
Training (352 / 500 Steps) (loss=2.11991):  45%|| 352/782 [08:55<07:33,  1.06s/it]
Training (353 / 500 Steps) (loss=1.94100):  45%|| 352/782 [08:56<07:33,  1.06s/it]
Training (353 / 500 Steps) (loss=1.94100):  45%|| 353/782 [08:56<07:32,  1.05s/it]
Training (354 / 500 Steps) (loss=2.09078):  45%|| 353/782 [08:57<07:32,  1.05s/it]
Training (354 / 500 Steps) (loss=2.09078):  45%|| 354/782 [08:57<07:32,  1.06s/it]
Training (355 / 500 Steps) (loss=1.75458):  45%|| 354/782 [08:58<07:32,  1.06s/it]
Training (355 / 500 Steps) (loss=1.75458):  45%|| 355/782 [08:58<07:31,  1.06s/it]
Training (356 / 500 Steps) (loss=2.06115):  45%|| 355/782 [08:59<07:31,  1.06s/it]
Training (356 / 500 Steps) (loss=2.06115):  46%|| 356/782 [08:59<07:29,  1.06s/it]
Training (357 / 500 Steps) (loss=1.98823):  46%|| 356/782 [09:00<07:29,  1.06s/it]
Training (357 / 500 Steps) (loss=1.98823):  46%|| 357/782 [09:00<07:29,  1.06s/it]
Training (358 / 500 Steps) (loss=2.03390):  46%|| 357/782 [09:01<07:29,  1.06s/it]
Training (358 / 500 Steps) (loss=2.03390):  46%|| 358/782 [09:01<07:28,  1.06s/it]
Training (359 / 500 Steps) (loss=1.71245):  46%|| 358/782 [09:02<07:28,  1.06s/it]
Training (359 / 500 Steps) (loss=1.71245):  46%|| 359/782 [09:02<07:27,  1.06s/it]
Training (360 / 500 Steps) (loss=1.83565):  46%|| 359/782 [09:03<07:27,  1.06s/it]
Training (360 / 500 Steps) (loss=1.83565):  46%|| 360/782 [09:03<07:26,  1.06s/it]
Training (361 / 500 Steps) (loss=2.14797):  46%|| 360/782 [09:04<07:26,  1.06s/it]
Training (361 / 500 Steps) (loss=2.14797):  46%|| 361/782 [09:04<07:25,  1.06s/it]
Training (362 / 500 Steps) (loss=1.93768):  46%|| 361/782 [09:05<07:25,  1.06s/it]
Training (362 / 500 Steps) (loss=1.93768):  46%|| 362/782 [09:05<07:24,  1.06s/it]
Training (363 / 500 Steps) (loss=1.96917):  46%|| 362/782 [09:06<07:24,  1.06s/it]
Training (363 / 500 Steps) (loss=1.96917):  46%|| 363/782 [09:06<07:23,  1.06s/it]
Training (364 / 500 Steps) (loss=1.94844):  46%|| 363/782 [09:07<07:23,  1.06s/it]
Training (364 / 500 Steps) (loss=1.94844):  47%|| 364/782 [09:07<07:22,  1.06s/it]
Training (365 / 500 Steps) (loss=2.08034):  47%|| 364/782 [09:08<07:22,  1.06s/it]
Training (365 / 500 Steps) (loss=2.08034):  47%|| 365/782 [09:08<07:20,  1.06s/it]
Training (366 / 500 Steps) (loss=1.96160):  47%|| 365/782 [09:10<07:20,  1.06s/it]
Training (366 / 500 Steps) (loss=1.96160):  47%|| 366/782 [09:10<07:19,  1.06s/it]
Training (367 / 500 Steps) (loss=1.96750):  47%|| 366/782 [09:11<07:19,  1.06s/it]
Training (367 / 500 Steps) (loss=1.96750):  47%|| 367/782 [09:11<07:19,  1.06s/it]
Training (368 / 500 Steps) (loss=1.95197):  47%|| 367/782 [09:12<07:19,  1.06s/it]
Training (368 / 500 Steps) (loss=1.95197):  47%|| 368/782 [09:12<07:17,  1.06s/it]
Training (369 / 500 Steps) (loss=2.10688):  47%|| 368/782 [09:13<07:17,  1.06s/it]
Training (369 / 500 Steps) (loss=2.10688):  47%|| 369/782 [09:13<07:16,  1.06s/it]
Training (370 / 500 Steps) (loss=1.93285):  47%|| 369/782 [09:14<07:16,  1.06s/it]
Training (370 / 500 Steps) (loss=1.93285):  47%|| 370/782 [09:14<07:14,  1.06s/it]
Training (371 / 500 Steps) (loss=2.03484):  47%|| 370/782 [09:15<07:14,  1.06s/it]
Training (371 / 500 Steps) (loss=2.03484):  47%|| 371/782 [09:15<07:14,  1.06s/it]
Training (372 / 500 Steps) (loss=1.95535):  47%|| 371/782 [09:16<07:14,  1.06s/it]
Training (372 / 500 Steps) (loss=1.95535):  48%|| 372/782 [09:16<07:12,  1.06s/it]
Training (373 / 500 Steps) (loss=2.17152):  48%|| 372/782 [09:17<07:12,  1.06s/it]
Training (373 / 500 Steps) (loss=2.17152):  48%|| 373/782 [09:17<07:12,  1.06s/it]
Training (374 / 500 Steps) (loss=1.87618):  48%|| 373/782 [09:18<07:12,  1.06s/it]
Training (374 / 500 Steps) (loss=1.87618):  48%|| 374/782 [09:18<07:10,  1.06s/it]
Training (375 / 500 Steps) (loss=1.83806):  48%|| 374/782 [09:19<07:10,  1.06s/it]
Training (375 / 500 Steps) (loss=1.83806):  48%|| 375/782 [09:19<07:10,  1.06s/it]
Training (376 / 500 Steps) (loss=2.11034):  48%|| 375/782 [09:20<07:10,  1.06s/it]
Training (376 / 500 Steps) (loss=2.11034):  48%|| 376/782 [09:20<07:09,  1.06s/it]
Training (377 / 500 Steps) (loss=1.98773):  48%|| 376/782 [09:21<07:09,  1.06s/it]
Training (377 / 500 Steps) (loss=1.98773):  48%|| 377/782 [09:21<07:08,  1.06s/it]
Training (378 / 500 Steps) (loss=1.92362):  48%|| 377/782 [09:22<07:08,  1.06s/it]
Training (378 / 500 Steps) (loss=1.92362):  48%|| 378/782 [09:22<07:07,  1.06s/it]
Training (379 / 500 Steps) (loss=1.74808):  48%|| 378/782 [09:23<07:07,  1.06s/it]
Training (379 / 500 Steps) (loss=1.74808):  48%|| 379/782 [09:23<07:06,  1.06s/it]
Training (380 / 500 Steps) (loss=1.94710):  48%|| 379/782 [09:24<07:06,  1.06s/it]
Training (380 / 500 Steps) (loss=1.94710):  49%|| 380/782 [09:24<07:05,  1.06s/it]
Training (381 / 500 Steps) (loss=1.79743):  49%|| 380/782 [09:25<07:05,  1.06s/it]
Training (381 / 500 Steps) (loss=1.79743):  49%|| 381/782 [09:25<07:03,  1.06s/it]
Training (382 / 500 Steps) (loss=2.00946):  49%|| 381/782 [09:26<07:03,  1.06s/it]
Training (382 / 500 Steps) (loss=2.00946):  49%|| 382/782 [09:26<07:02,  1.06s/it]
Training (383 / 500 Steps) (loss=1.99566):  49%|| 382/782 [09:27<07:02,  1.06s/it]
Training (383 / 500 Steps) (loss=1.99566):  49%|| 383/782 [09:27<07:01,  1.06s/it]
Training (384 / 500 Steps) (loss=1.91057):  49%|| 383/782 [09:29<07:01,  1.06s/it]
Training (384 / 500 Steps) (loss=1.91057):  49%|| 384/782 [09:29<07:00,  1.06s/it]
Training (385 / 500 Steps) (loss=1.92448):  49%|| 384/782 [09:30<07:00,  1.06s/it]
Training (385 / 500 Steps) (loss=1.92448):  49%|| 385/782 [09:30<06:59,  1.06s/it]
Training (386 / 500 Steps) (loss=1.95313):  49%|| 385/782 [09:31<06:59,  1.06s/it]
Training (386 / 500 Steps) (loss=1.95313):  49%|| 386/782 [09:31<06:59,  1.06s/it]
Training (387 / 500 Steps) (loss=2.06591):  49%|| 386/782 [09:32<06:59,  1.06s/it]
Training (387 / 500 Steps) (loss=2.06591):  49%|| 387/782 [09:32<06:58,  1.06s/it]
Training (388 / 500 Steps) (loss=1.96678):  49%|| 387/782 [09:33<06:58,  1.06s/it]
Training (388 / 500 Steps) (loss=1.96678):  50%|| 388/782 [09:33<06:56,  1.06s/it]
Training (389 / 500 Steps) (loss=2.02239):  50%|| 388/782 [09:34<06:56,  1.06s/it]
Training (389 / 500 Steps) (loss=2.02239):  50%|| 389/782 [09:34<06:55,  1.06s/it]
Training (390 / 500 Steps) (loss=1.93361):  50%|| 389/782 [09:35<06:55,  1.06s/it]
Training (390 / 500 Steps) (loss=1.93361):  50%|| 390/782 [09:35<06:54,  1.06s/it]
Training (391 / 500 Steps) (loss=1.97034):  50%|| 390/782 [09:36<06:54,  1.06s/it]
Training (391 / 500 Steps) (loss=1.97034):  50%|| 391/782 [09:36<06:54,  1.06s/it]
Training (392 / 500 Steps) (loss=1.88353):  50%|| 391/782 [09:37<06:54,  1.06s/it]
Training (392 / 500 Steps) (loss=1.88353):  50%|| 392/782 [09:37<06:53,  1.06s/it]
Training (393 / 500 Steps) (loss=1.70544):  50%|| 392/782 [09:38<06:53,  1.06s/it]
Training (393 / 500 Steps) (loss=1.70544):  50%|| 393/782 [09:38<06:51,  1.06s/it]
Training (394 / 500 Steps) (loss=1.89521):  50%|| 393/782 [09:39<06:51,  1.06s/it]
Training (394 / 500 Steps) (loss=1.89521):  50%|| 394/782 [09:39<06:50,  1.06s/it]
Training (395 / 500 Steps) (loss=2.14199):  50%|| 394/782 [09:40<06:50,  1.06s/it]
Training (395 / 500 Steps) (loss=2.14199):  51%|| 395/782 [09:40<06:49,  1.06s/it]
Training (396 / 500 Steps) (loss=1.89216):  51%|| 395/782 [09:41<06:49,  1.06s/it]
Training (396 / 500 Steps) (loss=1.89216):  51%|| 396/782 [09:41<06:48,  1.06s/it]
Training (397 / 500 Steps) (loss=1.88507):  51%|| 396/782 [09:42<06:48,  1.06s/it]
Training (397 / 500 Steps) (loss=1.88507):  51%|| 397/782 [09:42<06:46,  1.06s/it]
Training (398 / 500 Steps) (loss=1.82834):  51%|| 397/782 [09:43<06:46,  1.06s/it]
Training (398 / 500 Steps) (loss=1.82834):  51%|| 398/782 [09:43<06:45,  1.06s/it]
Training (399 / 500 Steps) (loss=1.97002):  51%|| 398/782 [09:44<06:45,  1.06s/it]
Training (399 / 500 Steps) (loss=1.97002):  51%|| 399/782 [09:44<06:44,  1.06s/it]
Training (400 / 500 Steps) (loss=2.14746):  51%|| 399/782 [09:45<06:44,  1.06s/it]
10/17/2022 06:29:45 - INFO - __main__ - ***** Running Validation *****
10/17/2022 06:29:45 - INFO - __main__ -   Num steps = 157
10/17/2022 06:29:45 - INFO - __main__ -   Batch size = 64


Validating... (loss=X.X):   0%|| 0/157 [00:00<?, ?it/s]

Validating... (loss=1.73796):   0%|| 0/157 [00:01<?, ?it/s]

Validating... (loss=1.73796):   1%|| 1/157 [00:01<02:49,  1.09s/it]

Validating... (loss=1.91460):   1%|| 1/157 [00:01<02:49,  1.09s/it]

Validating... (loss=1.91460):   1%|| 2/157 [00:01<01:38,  1.58it/s]

Validating... (loss=1.92525):   1%|| 2/157 [00:01<01:38,  1.58it/s]

Validating... (loss=1.92525):   2%|| 3/157 [00:01<01:15,  2.05it/s]

Validating... (loss=1.86382):   2%|| 3/157 [00:02<01:15,  2.05it/s]

Validating... (loss=1.86382):   3%|| 4/157 [00:02<01:04,  2.37it/s]

Validating... (loss=1.80617):   3%|| 4/157 [00:02<01:04,  2.37it/s]

Validating... (loss=1.80617):   3%|| 5/157 [00:02<00:58,  2.60it/s]

Validating... (loss=2.01123):   3%|| 5/157 [00:02<00:58,  2.60it/s]

Validating... (loss=2.01123):   4%|| 6/157 [00:02<00:54,  2.77it/s]

Validating... (loss=1.99965):   4%|| 6/157 [00:02<00:54,  2.77it/s]

Validating... (loss=1.99965):   4%|| 7/157 [00:02<00:52,  2.88it/s]

Validating... (loss=1.76716):   4%|| 7/157 [00:03<00:52,  2.88it/s]

Validating... (loss=1.76716):   5%|| 8/157 [00:03<00:50,  2.96it/s]

Validating... (loss=1.89321):   5%|| 8/157 [00:03<00:50,  2.96it/s]

Validating... (loss=1.89321):   6%|| 9/157 [00:03<00:48,  3.02it/s]

Validating... (loss=1.83784):   6%|| 9/157 [00:03<00:48,  3.02it/s]

Validating... (loss=1.83784):   6%|| 10/157 [00:03<00:48,  3.06it/s]

Validating... (loss=2.12573):   6%|| 10/157 [00:04<00:48,  3.06it/s]

Validating... (loss=2.12573):   7%|| 11/157 [00:04<00:47,  3.09it/s]

Validating... (loss=1.96208):   7%|| 11/157 [00:04<00:47,  3.09it/s]

Validating... (loss=1.96208):   8%|| 12/157 [00:04<00:46,  3.11it/s]

Validating... (loss=1.99180):   8%|| 12/157 [00:04<00:46,  3.11it/s]

Validating... (loss=1.99180):   8%|| 13/157 [00:04<00:46,  3.12it/s]

Validating... (loss=1.94278):   8%|| 13/157 [00:05<00:46,  3.12it/s]

Validating... (loss=1.94278):   9%|| 14/157 [00:05<00:45,  3.13it/s]

Validating... (loss=1.98411):   9%|| 14/157 [00:05<00:45,  3.13it/s]

Validating... (loss=1.98411):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.76993):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.76993):  10%|| 16/157 [00:05<00:44,  3.15it/s]

Validating... (loss=2.17332):  10%|| 16/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.17332):  11%|| 17/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.00582):  11%|| 17/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.00582):  11%|| 18/157 [00:06<00:44,  3.15it/s]

Validating... (loss=1.95249):  11%|| 18/157 [00:06<00:44,  3.15it/s]

Validating... (loss=1.95249):  12%|| 19/157 [00:06<00:43,  3.15it/s]

Validating... (loss=2.02633):  12%|| 19/157 [00:07<00:43,  3.15it/s]

Validating... (loss=2.02633):  13%|| 20/157 [00:07<00:43,  3.15it/s]

Validating... (loss=1.99304):  13%|| 20/157 [00:07<00:43,  3.15it/s]

Validating... (loss=1.99304):  13%|| 21/157 [00:07<00:43,  3.15it/s]

Validating... (loss=1.90442):  13%|| 21/157 [00:07<00:43,  3.15it/s]

Validating... (loss=1.90442):  14%|| 22/157 [00:07<00:42,  3.15it/s]

Validating... (loss=1.83915):  14%|| 22/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.83915):  15%|| 23/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.87055):  15%|| 23/157 [00:08<00:42,  3.15it/s]

Validating... (loss=1.87055):  15%|| 24/157 [00:08<00:42,  3.16it/s]

Validating... (loss=1.95765):  15%|| 24/157 [00:08<00:42,  3.16it/s]

Validating... (loss=1.95765):  16%|| 25/157 [00:08<00:41,  3.15it/s]

Validating... (loss=2.03503):  16%|| 25/157 [00:09<00:41,  3.15it/s]

Validating... (loss=2.03503):  17%|| 26/157 [00:09<00:41,  3.16it/s]

Validating... (loss=1.83149):  17%|| 26/157 [00:09<00:41,  3.16it/s]

Validating... (loss=1.83149):  17%|| 27/157 [00:09<00:41,  3.15it/s]

Validating... (loss=1.81582):  17%|| 27/157 [00:09<00:41,  3.15it/s]

Validating... (loss=1.81582):  18%|| 28/157 [00:09<00:41,  3.15it/s]

Validating... (loss=1.96166):  18%|| 28/157 [00:09<00:41,  3.15it/s]

Validating... (loss=1.96166):  18%|| 29/157 [00:09<00:40,  3.15it/s]

Validating... (loss=1.95516):  18%|| 29/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.95516):  19%|| 30/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.98399):  19%|| 30/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.98399):  20%|| 31/157 [00:10<00:39,  3.15it/s]

Validating... (loss=1.83291):  20%|| 31/157 [00:10<00:39,  3.15it/s]

Validating... (loss=1.83291):  20%|| 32/157 [00:10<00:39,  3.15it/s]

Validating... (loss=1.92367):  20%|| 32/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.92367):  21%|| 33/157 [00:11<00:39,  3.16it/s]

Validating... (loss=1.87666):  21%|| 33/157 [00:11<00:39,  3.16it/s]

Validating... (loss=1.87666):  22%|| 34/157 [00:11<00:38,  3.15it/s]

Validating... (loss=1.98605):  22%|| 34/157 [00:11<00:38,  3.15it/s]

Validating... (loss=1.98605):  22%|| 35/157 [00:11<00:38,  3.16it/s]

Validating... (loss=1.95438):  22%|| 35/157 [00:12<00:38,  3.16it/s]

Validating... (loss=1.95438):  23%|| 36/157 [00:12<00:38,  3.16it/s]

Validating... (loss=1.90981):  23%|| 36/157 [00:12<00:38,  3.16it/s]

Validating... (loss=1.90981):  24%|| 37/157 [00:12<00:38,  3.16it/s]

Validating... (loss=2.00019):  24%|| 37/157 [00:12<00:38,  3.16it/s]

Validating... (loss=2.00019):  24%|| 38/157 [00:12<00:37,  3.16it/s]

Validating... (loss=1.67109):  24%|| 38/157 [00:13<00:37,  3.16it/s]

Validating... (loss=1.67109):  25%|| 39/157 [00:13<00:37,  3.16it/s]

Validating... (loss=2.16869):  25%|| 39/157 [00:13<00:37,  3.16it/s]

Validating... (loss=2.16869):  25%|| 40/157 [00:13<00:37,  3.16it/s]

Validating... (loss=1.96571):  25%|| 40/157 [00:13<00:37,  3.16it/s]

Validating... (loss=1.96571):  26%|| 41/157 [00:13<00:36,  3.16it/s]

Validating... (loss=1.97993):  26%|| 41/157 [00:14<00:36,  3.16it/s]

Validating... (loss=1.97993):  27%|| 42/157 [00:14<00:36,  3.14it/s]

Validating... (loss=1.82840):  27%|| 42/157 [00:14<00:36,  3.14it/s]

Validating... (loss=1.82840):  27%|| 43/157 [00:14<00:36,  3.15it/s]

Validating... (loss=2.22448):  27%|| 43/157 [00:14<00:36,  3.15it/s]

Validating... (loss=2.22448):  28%|| 44/157 [00:14<00:35,  3.15it/s]

Validating... (loss=2.14193):  28%|| 44/157 [00:15<00:35,  3.15it/s]

Validating... (loss=2.14193):  29%|| 45/157 [00:15<00:35,  3.15it/s]

Validating... (loss=1.85215):  29%|| 45/157 [00:15<00:35,  3.15it/s]

Validating... (loss=1.85215):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=2.01490):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=2.01490):  30%|| 47/157 [00:15<00:34,  3.16it/s]

Validating... (loss=1.86206):  30%|| 47/157 [00:15<00:34,  3.16it/s]

Validating... (loss=1.86206):  31%|| 48/157 [00:15<00:34,  3.16it/s]

Validating... (loss=1.93693):  31%|| 48/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.93693):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.95430):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=1.95430):  32%|| 50/157 [00:16<00:33,  3.16it/s]

Validating... (loss=2.18115):  32%|| 50/157 [00:16<00:33,  3.16it/s]

Validating... (loss=2.18115):  32%|| 51/157 [00:16<00:33,  3.16it/s]

Validating... (loss=1.94852):  32%|| 51/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.94852):  33%|| 52/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.82103):  33%|| 52/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.82103):  34%|| 53/157 [00:17<00:32,  3.16it/s]

Validating... (loss=2.00727):  34%|| 53/157 [00:17<00:32,  3.16it/s]

Validating... (loss=2.00727):  34%|| 54/157 [00:17<00:32,  3.16it/s]

Validating... (loss=2.07494):  34%|| 54/157 [00:18<00:32,  3.16it/s]

Validating... (loss=2.07494):  35%|| 55/157 [00:18<00:32,  3.16it/s]

Validating... (loss=1.78494):  35%|| 55/157 [00:18<00:32,  3.16it/s]

Validating... (loss=1.78494):  36%|| 56/157 [00:18<00:31,  3.16it/s]

Validating... (loss=1.97887):  36%|| 56/157 [00:18<00:31,  3.16it/s]

Validating... (loss=1.97887):  36%|| 57/157 [00:18<00:31,  3.16it/s]

Validating... (loss=1.86379):  36%|| 57/157 [00:19<00:31,  3.16it/s]

Validating... (loss=1.86379):  37%|| 58/157 [00:19<00:31,  3.16it/s]

Validating... (loss=1.97045):  37%|| 58/157 [00:19<00:31,  3.16it/s]

Validating... (loss=1.97045):  38%|| 59/157 [00:19<00:31,  3.16it/s]

Validating... (loss=1.83471):  38%|| 59/157 [00:19<00:31,  3.16it/s]

Validating... (loss=1.83471):  38%|| 60/157 [00:19<00:30,  3.16it/s]

Validating... (loss=1.86553):  38%|| 60/157 [00:20<00:30,  3.16it/s]

Validating... (loss=1.86553):  39%|| 61/157 [00:20<00:30,  3.16it/s]

Validating... (loss=1.86748):  39%|| 61/157 [00:20<00:30,  3.16it/s]

Validating... (loss=1.86748):  39%|| 62/157 [00:20<00:30,  3.16it/s]

Validating... (loss=1.86875):  39%|| 62/157 [00:20<00:30,  3.16it/s]

Validating... (loss=1.86875):  40%|| 63/157 [00:20<00:29,  3.16it/s]

Validating... (loss=2.07251):  40%|| 63/157 [00:21<00:29,  3.16it/s]

Validating... (loss=2.07251):  41%|| 64/157 [00:21<00:29,  3.16it/s]

Validating... (loss=2.07134):  41%|| 64/157 [00:21<00:29,  3.16it/s]

Validating... (loss=2.07134):  41%|| 65/157 [00:21<00:29,  3.15it/s]

Validating... (loss=1.95412):  41%|| 65/157 [00:21<00:29,  3.15it/s]

Validating... (loss=1.95412):  42%|| 66/157 [00:21<00:28,  3.15it/s]

Validating... (loss=1.80871):  42%|| 66/157 [00:22<00:28,  3.15it/s]

Validating... (loss=1.80871):  43%|| 67/157 [00:22<00:28,  3.15it/s]

Validating... (loss=1.91320):  43%|| 67/157 [00:22<00:28,  3.15it/s]

Validating... (loss=1.91320):  43%|| 68/157 [00:22<00:28,  3.15it/s]

Validating... (loss=1.86889):  43%|| 68/157 [00:22<00:28,  3.15it/s]

Validating... (loss=1.86889):  44%|| 69/157 [00:22<00:27,  3.15it/s]

Validating... (loss=1.98033):  44%|| 69/157 [00:22<00:27,  3.15it/s]

Validating... (loss=1.98033):  45%|| 70/157 [00:22<00:27,  3.15it/s]

Validating... (loss=1.92558):  45%|| 70/157 [00:23<00:27,  3.15it/s]

Validating... (loss=1.92558):  45%|| 71/157 [00:23<00:27,  3.15it/s]

Validating... (loss=2.05491):  45%|| 71/157 [00:23<00:27,  3.15it/s]

Validating... (loss=2.05491):  46%|| 72/157 [00:23<00:26,  3.15it/s]

Validating... (loss=1.75146):  46%|| 72/157 [00:23<00:26,  3.15it/s]

Validating... (loss=1.75146):  46%|| 73/157 [00:23<00:26,  3.15it/s]

Validating... (loss=2.05393):  46%|| 73/157 [00:24<00:26,  3.15it/s]

Validating... (loss=2.05393):  47%|| 74/157 [00:24<00:26,  3.15it/s]

Validating... (loss=2.20715):  47%|| 74/157 [00:24<00:26,  3.15it/s]

Validating... (loss=2.20715):  48%|| 75/157 [00:24<00:26,  3.15it/s]

Validating... (loss=1.77159):  48%|| 75/157 [00:24<00:26,  3.15it/s]

Validating... (loss=1.77159):  48%|| 76/157 [00:24<00:25,  3.15it/s]

Validating... (loss=1.69227):  48%|| 76/157 [00:25<00:25,  3.15it/s]

Validating... (loss=1.69227):  49%|| 77/157 [00:25<00:25,  3.14it/s]

Validating... (loss=1.83988):  49%|| 77/157 [00:25<00:25,  3.14it/s]

Validating... (loss=1.83988):  50%|| 78/157 [00:25<00:25,  3.13it/s]

Validating... (loss=1.91292):  50%|| 78/157 [00:25<00:25,  3.13it/s]

Validating... (loss=1.91292):  50%|| 79/157 [00:25<00:24,  3.13it/s]

Validating... (loss=1.81812):  50%|| 79/157 [00:26<00:24,  3.13it/s]

Validating... (loss=1.81812):  51%|| 80/157 [00:26<00:24,  3.13it/s]

Validating... (loss=2.05976):  51%|| 80/157 [00:26<00:24,  3.13it/s]

Validating... (loss=2.05976):  52%|| 81/157 [00:26<00:24,  3.14it/s]

Validating... (loss=2.14698):  52%|| 81/157 [00:26<00:24,  3.14it/s]

Validating... (loss=2.14698):  52%|| 82/157 [00:26<00:23,  3.14it/s]

Validating... (loss=1.91610):  52%|| 82/157 [00:27<00:23,  3.14it/s]

Validating... (loss=1.91610):  53%|| 83/157 [00:27<00:23,  3.14it/s]

Validating... (loss=2.03839):  53%|| 83/157 [00:27<00:23,  3.14it/s]

Validating... (loss=2.03839):  54%|| 84/157 [00:27<00:23,  3.14it/s]

Validating... (loss=2.08756):  54%|| 84/157 [00:27<00:23,  3.14it/s]

Validating... (loss=2.08756):  54%|| 85/157 [00:27<00:22,  3.14it/s]

Validating... (loss=1.76365):  54%|| 85/157 [00:28<00:22,  3.14it/s]

Validating... (loss=1.76365):  55%|| 86/157 [00:28<00:22,  3.14it/s]

Validating... (loss=2.21114):  55%|| 86/157 [00:28<00:22,  3.14it/s]

Validating... (loss=2.21114):  55%|| 87/157 [00:28<00:22,  3.13it/s]

Validating... (loss=2.02859):  55%|| 87/157 [00:28<00:22,  3.13it/s]

Validating... (loss=2.02859):  56%|| 88/157 [00:28<00:21,  3.14it/s]

Validating... (loss=2.03471):  56%|| 88/157 [00:29<00:21,  3.14it/s]

Validating... (loss=2.03471):  57%|| 89/157 [00:29<00:21,  3.14it/s]

Validating... (loss=1.78402):  57%|| 89/157 [00:29<00:21,  3.14it/s]

Validating... (loss=1.78402):  57%|| 90/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.78491):  57%|| 90/157 [00:29<00:21,  3.15it/s]

Validating... (loss=1.78491):  58%|| 91/157 [00:29<00:20,  3.15it/s]

Validating... (loss=2.04612):  58%|| 91/157 [00:29<00:20,  3.15it/s]

Validating... (loss=2.04612):  59%|| 92/157 [00:29<00:20,  3.15it/s]

Validating... (loss=1.91348):  59%|| 92/157 [00:30<00:20,  3.15it/s]

Validating... (loss=1.91348):  59%|| 93/157 [00:30<00:20,  3.15it/s]

Validating... (loss=2.07681):  59%|| 93/157 [00:30<00:20,  3.15it/s]

Validating... (loss=2.07681):  60%|| 94/157 [00:30<00:19,  3.15it/s]

Validating... (loss=1.93635):  60%|| 94/157 [00:30<00:19,  3.15it/s]

Validating... (loss=1.93635):  61%|| 95/157 [00:30<00:19,  3.15it/s]

Validating... (loss=2.06564):  61%|| 95/157 [00:31<00:19,  3.15it/s]

Validating... (loss=2.06564):  61%|| 96/157 [00:31<00:19,  3.15it/s]

Validating... (loss=2.00364):  61%|| 96/157 [00:31<00:19,  3.15it/s]

Validating... (loss=2.00364):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.86344):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.86344):  62%|| 98/157 [00:31<00:18,  3.15it/s]

Validating... (loss=1.79385):  62%|| 98/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.79385):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.97139):  63%|| 99/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.97139):  64%|| 100/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.99119):  64%|| 100/157 [00:32<00:18,  3.15it/s]

Validating... (loss=1.99119):  64%|| 101/157 [00:32<00:17,  3.15it/s]

Validating... (loss=1.88409):  64%|| 101/157 [00:33<00:17,  3.15it/s]

Validating... (loss=1.88409):  65%|| 102/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.05286):  65%|| 102/157 [00:33<00:17,  3.15it/s]

Validating... (loss=2.05286):  66%|| 103/157 [00:33<00:17,  3.15it/s]

Validating... (loss=1.98885):  66%|| 103/157 [00:33<00:17,  3.15it/s]

Validating... (loss=1.98885):  66%|| 104/157 [00:33<00:16,  3.15it/s]

Validating... (loss=1.92281):  66%|| 104/157 [00:34<00:16,  3.15it/s]

Validating... (loss=1.92281):  67%|| 105/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.74435):  67%|| 105/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.74435):  68%|| 106/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.75246):  68%|| 106/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.75246):  68%|| 107/157 [00:34<00:15,  3.16it/s]

Validating... (loss=2.21803):  68%|| 107/157 [00:35<00:15,  3.16it/s]

Validating... (loss=2.21803):  69%|| 108/157 [00:35<00:15,  3.16it/s]

Validating... (loss=1.95877):  69%|| 108/157 [00:35<00:15,  3.16it/s]

Validating... (loss=1.95877):  69%|| 109/157 [00:35<00:15,  3.16it/s]

Validating... (loss=2.07463):  69%|| 109/157 [00:35<00:15,  3.16it/s]

Validating... (loss=2.07463):  70%|| 110/157 [00:35<00:14,  3.16it/s]

Validating... (loss=2.02058):  70%|| 110/157 [00:35<00:14,  3.16it/s]

Validating... (loss=2.02058):  71%|| 111/157 [00:35<00:14,  3.16it/s]

Validating... (loss=1.82865):  71%|| 111/157 [00:36<00:14,  3.16it/s]

Validating... (loss=1.82865):  71%|| 112/157 [00:36<00:14,  3.16it/s]

Validating... (loss=2.08676):  71%|| 112/157 [00:36<00:14,  3.16it/s]

Validating... (loss=2.08676):  72%|| 113/157 [00:36<00:13,  3.15it/s]

Validating... (loss=1.72579):  72%|| 113/157 [00:36<00:13,  3.15it/s]

Validating... (loss=1.72579):  73%|| 114/157 [00:36<00:13,  3.14it/s]

Validating... (loss=1.89924):  73%|| 114/157 [00:37<00:13,  3.14it/s]

Validating... (loss=1.89924):  73%|| 115/157 [00:37<00:13,  3.15it/s]

Validating... (loss=2.07517):  73%|| 115/157 [00:37<00:13,  3.15it/s]

Validating... (loss=2.07517):  74%|| 116/157 [00:37<00:13,  3.15it/s]

Validating... (loss=1.69738):  74%|| 116/157 [00:37<00:13,  3.15it/s]

Validating... (loss=1.69738):  75%|| 117/157 [00:37<00:12,  3.15it/s]

Validating... (loss=1.93370):  75%|| 117/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.93370):  75%|| 118/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.70288):  75%|| 118/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.70288):  76%|| 119/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.78532):  76%|| 119/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.78532):  76%|| 120/157 [00:38<00:11,  3.14it/s]

Validating... (loss=1.98080):  76%|| 120/157 [00:39<00:11,  3.14it/s]

Validating... (loss=1.98080):  77%|| 121/157 [00:39<00:11,  3.14it/s]

Validating... (loss=1.95473):  77%|| 121/157 [00:39<00:11,  3.14it/s]

Validating... (loss=1.95473):  78%|| 122/157 [00:39<00:11,  3.15it/s]

Validating... (loss=2.14998):  78%|| 122/157 [00:39<00:11,  3.15it/s]

Validating... (loss=2.14998):  78%|| 123/157 [00:39<00:10,  3.15it/s]

Validating... (loss=1.87073):  78%|| 123/157 [00:40<00:10,  3.15it/s]

Validating... (loss=1.87073):  79%|| 124/157 [00:40<00:10,  3.15it/s]

Validating... (loss=2.05978):  79%|| 124/157 [00:40<00:10,  3.15it/s]

Validating... (loss=2.05978):  80%|| 125/157 [00:40<00:10,  3.15it/s]

Validating... (loss=1.93566):  80%|| 125/157 [00:40<00:10,  3.15it/s]

Validating... (loss=1.93566):  80%|| 126/157 [00:40<00:09,  3.15it/s]

Validating... (loss=1.98957):  80%|| 126/157 [00:41<00:09,  3.15it/s]

Validating... (loss=1.98957):  81%|| 127/157 [00:41<00:09,  3.15it/s]

Validating... (loss=1.83391):  81%|| 127/157 [00:41<00:09,  3.15it/s]

Validating... (loss=1.83391):  82%|| 128/157 [00:41<00:09,  3.15it/s]

Validating... (loss=2.07211):  82%|| 128/157 [00:41<00:09,  3.15it/s]

Validating... (loss=2.07211):  82%|| 129/157 [00:41<00:08,  3.15it/s]

Validating... (loss=2.17540):  82%|| 129/157 [00:42<00:08,  3.15it/s]

Validating... (loss=2.17540):  83%|| 130/157 [00:42<00:08,  3.15it/s]

Validating... (loss=1.98429):  83%|| 130/157 [00:42<00:08,  3.15it/s]

Validating... (loss=1.98429):  83%|| 131/157 [00:42<00:08,  3.16it/s]

Validating... (loss=1.86671):  83%|| 131/157 [00:42<00:08,  3.16it/s]

Validating... (loss=1.86671):  84%|| 132/157 [00:42<00:07,  3.15it/s]

Validating... (loss=2.02329):  84%|| 132/157 [00:42<00:07,  3.15it/s]

Validating... (loss=2.02329):  85%|| 133/157 [00:42<00:07,  3.15it/s]

Validating... (loss=1.94126):  85%|| 133/157 [00:43<00:07,  3.15it/s]

Validating... (loss=1.94126):  85%|| 134/157 [00:43<00:07,  3.15it/s]

Validating... (loss=1.90406):  85%|| 134/157 [00:43<00:07,  3.15it/s]

Validating... (loss=1.90406):  86%|| 135/157 [00:43<00:07,  3.14it/s]

Validating... (loss=1.99828):  86%|| 135/157 [00:43<00:07,  3.14it/s]

Validating... (loss=1.99828):  87%|| 136/157 [00:43<00:06,  3.15it/s]

Validating... (loss=1.88400):  87%|| 136/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.88400):  87%|| 137/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.99979):  87%|| 137/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.99979):  88%|| 138/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.97088):  88%|| 138/157 [00:44<00:06,  3.15it/s]

Validating... (loss=1.97088):  89%|| 139/157 [00:44<00:05,  3.15it/s]

Validating... (loss=1.86133):  89%|| 139/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.86133):  89%|| 140/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.85192):  89%|| 140/157 [00:45<00:05,  3.15it/s]

Validating... (loss=1.85192):  90%|| 141/157 [00:45<00:05,  3.15it/s]

Validating... (loss=2.00094):  90%|| 141/157 [00:45<00:05,  3.15it/s]

Validating... (loss=2.00094):  90%|| 142/157 [00:45<00:04,  3.15it/s]

Validating... (loss=2.05411):  90%|| 142/157 [00:46<00:04,  3.15it/s]

Validating... (loss=2.05411):  91%|| 143/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.80450):  91%|| 143/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.80450):  92%|| 144/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.64343):  92%|| 144/157 [00:46<00:04,  3.15it/s]

Validating... (loss=1.64343):  92%|| 145/157 [00:46<00:03,  3.15it/s]

Validating... (loss=2.03843):  92%|| 145/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.03843):  93%|| 146/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.01137):  93%|| 146/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.01137):  94%|| 147/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.01315):  94%|| 147/157 [00:47<00:03,  3.15it/s]

Validating... (loss=2.01315):  94%|| 148/157 [00:47<00:02,  3.15it/s]

Validating... (loss=1.95237):  94%|| 148/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.95237):  95%|| 149/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.90103):  95%|| 149/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.90103):  96%|| 150/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.68364):  96%|| 150/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.68364):  96%|| 151/157 [00:48<00:01,  3.15it/s]

Validating... (loss=1.93524):  96%|| 151/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.93524):  97%|| 152/157 [00:49<00:01,  3.15it/s]

Validating... (loss=2.01033):  97%|| 152/157 [00:49<00:01,  3.15it/s]

Validating... (loss=2.01033):  97%|| 153/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.89459):  97%|| 153/157 [00:49<00:01,  3.15it/s]

Validating... (loss=1.89459):  98%|| 154/157 [00:49<00:00,  3.15it/s]

Validating... (loss=1.95326):  98%|| 154/157 [00:49<00:00,  3.15it/s]

Validating... (loss=1.95326):  99%|| 155/157 [00:49<00:00,  3.15it/s]

Validating... (loss=2.06238):  99%|| 155/157 [00:50<00:00,  3.15it/s]

Validating... (loss=2.06238):  99%|| 156/157 [00:50<00:00,  3.15it/s]

Validating... (loss=1.73776):  99%|| 156/157 [00:50<00:00,  3.15it/s]
Validating... (loss=1.73776): 100%|| 157/157 [00:50<00:00,  3.10it/s]
10/17/2022 06:30:35 - INFO - __main__ - 

10/17/2022 06:30:35 - INFO - __main__ - Validation Results
10/17/2022 06:30:35 - INFO - __main__ - Global Steps: 400
10/17/2022 06:30:35 - INFO - __main__ - Valid Loss: 1.93989
10/17/2022 06:30:35 - INFO - __main__ - Valid Accuracy: 0.29560
10/17/2022 06:30:36 - INFO - __main__ - Saved model checkpoint to [DIR: output]

Training (400 / 500 Steps) (loss=2.14746):  51%|| 400/782 [10:37<1:44:34, 16.43s/it]
Training (401 / 500 Steps) (loss=2.07481):  51%|| 400/782 [10:38<1:44:34, 16.43s/it]
Training (401 / 500 Steps) (loss=2.07481):  51%|| 401/782 [10:38<1:15:02, 11.82s/it]
Training (402 / 500 Steps) (loss=2.08246):  51%|| 401/782 [10:39<1:15:02, 11.82s/it]
Training (402 / 500 Steps) (loss=2.08246):  51%|| 402/782 [10:39<54:23,  8.59s/it]
Training (403 / 500 Steps) (loss=2.09019):  51%|| 402/782 [10:40<54:23,  8.59s/it]
Training (403 / 500 Steps) (loss=2.09019):  52%|| 403/782 [10:40<39:58,  6.33s/it]
Training (404 / 500 Steps) (loss=2.04720):  52%|| 403/782 [10:41<39:58,  6.33s/it]
Training (404 / 500 Steps) (loss=2.04720):  52%|| 404/782 [10:41<29:54,  4.75s/it]
Training (405 / 500 Steps) (loss=1.92600):  52%|| 404/782 [10:42<29:54,  4.75s/it]
Training (405 / 500 Steps) (loss=1.92600):  52%|| 405/782 [10:42<22:52,  3.64s/it]
Training (406 / 500 Steps) (loss=2.08511):  52%|| 405/782 [10:43<22:52,  3.64s/it]
Training (406 / 500 Steps) (loss=2.08511):  52%|| 406/782 [10:43<17:57,  2.87s/it]
Training (407 / 500 Steps) (loss=2.05937):  52%|| 406/782 [10:44<17:57,  2.87s/it]
Training (407 / 500 Steps) (loss=2.05937):  52%|| 407/782 [10:44<14:31,  2.32s/it]
Training (408 / 500 Steps) (loss=2.17554):  52%|| 407/782 [10:45<14:31,  2.32s/it]
Training (408 / 500 Steps) (loss=2.17554):  52%|| 408/782 [10:45<12:07,  1.94s/it]
Training (409 / 500 Steps) (loss=1.78055):  52%|| 408/782 [10:46<12:07,  1.94s/it]
Training (409 / 500 Steps) (loss=1.78055):  52%|| 409/782 [10:46<10:26,  1.68s/it]
Training (410 / 500 Steps) (loss=2.05046):  52%|| 409/782 [10:47<10:26,  1.68s/it]
Training (410 / 500 Steps) (loss=2.05046):  52%|| 410/782 [10:47<09:14,  1.49s/it]
Training (411 / 500 Steps) (loss=2.00198):  52%|| 410/782 [10:48<09:14,  1.49s/it]
Training (411 / 500 Steps) (loss=2.00198):  53%|| 411/782 [10:48<08:25,  1.36s/it]
Training (412 / 500 Steps) (loss=2.01992):  53%|| 411/782 [10:49<08:25,  1.36s/it]
Training (412 / 500 Steps) (loss=2.01992):  53%|| 412/782 [10:49<07:49,  1.27s/it]
Training (413 / 500 Steps) (loss=1.90704):  53%|| 412/782 [10:50<07:49,  1.27s/it]
Training (413 / 500 Steps) (loss=1.90704):  53%|| 413/782 [10:50<07:24,  1.21s/it]
Training (414 / 500 Steps) (loss=1.97303):  53%|| 413/782 [10:52<07:24,  1.21s/it]
Training (414 / 500 Steps) (loss=1.97303):  53%|| 414/782 [10:52<07:06,  1.16s/it]
Training (415 / 500 Steps) (loss=2.00023):  53%|| 414/782 [10:53<07:06,  1.16s/it]
Training (415 / 500 Steps) (loss=2.00023):  53%|| 415/782 [10:53<06:55,  1.13s/it]
Training (416 / 500 Steps) (loss=2.01998):  53%|| 415/782 [10:54<06:55,  1.13s/it]
Training (416 / 500 Steps) (loss=2.01998):  53%|| 416/782 [10:54<06:46,  1.11s/it]
Training (417 / 500 Steps) (loss=1.73923):  53%|| 416/782 [10:55<06:46,  1.11s/it]
Training (417 / 500 Steps) (loss=1.73923):  53%|| 417/782 [10:55<06:39,  1.09s/it]
Training (418 / 500 Steps) (loss=1.91216):  53%|| 417/782 [10:56<06:39,  1.09s/it]
Training (418 / 500 Steps) (loss=1.91216):  53%|| 418/782 [10:56<06:33,  1.08s/it]
Training (419 / 500 Steps) (loss=2.10269):  53%|| 418/782 [10:57<06:33,  1.08s/it]
Training (419 / 500 Steps) (loss=2.10269):  54%|| 419/782 [10:57<06:29,  1.07s/it]
Training (420 / 500 Steps) (loss=2.00440):  54%|| 419/782 [10:58<06:29,  1.07s/it]
Training (420 / 500 Steps) (loss=2.00440):  54%|| 420/782 [10:58<06:27,  1.07s/it]
Training (421 / 500 Steps) (loss=2.18075):  54%|| 420/782 [10:59<06:27,  1.07s/it]
Training (421 / 500 Steps) (loss=2.18075):  54%|| 421/782 [10:59<06:24,  1.07s/it]
Training (422 / 500 Steps) (loss=1.93125):  54%|| 421/782 [11:00<06:24,  1.07s/it]
Training (422 / 500 Steps) (loss=1.93125):  54%|| 422/782 [11:00<06:22,  1.06s/it]
Training (423 / 500 Steps) (loss=2.01016):  54%|| 422/782 [11:01<06:22,  1.06s/it]
Training (423 / 500 Steps) (loss=2.01016):  54%|| 423/782 [11:01<06:20,  1.06s/it]
Training (424 / 500 Steps) (loss=1.94640):  54%|| 423/782 [11:02<06:20,  1.06s/it]
Training (424 / 500 Steps) (loss=1.94640):  54%|| 424/782 [11:02<06:18,  1.06s/it]
Training (425 / 500 Steps) (loss=1.94451):  54%|| 424/782 [11:03<06:18,  1.06s/it]
Training (425 / 500 Steps) (loss=1.94451):  54%|| 425/782 [11:03<06:17,  1.06s/it]
Training (426 / 500 Steps) (loss=1.84611):  54%|| 425/782 [11:04<06:17,  1.06s/it]
Training (426 / 500 Steps) (loss=1.84611):  54%|| 426/782 [11:04<06:15,  1.06s/it]
Training (427 / 500 Steps) (loss=2.10106):  54%|| 426/782 [11:05<06:15,  1.06s/it]
Training (427 / 500 Steps) (loss=2.10106):  55%|| 427/782 [11:05<06:15,  1.06s/it]
Training (428 / 500 Steps) (loss=2.10762):  55%|| 427/782 [11:06<06:15,  1.06s/it]
Training (428 / 500 Steps) (loss=2.10762):  55%|| 428/782 [11:06<06:13,  1.06s/it]
Training (429 / 500 Steps) (loss=1.96788):  55%|| 428/782 [11:07<06:13,  1.06s/it]
Training (429 / 500 Steps) (loss=1.96788):  55%|| 429/782 [11:07<06:12,  1.06s/it]
Training (430 / 500 Steps) (loss=1.79993):  55%|| 429/782 [11:08<06:12,  1.06s/it]
Training (430 / 500 Steps) (loss=1.79993):  55%|| 430/782 [11:08<06:11,  1.06s/it]
Training (431 / 500 Steps) (loss=2.01083):  55%|| 430/782 [11:09<06:11,  1.06s/it]
Training (431 / 500 Steps) (loss=2.01083):  55%|| 431/782 [11:09<06:10,  1.05s/it]
Training (432 / 500 Steps) (loss=2.11921):  55%|| 431/782 [11:11<06:10,  1.05s/it]
Training (432 / 500 Steps) (loss=2.11921):  55%|| 432/782 [11:11<06:09,  1.05s/it]
Training (433 / 500 Steps) (loss=1.90721):  55%|| 432/782 [11:12<06:09,  1.05s/it]
Training (433 / 500 Steps) (loss=1.90721):  55%|| 433/782 [11:12<06:09,  1.06s/it]
Training (434 / 500 Steps) (loss=1.74145):  55%|| 433/782 [11:13<06:09,  1.06s/it]
Training (434 / 500 Steps) (loss=1.74145):  55%|| 434/782 [11:13<06:08,  1.06s/it]
Training (435 / 500 Steps) (loss=2.00070):  55%|| 434/782 [11:14<06:08,  1.06s/it]
Training (435 / 500 Steps) (loss=2.00070):  56%|| 435/782 [11:14<06:07,  1.06s/it]
Training (436 / 500 Steps) (loss=1.69819):  56%|| 435/782 [11:15<06:07,  1.06s/it]
Training (436 / 500 Steps) (loss=1.69819):  56%|| 436/782 [11:15<06:05,  1.06s/it]
Training (437 / 500 Steps) (loss=1.89041):  56%|| 436/782 [11:16<06:05,  1.06s/it]
Training (437 / 500 Steps) (loss=1.89041):  56%|| 437/782 [11:16<06:05,  1.06s/it]
Training (438 / 500 Steps) (loss=1.91900):  56%|| 437/782 [11:17<06:05,  1.06s/it]
Training (438 / 500 Steps) (loss=1.91900):  56%|| 438/782 [11:17<06:03,  1.06s/it]
Training (439 / 500 Steps) (loss=1.96310):  56%|| 438/782 [11:18<06:03,  1.06s/it]
Training (439 / 500 Steps) (loss=1.96310):  56%|| 439/782 [11:18<06:02,  1.06s/it]
Training (440 / 500 Steps) (loss=1.78732):  56%|| 439/782 [11:19<06:02,  1.06s/it]
Training (440 / 500 Steps) (loss=1.78732):  56%|| 440/782 [11:19<06:01,  1.06s/it]
Training (441 / 500 Steps) (loss=1.94765):  56%|| 440/782 [11:20<06:01,  1.06s/it]
Training (441 / 500 Steps) (loss=1.94765):  56%|| 441/782 [11:20<06:00,  1.06s/it]
Training (442 / 500 Steps) (loss=2.08553):  56%|| 441/782 [11:21<06:00,  1.06s/it]
Training (442 / 500 Steps) (loss=2.08553):  57%|| 442/782 [11:21<05:59,  1.06s/it]
Training (443 / 500 Steps) (loss=2.19231):  57%|| 442/782 [11:22<05:59,  1.06s/it]
Training (443 / 500 Steps) (loss=2.19231):  57%|| 443/782 [11:22<05:57,  1.06s/it]
Training (444 / 500 Steps) (loss=2.09570):  57%|| 443/782 [11:23<05:57,  1.06s/it]
Training (444 / 500 Steps) (loss=2.09570):  57%|| 444/782 [11:23<05:56,  1.05s/it]
Training (445 / 500 Steps) (loss=1.84925):  57%|| 444/782 [11:24<05:56,  1.05s/it]
Training (445 / 500 Steps) (loss=1.84925):  57%|| 445/782 [11:24<05:54,  1.05s/it]
Training (446 / 500 Steps) (loss=1.90460):  57%|| 445/782 [11:25<05:54,  1.05s/it]
Training (446 / 500 Steps) (loss=1.90460):  57%|| 446/782 [11:25<05:54,  1.06s/it]
Training (447 / 500 Steps) (loss=2.09122):  57%|| 446/782 [11:26<05:54,  1.06s/it]
Training (447 / 500 Steps) (loss=2.09122):  57%|| 447/782 [11:26<05:53,  1.06s/it]
Training (448 / 500 Steps) (loss=2.01761):  57%|| 447/782 [11:27<05:53,  1.06s/it]
Training (448 / 500 Steps) (loss=2.01761):  57%|| 448/782 [11:27<05:52,  1.06s/it]
Training (449 / 500 Steps) (loss=2.07274):  57%|| 448/782 [11:28<05:52,  1.06s/it]
Training (449 / 500 Steps) (loss=2.07274):  57%|| 449/782 [11:28<05:51,  1.06s/it]
Training (450 / 500 Steps) (loss=2.06623):  57%|| 449/782 [11:30<05:51,  1.06s/it]
Training (450 / 500 Steps) (loss=2.06623):  58%|| 450/782 [11:30<05:50,  1.06s/it]
Training (451 / 500 Steps) (loss=1.75546):  58%|| 450/782 [11:31<05:50,  1.06s/it]
Training (451 / 500 Steps) (loss=1.75546):  58%|| 451/782 [11:31<05:49,  1.06s/it]
Training (452 / 500 Steps) (loss=1.90589):  58%|| 451/782 [11:32<05:49,  1.06s/it]
Training (452 / 500 Steps) (loss=1.90589):  58%|| 452/782 [11:32<05:48,  1.06s/it]
Training (453 / 500 Steps) (loss=2.12352):  58%|| 452/782 [11:33<05:48,  1.06s/it]
Training (453 / 500 Steps) (loss=2.12352):  58%|| 453/782 [11:33<05:47,  1.06s/it]
Training (454 / 500 Steps) (loss=2.21609):  58%|| 453/782 [11:34<05:47,  1.06s/it]
Training (454 / 500 Steps) (loss=2.21609):  58%|| 454/782 [11:34<05:46,  1.06s/it]
Training (455 / 500 Steps) (loss=1.97395):  58%|| 454/782 [11:35<05:46,  1.06s/it]
Training (455 / 500 Steps) (loss=1.97395):  58%|| 455/782 [11:35<05:45,  1.06s/it]
Training (456 / 500 Steps) (loss=1.77837):  58%|| 455/782 [11:36<05:45,  1.06s/it]
Training (456 / 500 Steps) (loss=1.77837):  58%|| 456/782 [11:36<05:43,  1.05s/it]
Training (457 / 500 Steps) (loss=1.97226):  58%|| 456/782 [11:37<05:43,  1.05s/it]
Training (457 / 500 Steps) (loss=1.97226):  58%|| 457/782 [11:37<05:42,  1.05s/it]
Training (458 / 500 Steps) (loss=1.93457):  58%|| 457/782 [11:38<05:42,  1.05s/it]
Training (458 / 500 Steps) (loss=1.93457):  59%|| 458/782 [11:38<05:41,  1.06s/it]
Training (459 / 500 Steps) (loss=1.92894):  59%|| 458/782 [11:39<05:41,  1.06s/it]
Training (459 / 500 Steps) (loss=1.92894):  59%|| 459/782 [11:39<05:40,  1.05s/it]
Training (460 / 500 Steps) (loss=1.92874):  59%|| 459/782 [11:40<05:40,  1.05s/it]
Training (460 / 500 Steps) (loss=1.92874):  59%|| 460/782 [11:40<05:39,  1.05s/it]
Training (461 / 500 Steps) (loss=1.82403):  59%|| 460/782 [11:41<05:39,  1.05s/it]
Training (461 / 500 Steps) (loss=1.82403):  59%|| 461/782 [11:41<05:38,  1.06s/it]
Training (462 / 500 Steps) (loss=2.18867):  59%|| 461/782 [11:42<05:38,  1.06s/it]
Training (462 / 500 Steps) (loss=2.18867):  59%|| 462/782 [11:42<05:37,  1.05s/it]
Training (463 / 500 Steps) (loss=1.91051):  59%|| 462/782 [11:43<05:37,  1.05s/it]
Training (463 / 500 Steps) (loss=1.91051):  59%|| 463/782 [11:43<05:35,  1.05s/it]
Training (464 / 500 Steps) (loss=1.96750):  59%|| 463/782 [11:44<05:35,  1.05s/it]
Training (464 / 500 Steps) (loss=1.96750):  59%|| 464/782 [11:44<05:35,  1.05s/it]
Training (465 / 500 Steps) (loss=1.99777):  59%|| 464/782 [11:45<05:35,  1.05s/it]
Training (465 / 500 Steps) (loss=1.99777):  59%|| 465/782 [11:45<05:34,  1.05s/it]
Training (466 / 500 Steps) (loss=2.12820):  59%|| 465/782 [11:46<05:34,  1.05s/it]
Training (466 / 500 Steps) (loss=2.12820):  60%|| 466/782 [11:46<05:32,  1.05s/it]
Training (467 / 500 Steps) (loss=1.80437):  60%|| 466/782 [11:47<05:32,  1.05s/it]
Training (467 / 500 Steps) (loss=1.80437):  60%|| 467/782 [11:47<05:31,  1.05s/it]
Training (468 / 500 Steps) (loss=2.13246):  60%|| 467/782 [11:49<05:31,  1.05s/it]
Training (468 / 500 Steps) (loss=2.13246):  60%|| 468/782 [11:49<05:31,  1.05s/it]
Training (469 / 500 Steps) (loss=2.03642):  60%|| 468/782 [11:50<05:31,  1.05s/it]
Training (469 / 500 Steps) (loss=2.03642):  60%|| 469/782 [11:50<05:29,  1.05s/it]
Training (470 / 500 Steps) (loss=1.88274):  60%|| 469/782 [11:51<05:29,  1.05s/it]
Training (470 / 500 Steps) (loss=1.88274):  60%|| 470/782 [11:51<05:28,  1.05s/it]
Training (471 / 500 Steps) (loss=2.01475):  60%|| 470/782 [11:52<05:28,  1.05s/it]
Training (471 / 500 Steps) (loss=2.01475):  60%|| 471/782 [11:52<05:28,  1.06s/it]
Training (472 / 500 Steps) (loss=1.83348):  60%|| 471/782 [11:53<05:28,  1.06s/it]
Training (472 / 500 Steps) (loss=1.83348):  60%|| 472/782 [11:53<05:27,  1.06s/it]
Training (473 / 500 Steps) (loss=1.85080):  60%|| 472/782 [11:54<05:27,  1.06s/it]
Training (473 / 500 Steps) (loss=1.85080):  60%|| 473/782 [11:54<05:25,  1.05s/it]
Training (474 / 500 Steps) (loss=2.14311):  60%|| 473/782 [11:55<05:25,  1.05s/it]
Training (474 / 500 Steps) (loss=2.14311):  61%|| 474/782 [11:55<05:25,  1.06s/it]
Training (475 / 500 Steps) (loss=2.07638):  61%|| 474/782 [11:56<05:25,  1.06s/it]
Training (475 / 500 Steps) (loss=2.07638):  61%|| 475/782 [11:56<05:24,  1.06s/it]
Training (476 / 500 Steps) (loss=2.05625):  61%|| 475/782 [11:57<05:24,  1.06s/it]
Training (476 / 500 Steps) (loss=2.05625):  61%|| 476/782 [11:57<05:22,  1.05s/it]
Training (477 / 500 Steps) (loss=1.84486):  61%|| 476/782 [11:58<05:22,  1.05s/it]
Training (477 / 500 Steps) (loss=1.84486):  61%|| 477/782 [11:58<05:21,  1.05s/it]
Training (478 / 500 Steps) (loss=2.23410):  61%|| 477/782 [11:59<05:21,  1.05s/it]
Training (478 / 500 Steps) (loss=2.23410):  61%|| 478/782 [11:59<05:19,  1.05s/it]
Training (479 / 500 Steps) (loss=2.09925):  61%|| 478/782 [12:00<05:19,  1.05s/it]
Training (479 / 500 Steps) (loss=2.09925):  61%|| 479/782 [12:00<05:18,  1.05s/it]
Training (480 / 500 Steps) (loss=2.00667):  61%|| 479/782 [12:01<05:18,  1.05s/it]
Training (480 / 500 Steps) (loss=2.00667):  61%|| 480/782 [12:01<05:18,  1.05s/it]
Training (481 / 500 Steps) (loss=2.22903):  61%|| 480/782 [12:02<05:18,  1.05s/it]
Training (481 / 500 Steps) (loss=2.22903):  62%|| 481/782 [12:02<05:17,  1.05s/it]
Training (482 / 500 Steps) (loss=2.18467):  62%|| 481/782 [12:03<05:17,  1.05s/it]
Training (482 / 500 Steps) (loss=2.18467):  62%|| 482/782 [12:03<05:15,  1.05s/it]
Training (483 / 500 Steps) (loss=1.87782):  62%|| 482/782 [12:04<05:15,  1.05s/it]
Training (483 / 500 Steps) (loss=1.87782):  62%|| 483/782 [12:04<05:15,  1.05s/it]
Training (484 / 500 Steps) (loss=2.32692):  62%|| 483/782 [12:05<05:15,  1.05s/it]
Training (484 / 500 Steps) (loss=2.32692):  62%|| 484/782 [12:05<05:13,  1.05s/it]
Training (485 / 500 Steps) (loss=1.91695):  62%|| 484/782 [12:06<05:13,  1.05s/it]
Training (485 / 500 Steps) (loss=1.91695):  62%|| 485/782 [12:06<05:13,  1.05s/it]
Training (486 / 500 Steps) (loss=1.93203):  62%|| 485/782 [12:07<05:13,  1.05s/it]
Training (486 / 500 Steps) (loss=1.93203):  62%|| 486/782 [12:07<05:11,  1.05s/it]
Training (487 / 500 Steps) (loss=2.10782):  62%|| 486/782 [12:09<05:11,  1.05s/it]
Training (487 / 500 Steps) (loss=2.10782):  62%|| 487/782 [12:09<05:11,  1.05s/it]
Training (488 / 500 Steps) (loss=1.85693):  62%|| 487/782 [12:10<05:11,  1.05s/it]
Training (488 / 500 Steps) (loss=1.85693):  62%|| 488/782 [12:10<05:09,  1.05s/it]
Training (489 / 500 Steps) (loss=1.98388):  62%|| 488/782 [12:11<05:09,  1.05s/it]
Training (489 / 500 Steps) (loss=1.98388):  63%|| 489/782 [12:11<05:09,  1.06s/it]
Training (490 / 500 Steps) (loss=1.94408):  63%|| 489/782 [12:12<05:09,  1.06s/it]
Training (490 / 500 Steps) (loss=1.94408):  63%|| 490/782 [12:12<05:07,  1.05s/it]
Training (491 / 500 Steps) (loss=1.95866):  63%|| 490/782 [12:13<05:07,  1.05s/it]
Training (491 / 500 Steps) (loss=1.95866):  63%|| 491/782 [12:13<05:06,  1.05s/it]
Training (492 / 500 Steps) (loss=2.00873):  63%|| 491/782 [12:14<05:06,  1.05s/it]
Training (492 / 500 Steps) (loss=2.00873):  63%|| 492/782 [12:14<05:05,  1.05s/it]
Training (493 / 500 Steps) (loss=1.88733):  63%|| 492/782 [12:15<05:05,  1.05s/it]
Training (493 / 500 Steps) (loss=1.88733):  63%|| 493/782 [12:15<05:04,  1.05s/it]
Training (494 / 500 Steps) (loss=1.99425):  63%|| 493/782 [12:16<05:04,  1.05s/it]
Training (494 / 500 Steps) (loss=1.99425):  63%|| 494/782 [12:16<05:03,  1.05s/it]
Training (495 / 500 Steps) (loss=2.13569):  63%|| 494/782 [12:17<05:03,  1.05s/it]
Training (495 / 500 Steps) (loss=2.13569):  63%|| 495/782 [12:17<05:02,  1.05s/it]
Training (496 / 500 Steps) (loss=2.01947):  63%|| 495/782 [12:18<05:02,  1.05s/it]
Training (496 / 500 Steps) (loss=2.01947):  63%|| 496/782 [12:18<05:01,  1.05s/it]
Training (497 / 500 Steps) (loss=2.13278):  63%|| 496/782 [12:19<05:01,  1.05s/it]
Training (497 / 500 Steps) (loss=2.13278):  64%|| 497/782 [12:19<05:00,  1.05s/it]
Training (498 / 500 Steps) (loss=2.05048):  64%|| 497/782 [12:20<05:00,  1.05s/it]
Training (498 / 500 Steps) (loss=2.05048):  64%|| 498/782 [12:20<04:59,  1.05s/it]
Training (499 / 500 Steps) (loss=2.27572):  64%|| 498/782 [12:21<04:59,  1.05s/it]
Training (499 / 500 Steps) (loss=2.27572):  64%|| 499/782 [12:21<04:58,  1.06s/it]
Training (500 / 500 Steps) (loss=2.05132):  64%|| 499/782 [12:22<04:58,  1.06s/it]
10/17/2022 06:32:22 - INFO - __main__ - ***** Running Validation *****
10/17/2022 06:32:22 - INFO - __main__ -   Num steps = 157
10/17/2022 06:32:22 - INFO - __main__ -   Batch size = 64


Validating... (loss=X.X):   0%|| 0/157 [00:00<?, ?it/s]

Validating... (loss=2.05829):   0%|| 0/157 [00:00<?, ?it/s]

Validating... (loss=2.05829):   1%|| 1/157 [00:00<02:26,  1.07it/s]

Validating... (loss=1.95516):   1%|| 1/157 [00:01<02:26,  1.07it/s]

Validating... (loss=1.95516):   1%|| 2/157 [00:01<01:28,  1.75it/s]

Validating... (loss=2.19695):   1%|| 2/157 [00:01<01:28,  1.75it/s]

Validating... (loss=2.19695):   2%|| 3/157 [00:01<01:10,  2.20it/s]

Validating... (loss=2.17061):   2%|| 3/157 [00:01<01:10,  2.20it/s]

Validating... (loss=2.17061):   3%|| 4/157 [00:01<01:01,  2.50it/s]

Validating... (loss=1.88649):   3%|| 4/157 [00:02<01:01,  2.50it/s]

Validating... (loss=1.88649):   3%|| 5/157 [00:02<00:56,  2.70it/s]

Validating... (loss=2.25734):   3%|| 5/157 [00:02<00:56,  2.70it/s]

Validating... (loss=2.25734):   4%|| 6/157 [00:02<00:53,  2.82it/s]

Validating... (loss=2.11230):   4%|| 6/157 [00:02<00:53,  2.82it/s]

Validating... (loss=2.11230):   4%|| 7/157 [00:02<00:51,  2.92it/s]

Validating... (loss=2.15861):   4%|| 7/157 [00:03<00:51,  2.92it/s]

Validating... (loss=2.15861):   5%|| 8/157 [00:03<00:49,  2.99it/s]

Validating... (loss=1.96275):   5%|| 8/157 [00:03<00:49,  2.99it/s]

Validating... (loss=1.96275):   6%|| 9/157 [00:03<00:48,  3.04it/s]

Validating... (loss=2.10142):   6%|| 9/157 [00:03<00:48,  3.04it/s]

Validating... (loss=2.10142):   6%|| 10/157 [00:03<00:47,  3.07it/s]

Validating... (loss=2.36442):   6%|| 10/157 [00:04<00:47,  3.07it/s]

Validating... (loss=2.36442):   7%|| 11/157 [00:04<00:47,  3.10it/s]

Validating... (loss=2.17348):   7%|| 11/157 [00:04<00:47,  3.10it/s]

Validating... (loss=2.17348):   8%|| 12/157 [00:04<00:46,  3.12it/s]

Validating... (loss=1.91198):   8%|| 12/157 [00:04<00:46,  3.12it/s]

Validating... (loss=1.91198):   8%|| 13/157 [00:04<00:45,  3.13it/s]

Validating... (loss=1.75191):   8%|| 13/157 [00:05<00:45,  3.13it/s]

Validating... (loss=1.75191):   9%|| 14/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.98983):   9%|| 14/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.98983):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.98742):  10%|| 15/157 [00:05<00:45,  3.14it/s]

Validating... (loss=1.98742):  10%|| 16/157 [00:05<00:44,  3.15it/s]

Validating... (loss=2.03043):  10%|| 16/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.03043):  11%|| 17/157 [00:06<00:44,  3.15it/s]

Validating... (loss=1.93742):  11%|| 17/157 [00:06<00:44,  3.15it/s]

Validating... (loss=1.93742):  11%|| 18/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.05808):  11%|| 18/157 [00:06<00:44,  3.15it/s]

Validating... (loss=2.05808):  12%|| 19/157 [00:06<00:43,  3.15it/s]

Validating... (loss=2.17679):  12%|| 19/157 [00:06<00:43,  3.15it/s]

Validating... (loss=2.17679):  13%|| 20/157 [00:06<00:43,  3.15it/s]

Validating... (loss=2.03589):  13%|| 20/157 [00:07<00:43,  3.15it/s]

Validating... (loss=2.03589):  13%|| 21/157 [00:07<00:43,  3.15it/s]

Validating... (loss=2.00082):  13%|| 21/157 [00:07<00:43,  3.15it/s]

Validating... (loss=2.00082):  14%|| 22/157 [00:07<00:42,  3.16it/s]

Validating... (loss=2.01814):  14%|| 22/157 [00:07<00:42,  3.16it/s]

Validating... (loss=2.01814):  15%|| 23/157 [00:07<00:42,  3.16it/s]

Validating... (loss=1.95876):  15%|| 23/157 [00:08<00:42,  3.16it/s]

Validating... (loss=1.95876):  15%|| 24/157 [00:08<00:42,  3.16it/s]

Validating... (loss=1.92874):  15%|| 24/157 [00:08<00:42,  3.16it/s]

Validating... (loss=1.92874):  16%|| 25/157 [00:08<00:41,  3.16it/s]

Validating... (loss=2.28159):  16%|| 25/157 [00:08<00:41,  3.16it/s]

Validating... (loss=2.28159):  17%|| 26/157 [00:08<00:41,  3.16it/s]

Validating... (loss=1.96498):  17%|| 26/157 [00:09<00:41,  3.16it/s]

Validating... (loss=1.96498):  17%|| 27/157 [00:09<00:41,  3.16it/s]

Validating... (loss=1.77184):  17%|| 27/157 [00:09<00:41,  3.16it/s]

Validating... (loss=1.77184):  18%|| 28/157 [00:09<00:40,  3.15it/s]

Validating... (loss=2.28150):  18%|| 28/157 [00:09<00:40,  3.15it/s]

Validating... (loss=2.28150):  18%|| 29/157 [00:09<00:40,  3.14it/s]

Validating... (loss=2.21101):  18%|| 29/157 [00:10<00:40,  3.14it/s]

Validating... (loss=2.21101):  19%|| 30/157 [00:10<00:40,  3.14it/s]

Validating... (loss=1.88715):  19%|| 30/157 [00:10<00:40,  3.14it/s]

Validating... (loss=1.88715):  20%|| 31/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.96697):  20%|| 31/157 [00:10<00:40,  3.15it/s]

Validating... (loss=1.96697):  20%|| 32/157 [00:10<00:39,  3.15it/s]

Validating... (loss=1.76300):  20%|| 32/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.76300):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.91684):  21%|| 33/157 [00:11<00:39,  3.15it/s]

Validating... (loss=1.91684):  22%|| 34/157 [00:11<00:38,  3.16it/s]

Validating... (loss=2.12606):  22%|| 34/157 [00:11<00:38,  3.16it/s]

Validating... (loss=2.12606):  22%|| 35/157 [00:11<00:38,  3.15it/s]

Validating... (loss=2.12374):  22%|| 35/157 [00:12<00:38,  3.15it/s]

Validating... (loss=2.12374):  23%|| 36/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.75787):  23%|| 36/157 [00:12<00:38,  3.15it/s]

Validating... (loss=1.75787):  24%|| 37/157 [00:12<00:38,  3.16it/s]

Validating... (loss=2.00468):  24%|| 37/157 [00:12<00:38,  3.16it/s]

Validating... (loss=2.00468):  24%|| 38/157 [00:12<00:37,  3.16it/s]

Validating... (loss=1.94146):  24%|| 38/157 [00:12<00:37,  3.16it/s]

Validating... (loss=1.94146):  25%|| 39/157 [00:12<00:37,  3.16it/s]

Validating... (loss=2.43544):  25%|| 39/157 [00:13<00:37,  3.16it/s]

Validating... (loss=2.43544):  25%|| 40/157 [00:13<00:37,  3.16it/s]

Validating... (loss=2.09586):  25%|| 40/157 [00:13<00:37,  3.16it/s]

Validating... (loss=2.09586):  26%|| 41/157 [00:13<00:36,  3.15it/s]

Validating... (loss=1.94713):  26%|| 41/157 [00:13<00:36,  3.15it/s]

Validating... (loss=1.94713):  27%|| 42/157 [00:13<00:36,  3.15it/s]

Validating... (loss=2.03265):  27%|| 42/157 [00:14<00:36,  3.15it/s]

Validating... (loss=2.03265):  27%|| 43/157 [00:14<00:36,  3.15it/s]

Validating... (loss=2.17617):  27%|| 43/157 [00:14<00:36,  3.15it/s]

Validating... (loss=2.17617):  28%|| 44/157 [00:14<00:35,  3.15it/s]

Validating... (loss=2.22399):  28%|| 44/157 [00:14<00:35,  3.15it/s]

Validating... (loss=2.22399):  29%|| 45/157 [00:14<00:35,  3.16it/s]

Validating... (loss=2.02408):  29%|| 45/157 [00:15<00:35,  3.16it/s]

Validating... (loss=2.02408):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.91383):  29%|| 46/157 [00:15<00:35,  3.16it/s]

Validating... (loss=1.91383):  30%|| 47/157 [00:15<00:34,  3.16it/s]

Validating... (loss=1.90844):  30%|| 47/157 [00:15<00:34,  3.16it/s]

Validating... (loss=1.90844):  31%|| 48/157 [00:15<00:34,  3.16it/s]

Validating... (loss=2.11842):  31%|| 48/157 [00:16<00:34,  3.16it/s]

Validating... (loss=2.11842):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=2.06954):  31%|| 49/157 [00:16<00:34,  3.16it/s]

Validating... (loss=2.06954):  32%|| 50/157 [00:16<00:33,  3.16it/s]

Validating... (loss=2.34096):  32%|| 50/157 [00:16<00:33,  3.16it/s]

Validating... (loss=2.34096):  32%|| 51/157 [00:16<00:33,  3.16it/s]

Validating... (loss=1.99471):  32%|| 51/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.99471):  33%|| 52/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.88802):  33%|| 52/157 [00:17<00:33,  3.16it/s]

Validating... (loss=1.88802):  34%|| 53/157 [00:17<00:32,  3.16it/s]

Validating... (loss=2.06518):  34%|| 53/157 [00:17<00:32,  3.16it/s]

Validating... (loss=2.06518):  34%|| 54/157 [00:17<00:32,  3.16it/s]

Validating... (loss=2.10336):  34%|| 54/157 [00:18<00:32,  3.16it/s]

Validating... (loss=2.10336):  35%|| 55/157 [00:18<00:32,  3.17it/s]

Validating... (loss=1.93895):  35%|| 55/157 [00:18<00:32,  3.17it/s]

Validating... (loss=1.93895):  36%|| 56/157 [00:18<00:31,  3.17it/s]

Validating... (loss=2.08553):  36%|| 56/157 [00:18<00:31,  3.17it/s]

Validating... (loss=2.08553):  36%|| 57/157 [00:18<00:31,  3.16it/s]

Validating... (loss=2.02458):  36%|| 57/157 [00:18<00:31,  3.16it/s]

Validating... (loss=2.02458):  37%|| 58/157 [00:18<00:31,  3.16it/s]

Validating... (loss=1.99208):  37%|| 58/157 [00:19<00:31,  3.16it/s]

Validating... (loss=1.99208):  38%|| 59/157 [00:19<00:30,  3.16it/s]

Validating... (loss=1.97667):  38%|| 59/157 [00:19<00:30,  3.16it/s]

Validating... (loss=1.97667):  38%|| 60/157 [00:19<00:30,  3.16it/s]

Validating... (loss=2.08530):  38%|| 60/157 [00:19<00:30,  3.16it/s]

Validating... (loss=2.08530):  39%|| 61/157 [00:19<00:30,  3.16it/s]

Validating... (loss=2.02745):  39%|| 61/157 [00:20<00:30,  3.16it/s]

Validating... (loss=2.02745):  39%|| 62/157 [00:20<00:30,  3.17it/s]

Validating... (loss=2.10595):  39%|| 62/157 [00:20<00:30,  3.17it/s]

Validating... (loss=2.10595):  40%|| 63/157 [00:20<00:29,  3.17it/s]

Validating... (loss=1.96789):  40%|| 63/157 [00:20<00:29,  3.17it/s]

Validating... (loss=1.96789):  41%|| 64/157 [00:20<00:29,  3.16it/s]

Validating... (loss=2.09850):  41%|| 64/157 [00:21<00:29,  3.16it/s]

Validating... (loss=2.09850):  41%|| 65/157 [00:21<00:29,  3.16it/s]

Validating... (loss=1.92499):  41%|| 65/157 [00:21<00:29,  3.16it/s]

Validating... (loss=1.92499):  42%|| 66/157 [00:21<00:28,  3.16it/s]

Validating... (loss=2.02570):  42%|| 66/157 [00:21<00:28,  3.16it/s]

Validating... (loss=2.02570):  43%|| 67/157 [00:21<00:28,  3.16it/s]

Validating... (loss=2.12582):  43%|| 67/157 [00:22<00:28,  3.16it/s]

Validating... (loss=2.12582):  43%|| 68/157 [00:22<00:28,  3.16it/s]

Validating... (loss=1.96861):  43%|| 68/157 [00:22<00:28,  3.16it/s]

Validating... (loss=1.96861):  44%|| 69/157 [00:22<00:27,  3.16it/s]

Validating... (loss=1.85099):  44%|| 69/157 [00:22<00:27,  3.16it/s]

Validating... (loss=1.85099):  45%|| 70/157 [00:22<00:27,  3.16it/s]

Validating... (loss=2.22435):  45%|| 70/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.22435):  45%|| 71/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.19747):  45%|| 71/157 [00:23<00:27,  3.16it/s]

Validating... (loss=2.19747):  46%|| 72/157 [00:23<00:26,  3.16it/s]

Validating... (loss=1.90643):  46%|| 72/157 [00:23<00:26,  3.16it/s]

Validating... (loss=1.90643):  46%|| 73/157 [00:23<00:26,  3.16it/s]

Validating... (loss=2.09585):  46%|| 73/157 [00:24<00:26,  3.16it/s]

Validating... (loss=2.09585):  47%|| 74/157 [00:24<00:26,  3.16it/s]

Validating... (loss=2.23154):  47%|| 74/157 [00:24<00:26,  3.16it/s]

Validating... (loss=2.23154):  48%|| 75/157 [00:24<00:25,  3.16it/s]

Validating... (loss=1.76224):  48%|| 75/157 [00:24<00:25,  3.16it/s]

Validating... (loss=1.76224):  48%|| 76/157 [00:24<00:25,  3.16it/s]

Validating... (loss=1.95350):  48%|| 76/157 [00:25<00:25,  3.16it/s]

Validating... (loss=1.95350):  49%|| 77/157 [00:25<00:25,  3.16it/s]

Validating... (loss=1.93626):  49%|| 77/157 [00:25<00:25,  3.16it/s]

Validating... (loss=1.93626):  50%|| 78/157 [00:25<00:25,  3.16it/s]

Validating... (loss=1.92634):  50%|| 78/157 [00:25<00:25,  3.16it/s]

Validating... (loss=1.92634):  50%|| 79/157 [00:25<00:24,  3.16it/s]

Validating... (loss=1.95208):  50%|| 79/157 [00:25<00:24,  3.16it/s]

Validating... (loss=1.95208):  51%|| 80/157 [00:25<00:24,  3.16it/s]

Validating... (loss=2.06398):  51%|| 80/157 [00:26<00:24,  3.16it/s]

Validating... (loss=2.06398):  52%|| 81/157 [00:26<00:24,  3.16it/s]

Validating... (loss=2.13283):  52%|| 81/157 [00:26<00:24,  3.16it/s]

Validating... (loss=2.13283):  52%|| 82/157 [00:26<00:23,  3.16it/s]

Validating... (loss=1.91627):  52%|| 82/157 [00:26<00:23,  3.16it/s]

Validating... (loss=1.91627):  53%|| 83/157 [00:26<00:23,  3.16it/s]

Validating... (loss=2.05603):  53%|| 83/157 [00:27<00:23,  3.16it/s]

Validating... (loss=2.05603):  54%|| 84/157 [00:27<00:23,  3.16it/s]

Validating... (loss=2.08147):  54%|| 84/157 [00:27<00:23,  3.16it/s]

Validating... (loss=2.08147):  54%|| 85/157 [00:27<00:22,  3.16it/s]

Validating... (loss=2.01358):  54%|| 85/157 [00:27<00:22,  3.16it/s]

Validating... (loss=2.01358):  55%|| 86/157 [00:27<00:22,  3.16it/s]

Validating... (loss=2.19508):  55%|| 86/157 [00:28<00:22,  3.16it/s]

Validating... (loss=2.19508):  55%|| 87/157 [00:28<00:22,  3.16it/s]

Validating... (loss=2.19523):  55%|| 87/157 [00:28<00:22,  3.16it/s]

Validating... (loss=2.19523):  56%|| 88/157 [00:28<00:21,  3.16it/s]

Validating... (loss=2.01230):  56%|| 88/157 [00:28<00:21,  3.16it/s]

Validating... (loss=2.01230):  57%|| 89/157 [00:28<00:21,  3.16it/s]

Validating... (loss=1.72561):  57%|| 89/157 [00:29<00:21,  3.16it/s]

Validating... (loss=1.72561):  57%|| 90/157 [00:29<00:21,  3.16it/s]

Validating... (loss=1.80990):  57%|| 90/157 [00:29<00:21,  3.16it/s]

Validating... (loss=1.80990):  58%|| 91/157 [00:29<00:20,  3.16it/s]

Validating... (loss=2.05617):  58%|| 91/157 [00:29<00:20,  3.16it/s]

Validating... (loss=2.05617):  59%|| 92/157 [00:29<00:20,  3.16it/s]

Validating... (loss=2.22789):  59%|| 92/157 [00:30<00:20,  3.16it/s]

Validating... (loss=2.22789):  59%|| 93/157 [00:30<00:20,  3.16it/s]

Validating... (loss=2.32522):  59%|| 93/157 [00:30<00:20,  3.16it/s]

Validating... (loss=2.32522):  60%|| 94/157 [00:30<00:19,  3.16it/s]

Validating... (loss=1.96062):  60%|| 94/157 [00:30<00:19,  3.16it/s]

Validating... (loss=1.96062):  61%|| 95/157 [00:30<00:19,  3.16it/s]

Validating... (loss=2.09586):  61%|| 95/157 [00:31<00:19,  3.16it/s]

Validating... (loss=2.09586):  61%|| 96/157 [00:31<00:19,  3.16it/s]

Validating... (loss=2.21566):  61%|| 96/157 [00:31<00:19,  3.16it/s]

Validating... (loss=2.21566):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.98515):  62%|| 97/157 [00:31<00:19,  3.15it/s]

Validating... (loss=1.98515):  62%|| 98/157 [00:31<00:18,  3.16it/s]

Validating... (loss=1.91807):  62%|| 98/157 [00:31<00:18,  3.16it/s]

Validating... (loss=1.91807):  63%|| 99/157 [00:31<00:18,  3.16it/s]

Validating... (loss=2.02795):  63%|| 99/157 [00:32<00:18,  3.16it/s]

Validating... (loss=2.02795):  64%|| 100/157 [00:32<00:18,  3.16it/s]

Validating... (loss=2.11769):  64%|| 100/157 [00:32<00:18,  3.16it/s]

Validating... (loss=2.11769):  64%|| 101/157 [00:32<00:17,  3.16it/s]

Validating... (loss=2.09633):  64%|| 101/157 [00:32<00:17,  3.16it/s]

Validating... (loss=2.09633):  65%|| 102/157 [00:32<00:17,  3.16it/s]

Validating... (loss=2.08100):  65%|| 102/157 [00:33<00:17,  3.16it/s]

Validating... (loss=2.08100):  66%|| 103/157 [00:33<00:17,  3.16it/s]

Validating... (loss=1.90934):  66%|| 103/157 [00:33<00:17,  3.16it/s]

Validating... (loss=1.90934):  66%|| 104/157 [00:33<00:16,  3.16it/s]

Validating... (loss=2.02645):  66%|| 104/157 [00:33<00:16,  3.16it/s]

Validating... (loss=2.02645):  67%|| 105/157 [00:33<00:16,  3.16it/s]

Validating... (loss=1.99180):  67%|| 105/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.99180):  68%|| 106/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.90813):  68%|| 106/157 [00:34<00:16,  3.16it/s]

Validating... (loss=1.90813):  68%|| 107/157 [00:34<00:15,  3.16it/s]

Validating... (loss=2.05828):  68%|| 107/157 [00:34<00:15,  3.16it/s]

Validating... (loss=2.05828):  69%|| 108/157 [00:34<00:15,  3.16it/s]

Validating... (loss=2.29882):  69%|| 108/157 [00:35<00:15,  3.16it/s]

Validating... (loss=2.29882):  69%|| 109/157 [00:35<00:15,  3.16it/s]

Validating... (loss=2.05478):  69%|| 109/157 [00:35<00:15,  3.16it/s]

Validating... (loss=2.05478):  70%|| 110/157 [00:35<00:14,  3.16it/s]

Validating... (loss=2.07163):  70%|| 110/157 [00:35<00:14,  3.16it/s]

Validating... (loss=2.07163):  71%|| 111/157 [00:35<00:14,  3.16it/s]

Validating... (loss=1.82204):  71%|| 111/157 [00:36<00:14,  3.16it/s]

Validating... (loss=1.82204):  71%|| 112/157 [00:36<00:14,  3.16it/s]

Validating... (loss=2.07699):  71%|| 112/157 [00:36<00:14,  3.16it/s]

Validating... (loss=2.07699):  72%|| 113/157 [00:36<00:13,  3.16it/s]

Validating... (loss=1.86269):  72%|| 113/157 [00:36<00:13,  3.16it/s]

Validating... (loss=1.86269):  73%|| 114/157 [00:36<00:13,  3.16it/s]

Validating... (loss=2.04794):  73%|| 114/157 [00:37<00:13,  3.16it/s]

Validating... (loss=2.04794):  73%|| 115/157 [00:37<00:13,  3.16it/s]

Validating... (loss=2.20671):  73%|| 115/157 [00:37<00:13,  3.16it/s]

Validating... (loss=2.20671):  74%|| 116/157 [00:37<00:13,  3.15it/s]

Validating... (loss=1.85939):  74%|| 116/157 [00:37<00:13,  3.15it/s]

Validating... (loss=1.85939):  75%|| 117/157 [00:37<00:12,  3.15it/s]

Validating... (loss=2.10581):  75%|| 117/157 [00:37<00:12,  3.15it/s]

Validating... (loss=2.10581):  75%|| 118/157 [00:37<00:12,  3.15it/s]

Validating... (loss=1.90403):  75%|| 118/157 [00:38<00:12,  3.15it/s]

Validating... (loss=1.90403):  76%|| 119/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.93327):  76%|| 119/157 [00:38<00:12,  3.16it/s]

Validating... (loss=1.93327):  76%|| 120/157 [00:38<00:11,  3.16it/s]

Validating... (loss=1.86604):  76%|| 120/157 [00:38<00:11,  3.16it/s]

Validating... (loss=1.86604):  77%|| 121/157 [00:38<00:11,  3.16it/s]

Validating... (loss=2.03043):  77%|| 121/157 [00:39<00:11,  3.16it/s]

Validating... (loss=2.03043):  78%|| 122/157 [00:39<00:11,  3.15it/s]

Validating... (loss=1.89361):  78%|| 122/157 [00:39<00:11,  3.15it/s]

Validating... (loss=1.89361):  78%|| 123/157 [00:39<00:10,  3.14it/s]

Validating... (loss=1.90273):  78%|| 123/157 [00:39<00:10,  3.14it/s]

Validating... (loss=1.90273):  79%|| 124/157 [00:39<00:10,  3.15it/s]

Validating... (loss=2.02662):  79%|| 124/157 [00:40<00:10,  3.15it/s]

Validating... (loss=2.02662):  80%|| 125/157 [00:40<00:10,  3.15it/s]

Validating... (loss=1.98755):  80%|| 125/157 [00:40<00:10,  3.15it/s]

Validating... (loss=1.98755):  80%|| 126/157 [00:40<00:09,  3.15it/s]

Validating... (loss=2.20737):  80%|| 126/157 [00:40<00:09,  3.15it/s]

Validating... (loss=2.20737):  81%|| 127/157 [00:40<00:09,  3.16it/s]

Validating... (loss=2.16157):  81%|| 127/157 [00:41<00:09,  3.16it/s]

Validating... (loss=2.16157):  82%|| 128/157 [00:41<00:09,  3.16it/s]

Validating... (loss=2.06682):  82%|| 128/157 [00:41<00:09,  3.16it/s]

Validating... (loss=2.06682):  82%|| 129/157 [00:41<00:08,  3.16it/s]

Validating... (loss=2.33536):  82%|| 129/157 [00:41<00:08,  3.16it/s]

Validating... (loss=2.33536):  83%|| 130/157 [00:41<00:08,  3.16it/s]

Validating... (loss=2.11653):  83%|| 130/157 [00:42<00:08,  3.16it/s]

Validating... (loss=2.11653):  83%|| 131/157 [00:42<00:08,  3.16it/s]

Validating... (loss=2.01537):  83%|| 131/157 [00:42<00:08,  3.16it/s]

Validating... (loss=2.01537):  84%|| 132/157 [00:42<00:07,  3.16it/s]

Validating... (loss=2.14983):  84%|| 132/157 [00:42<00:07,  3.16it/s]

Validating... (loss=2.14983):  85%|| 133/157 [00:42<00:07,  3.16it/s]

Validating... (loss=2.14365):  85%|| 133/157 [00:43<00:07,  3.16it/s]

Validating... (loss=2.14365):  85%|| 134/157 [00:43<00:07,  3.16it/s]

Validating... (loss=1.98443):  85%|| 134/157 [00:43<00:07,  3.16it/s]

Validating... (loss=1.98443):  86%|| 135/157 [00:43<00:06,  3.15it/s]

Validating... (loss=2.10730):  86%|| 135/157 [00:43<00:06,  3.15it/s]

Validating... (loss=2.10730):  87%|| 136/157 [00:43<00:06,  3.16it/s]

Validating... (loss=1.96340):  87%|| 136/157 [00:44<00:06,  3.16it/s]

Validating... (loss=1.96340):  87%|| 137/157 [00:44<00:06,  3.16it/s]

Validating... (loss=2.11513):  87%|| 137/157 [00:44<00:06,  3.16it/s]

Validating... (loss=2.11513):  88%|| 138/157 [00:44<00:06,  3.16it/s]

Validating... (loss=2.29271):  88%|| 138/157 [00:44<00:06,  3.16it/s]

Validating... (loss=2.29271):  89%|| 139/157 [00:44<00:05,  3.16it/s]

Validating... (loss=1.92066):  89%|| 139/157 [00:44<00:05,  3.16it/s]

Validating... (loss=1.92066):  89%|| 140/157 [00:44<00:05,  3.16it/s]

Validating... (loss=1.93768):  89%|| 140/157 [00:45<00:05,  3.16it/s]

Validating... (loss=1.93768):  90%|| 141/157 [00:45<00:05,  3.16it/s]

Validating... (loss=2.00358):  90%|| 141/157 [00:45<00:05,  3.16it/s]

Validating... (loss=2.00358):  90%|| 142/157 [00:45<00:04,  3.16it/s]

Validating... (loss=2.17332):  90%|| 142/157 [00:45<00:04,  3.16it/s]

Validating... (loss=2.17332):  91%|| 143/157 [00:45<00:04,  3.14it/s]

Validating... (loss=1.90570):  91%|| 143/157 [00:46<00:04,  3.14it/s]

Validating... (loss=1.90570):  92%|| 144/157 [00:46<00:04,  3.13it/s]

Validating... (loss=1.92257):  92%|| 144/157 [00:46<00:04,  3.13it/s]

Validating... (loss=1.92257):  92%|| 145/157 [00:46<00:03,  3.14it/s]

Validating... (loss=2.26820):  92%|| 145/157 [00:46<00:03,  3.14it/s]

Validating... (loss=2.26820):  93%|| 146/157 [00:46<00:03,  3.14it/s]

Validating... (loss=2.18221):  93%|| 146/157 [00:47<00:03,  3.14it/s]

Validating... (loss=2.18221):  94%|| 147/157 [00:47<00:03,  3.15it/s]

Validating... (loss=1.92792):  94%|| 147/157 [00:47<00:03,  3.15it/s]

Validating... (loss=1.92792):  94%|| 148/157 [00:47<00:02,  3.15it/s]

Validating... (loss=2.03305):  94%|| 148/157 [00:47<00:02,  3.15it/s]

Validating... (loss=2.03305):  95%|| 149/157 [00:47<00:02,  3.15it/s]

Validating... (loss=1.91824):  95%|| 149/157 [00:48<00:02,  3.15it/s]

Validating... (loss=1.91824):  96%|| 150/157 [00:48<00:02,  3.16it/s]

Validating... (loss=1.85552):  96%|| 150/157 [00:48<00:02,  3.16it/s]

Validating... (loss=1.85552):  96%|| 151/157 [00:48<00:01,  3.16it/s]

Validating... (loss=2.05134):  96%|| 151/157 [00:48<00:01,  3.16it/s]

Validating... (loss=2.05134):  97%|| 152/157 [00:48<00:01,  3.16it/s]

Validating... (loss=2.11272):  97%|| 152/157 [00:49<00:01,  3.16it/s]

Validating... (loss=2.11272):  97%|| 153/157 [00:49<00:01,  3.16it/s]

Validating... (loss=1.90253):  97%|| 153/157 [00:49<00:01,  3.16it/s]

Validating... (loss=1.90253):  98%|| 154/157 [00:49<00:00,  3.16it/s]

Validating... (loss=2.14570):  98%|| 154/157 [00:49<00:00,  3.16it/s]

Validating... (loss=2.14570):  99%|| 155/157 [00:49<00:00,  3.16it/s]

Validating... (loss=1.95922):  99%|| 155/157 [00:50<00:00,  3.16it/s]

Validating... (loss=1.95922):  99%|| 156/157 [00:50<00:00,  3.16it/s]

Validating... (loss=1.82598):  99%|| 156/157 [00:50<00:00,  3.16it/s]
Validating... (loss=1.82598): 100%|| 157/157 [00:50<00:00,  3.12it/s]
10/17/2022 06:33:12 - INFO - __main__ - 

10/17/2022 06:33:12 - INFO - __main__ - Validation Results
10/17/2022 06:33:12 - INFO - __main__ - Global Steps: 500
10/17/2022 06:33:12 - INFO - __main__ - Valid Loss: 2.03673
10/17/2022 06:33:12 - INFO - __main__ - Valid Accuracy: 0.26590

Training (500 / 500 Steps) (loss=2.05132):  64%|| 499/782 [13:13<07:29,  1.59s/it]
10/17/2022 06:33:12 - INFO - __main__ - Best Accuracy: 	0.295600
10/17/2022 06:33:12 - INFO - __main__ - End Training!
I1017 06:33:12.743691 13430 ProcessGroupNCCL.cpp:603] [Rank 0] NCCL watchdog thread terminated normally
/usr/local/lib/python3.7/site-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  FutureWarning,
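
The warning above asks scripts to read the local rank from `os.environ['LOCAL_RANK']` rather than from a `--local_rank` argument. A minimal sketch of that change follows. The helper name `resolve_local_rank` and the `-1` fallback are assumptions made for illustration; the training script that produced this log is not shown here and may handle this differently.

```python
import argparse
import os


def resolve_local_rank(argv=None, env=None):
    """Return the local rank, preferring the LOCAL_RANK environment variable.

    Hypothetical helper: falls back to a legacy --local_rank argument, and
    to -1 (an assumed "not distributed" marker) when neither is present.
    """
    env = os.environ if env is None else env

    parser = argparse.ArgumentParser()
    # Legacy flag that torch.distributed.launch passes when --use_env is off.
    parser.add_argument("--local_rank", type=int, default=-1)
    # parse_known_args ignores any other arguments the script may define.
    args, _unknown = parser.parse_known_args(argv)

    # torchrun exports LOCAL_RANK for every worker process.
    if "LOCAL_RANK" in env:
        return int(env["LOCAL_RANK"])
    return args.local_rank


if __name__ == "__main__":
    print("local rank:", resolve_local_rank())
```

With this in place the same script can be started with either launcher: under `torchrun` the environment variable is used, and under the deprecated `torch.distributed.launch` the command-line flag still works.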